Fab Sidoli
Hi All,
I'm copying the title of a bug report submitted to FreeBSD, which hasn't got much traction there, in the hope that someone here can help.
I have three Dell PE R740xd2 servers fitted with Dell BOSS-S1 cards, all running FreeNAS 11.3, and can confirm that the latest BOSS firmware stops FreeNAS from booting and stops the FreeNAS installer from seeing the disks when a fresh install is attempted. A Windows installer can see the disks, however, so I suspect this is a FreeNAS issue.
Please see the bug report below (link and content paste provided) which describes the issue I'm having.
Apologies if the format of this post is poor form.
Many thanks,
Fab
FreeBSD bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=243401
Contents:
"This feels more like a firmware problem than a driver problem but since it apparently works in Windows and Linux, but not in FreeBSD I figured I'd report it here anyway...
(Probably not meaningful to try to report it to Dell since FreeBSD isn't officially supported by them)
Dell BOSS-S1 (Marvell 88SE9230 based) M.2 "RAID" cards running Dell's latest firmware (2.5.13.3022 A06 or 2.5.13.3020 A05) do something strange - when the kernel has loaded (from the drives on this card) it fails to detect the disks ("unconfigured" disks, non-RAID setup) and then root fs mounting fails...
(We have two M.2 SSDs connected to that controller)
With firmware 2.5.13.3016 A04 it gives a couple of errors at kernel boot time, but does detect the disks and the system boots.
With firmware 2.5.13.3011 A03, 2.5.13.2009 A02 or 2.5.13.2008 A01 no errors are printed and the disks are found just fine.
(But there are bugs fixed in the later releases that would probably be nice to have. I have had an M.2 drive go "offline" for me on the 2008/A01 firmware, which is why I tried the later versions.)
A summary of the (Dell) firmware fixes and my test results:
2.5.13.3022 A06
Fixes: None
Enhancement: Added support for 15G platforms
2.5.13.3020 A05
Status:
- Does not work, gives errors:
- 'ahcich16: stopping AHCI engine failed'
- Detects a 'pass23', but no disk devices:
pass23 at ahcich16 bus 0 scbus19 target 0 lun 0
pass23: <Marvell Console 1.01> Removable Processor SCSI device
pass23: Serial Number HKDP221516WL
pass23: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)
Fixes:
- Fixed an issue where system will hang during boot when PERC is in HBA mode with BOSS-S1
- When CLI is running, default temporary file directory & permission in Linux and ESXi operating systems are changed as appropriate
Enhancement: N/A
2.5.13.3016 A04
Status:
- Works and detects all disks, but gives errors about:
- 'ahcich14: stopping AHCI engine failed'
- 'ahcich15: stopping AHCI engine failed'
- 'ahcich16: stopping AHCI engine failed'
Fixes:
- Fixed a behavior of BOSS-S1 firmware incorrectly marking M.2 drive offline/failed
- Fixed a behavior where ESXi Host goes unresponsive
- Fixed a behavior where BOSS-S1 Management path will not respond to Management commands
- Fixed a behavior where BOSS-S1 boot partition becomes inaccessible
- Fixed a behavior where ESXi host results in PSOD due to unexpected I/O timeout
- Fixed a behavior where rebuild will not proceed during error handling condition
Enhancement:
- Enhanced/Added MVCLI events for command timeout
- Added SLES15 Support
2.5.13.3011 A03
Status:
- Works
Fixes:
- Fixed M.2 disk failure when medium error is present
Enhancement:
- Enhanced medium error handling
2.5.13.2009 A02
Status:
- Works
Fixes:
- Fixed Sideband functionality issue
Enhancement:
- Added support for Rollback of Controller Firmware through iDRAC/LC
2.5.13.2008 A01
Status:
- Works
Initial release
Kernel boot output (the relevant parts) from a firmware 3016 A04 boot:
ahci2: <Marvell 88SE9230 AHCI SATA controller> port 0x8028-0x802f,0x8034-0x8037,0x8020-0x8027,0x8030-0x8033,0x8000-0x801f mem 0xb8800000-0xb88007ff at device 0.0 numa-domain 0 on pci9
ahci2: AHCI v1.20 with 3 6Gbps ports, Port Multiplier not supported
ahci2: quirks=0x900<NOBSYRES,ALTSIG>
ahcich14: <AHCI channel> at channel 0 on ahci2
ahcich15: <AHCI channel> at channel 1 on ahci2
ahcich16: <AHCI channel> at channel 2 on ahci2
...
ahcich16: stopping AHCI engine failed
ahcich16: stopping AHCI engine failed
...
ahcich16: stopping AHCI engine failed
ahcich15: stopping AHCI engine failed
ada0 at ahcich14 bus 0 scbus17 target 0 lun 0
ada0: <SSDSCKJB120G7R N201DL43> ACS-3 ATA SATA 3.x device
ada0: Serial Number PHDW817002Z4150A
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada0: Command Queueing enabled
ada0: 114473MB (234441648 512 byte sectors)
ada1 at ahcich15 bus 0 scbus18 target 0 lun 0
ada1: <SSDSCKJB120G7R N201DL43> ACS-3 ATA SATA 3.x device
ada1: Serial Number PHDW817002WC150A
ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 512bytes)
ada1: Command Queueing enabled
ada1: 114473MB (234441648 512 byte sectors)
pass25 at ahcich16 bus 0 scbus19 target 0 lun 0
pass25: <Marvell Console 1.01> Removable Processor SCSI device
pass25: Serial Number HKDP221516WL
pass25: 150.000MB/s transfers (SATA 1.x, UDMA4, ATAPI 12bytes, PIO 8192bytes)
On 3022 the ada0 and ada1 devices never get detected, and it only complains about not being able to stop ahcich16, nothing about 14 & 15."
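In case it helps anyone comparing firmware versions on their own hardware, here is a rough Python sketch that summarises a boot log like the one above: which devices attached to each AHCI channel, and how many "stopping AHCI engine failed" messages each channel produced. The function names (summarize, no_disks_found) and the regular expressions are my own and only assume the line formats shown in the quoted output; the sample text is a trimmed copy of the 3016 A04 lines. Treat it as a starting point, not a diagnostic tool.

```python
import re

# Trimmed copy of the 3016 A04 boot output quoted above.
SAMPLE_LOG = """\
ahcich14: <AHCI channel> at channel 0 on ahci2
ahcich15: <AHCI channel> at channel 1 on ahci2
ahcich16: <AHCI channel> at channel 2 on ahci2
...
ahcich16: stopping AHCI engine failed
ahcich16: stopping AHCI engine failed
...
ahcich16: stopping AHCI engine failed
ahcich15: stopping AHCI engine failed
ada0 at ahcich14 bus 0 scbus17 target 0 lun 0
ada1 at ahcich15 bus 0 scbus18 target 0 lun 0
pass25 at ahcich16 bus 0 scbus19 target 0 lun 0
"""

# "ada0 at ahcich14 bus 0 ..." style attach lines (assumed format).
ATTACH_RE = re.compile(r"^(?P<dev>[a-z]+\d+) at (?P<chan>ahcich\d+) bus \d+")
# "ahcich16: stopping AHCI engine failed" style error lines (assumed format).
ERROR_RE = re.compile(r"^(?P<chan>ahcich\d+): stopping AHCI engine failed")


def summarize(log_text):
    """Return (attached, errors): devices per channel and error count per channel."""
    attached = {}
    errors = {}
    for raw in log_text.splitlines():
        line = raw.strip()
        m = ATTACH_RE.match(line)
        if m:
            attached.setdefault(m.group("chan"), []).append(m.group("dev"))
            continue
        m = ERROR_RE.match(line)
        if m:
            errors[m.group("chan")] = errors.get(m.group("chan"), 0) + 1
    return attached, errors


def no_disks_found(attached):
    """True if no ada (ATA disk) device attached on any channel."""
    return not any(
        dev.startswith("ada") for devs in attached.values() for dev in devs
    )


if __name__ == "__main__":
    attached, errors = summarize(SAMPLE_LOG)
    for chan in sorted(set(attached) | set(errors)):
        devs = ", ".join(attached.get(chan, [])) or "none"
        print(f"{chan}: devices={devs} engine_stop_errors={errors.get(chan, 0)}")
    print("no disks found:", no_disks_found(attached))
```

To use it on a real system, the output of dmesg (or /var/run/dmesg.boot) can be saved to a file and passed to summarize() in place of SAMPLE_LOG. On the failing firmware I would expect no_disks_found() to return True, since only the pass device shows up there.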