RAID controllers HBA mode

Diff

Dabbler
Joined
May 9, 2020
Messages
33
There seems to be a lot of varied, sometimes confusing information about which RAID controllers are suitable for FreeNAS/TrueNAS and ZFS in general.

The official documentation (https://www.ixsystems.com/blog/hardware-guide/) states that RAID controllers switched to HBA mode are fine to use, but I have seen various forum posts stating that not everything works as expected.

With that, I would like to ask the experts in this area:
Q1: What tools or checks can I use to verify whether a RAID card switched to HBA mode is suitable for ZFS?

For example, I have a Dell PERC H730P Mini in one of the servers in my homelab. With more recent firmware (25.5.8.0001) it now has an option to switch to HBA mode.

rpviewer-13.png


More details about the controller:

perc_h730p_hba.png


After booting into Linux (Debian) and trying to access a drive and its S.M.A.R.T. functions, it seems able to see the disk directly and read the SMART info:

Code:
smartctl --all /dev/sdd
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-8-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST1200MM0099
Revision:             ST34
Compliance:           SPC-4
User Capacity:        1,200,243,695,616 bytes [1.20 TB]
Logical block size:   512 bytes
Formatted with type 2 protection
8 bytes of protection information per logical block
LU is fully provisioned
Rotation Rate:        10000 rpm
Form Factor:          2.5 inches
Logical Unit id:      0x5000c500cbc4271f
Serial number:        WFK9G8DD
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Thu Sep 16 13:18:02 2021 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature:     29 C
Drive Trip Temperature:        60 C

Accumulated power on time, hours:minutes 173:58
Manufactured in week 09 of year 2021
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  58
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  219
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 733030815
  Blocks received from initiator = 2880043772
  Blocks read from cache and sent to initiator = 5963364
  Number of read and write commands whose size <= segment size = 361639
  Number of read and write commands whose size > segment size = 4

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 173.97
  number of minutes until next internal SMART test = 56

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   723387202        0         0  723387202          0        376.180           0
write:         0        0         0         0          0       1497.908           0
verify:  9567921        0         0   9567921          0          4.996           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                  96      39                 - [-   -    -]
# 2  Background short  Completed                  96      39                 - [-   -    -]
# 3  Reserved(7)       Completed                  64      37                 - [-   -    -]
# 4  Background short  Completed                  96      35                 - [-   -    -]
# 5  Reserved(7)       Completed                  64      10                 - [-   -    -]
# 6  Background short  Completed                  96       9                 - [-   -    -]

Long (extended) Self-test duration: 6723 seconds [112.0 minutes]


Q2: What else can I check?

Q3: What tools can I use to be more certain that a specific controller with specific firmware is suitable for ZFS?
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi,

This post by @jgreco is the reference about this.

So: once you have configured your adapter for HBA mode, can you see the exact drive models with a command like camcontrol devlist?

Can you run SMART tests and see the detailed results?

If you can do both, it strongly suggests that TrueNAS does have physical control of the disks. It is still not impossible that the controller keeps messing things up, but you are in a better situation if these commands work.
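To make the first check easier to eyeball, here is a minimal sketch that boils `camcontrol devlist` output down to one "device model" line per entry, so the model strings can be compared against the labels on the physical drives. The helper name `devlist_models` is made up for illustration, and it assumes the usual `<vendor model revision> at scbusN target N lun N (passN,daN)` line layout.

```shell
# Hypothetical helper: reads `camcontrol devlist` output on stdin and prints
# "<device> <vendor model revision>" for each entry.
# Assumes each line looks like: <MODEL STRING>  at scbusN target N lun N (passN,daN)
devlist_models() {
    awk '
        match($0, /<[^>]*>/) {
            # Text between the angle brackets is what the device itself reports.
            model = substr($0, RSTART + 1, RLENGTH - 2)
            rest = substr($0, RSTART + RLENGTH)
            dev = "?"
            # The trailing parentheses hold the peripheral names; skip passN.
            if (match(rest, /\([^)]*\)/)) {
                n = split(substr(rest, RSTART + 1, RLENGTH - 2), names, ",")
                for (i = 1; i <= n; i++)
                    if (names[i] !~ /^pass[0-9]+$/) dev = names[i]
            }
            print dev, model
        }
    '
}
```

Usage would be `camcontrol devlist | devlist_models`. If the model column shows the controller's name or a generic virtual disk instead of the real drive model, that is a hint the card is still sitting between the OS and the disks.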

This just emphasizes that backups are essential even with a proper HBA. No single server can be more than a single point of failure. With such physical control of your drives plus good, complete, and tested backups, you should be good.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
Assuming the firmware actually has the right stuff to get out of the way and "just be an HBA" then the real concern is the driver in use. The mps and mpr FreeBSD drivers have billions of run-hours behind them to back up their stability and the fact that they don't cave in under a heavy ZFS workload. The mrsas driver is less well-tested but is getting considered "mostly okay" now in 2021 - I believe that's what the H730(P) in HBA mode uses.

What you want to avoid is any "HBA mode" that ends up still loading the mfi or ciss RAID drivers; the latter especially is used by the HP SmartArray P-series cards.
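The driver guidance above can be restated as a small lookup. This is only a sketch: the function name `classify_driver` and the label wording are invented here, and the grouping simply mirrors what this post says about the FreeBSD drivers.

```shell
# Hypothetical helper: maps a FreeBSD storage driver name to the rough
# verdict given in this thread. The labels are illustrative, not official.
classify_driver() {
    case "$1" in
        mps|mpr)  echo "well-tested HBA driver" ;;
        mrsas)    echo "less tested, mostly okay" ;;
        mfi|ciss) echo "RAID driver, avoid" ;;
        *)        echo "unknown" ;;
    esac
}
```

For example, `classify_driver mrsas` prints `less tested, mostly okay`, and `classify_driver ciss` prints `RAID driver, avoid`.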
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,176
Best approach: Buy an HBA330, remove the H730P, install the HBA330, sell the H730P, pocket some change. HBA330 minis go for ~100 bucks and you keep all the moderately-useful iDRAC integration.

Sidenote: I hope you didn't pay Dell prices for the BOSS card.
 

Diff

Dabbler
Joined
May 9, 2020
Messages
33
Thank you @Heracles !

This post by @jgreco is the reference about this.
This is really helpful. Missed that one, bookmarking it!

Once you have configured your adapter for HBA mode, can you see the exact drive models with a command like camcontrol devlist?

Can you run SMART tests and see the detailed results?

If you can do both, it strongly suggests that TrueNAS does have physical control of the disks. It is still not impossible that the controller keeps messing things up, but you are in a better situation if these commands work.
It seems `camcontrol` is not part of Debian. I am trying to figure out how to get this tool on Debian, or are there analogs for Linux (non-FreeBSD-based OSes)?

I just kicked off a test and will check the results in 2 hours or so.

Code:
smartctl --test=long /dev/sdd
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.0-8-amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

Extended Background Self Test has begun
Please wait 112 minutes for test to complete.
Estimated completion time: Thu Sep 16 19:50:31 2021 PDT
Use smartctl -X to abort test


This just emphasizes that backups are essential even with a proper HBA. No single server can be more than a single point of failure. With such physical control of your drives plus good, complete, and tested backups, you should be good.
Totally makes sense!
 

Diff

Dabbler
Joined
May 9, 2020
Messages
33
Best approach: Buy an HBA330, remove the H730P, install the HBA330, sell the H730P, pocket some change. HBA330 minis go for ~100 bucks and you keep all the moderately-useful iDRAC integration.
I am still thinking about that, to be honest. It is still not clear to me whether the HBA330 works out of the box or still needs to be flashed into IT mode.
Also, for some reason the [Dell spec] states it only supports 8 drives, and the R7515 chassis I have has 12 bays. Does that mean it will limit me to 8 drives?
One last concern: the R7515 I just got is still under warranty (bought refurbished for a reasonable price), so swapping the new H730P for an HBA330 from eBay could be "taking chances". Not a big concern, but still.

Sidenote: I hope you didn't pay Dell prices for the BOSS card.
Side answer :wink:: Got that one used for about $70, plus 2 x Micron M600 240GB M.2 SSDs to use as a boot mirror.
 

Diff

Dabbler
Joined
May 9, 2020
Messages
33
Assuming the firmware actually has the right stuff to get out of the way and "just be an HBA" then the real concern is the driver in use. The mps and mpr FreeBSD drivers have billions of run-hours behind them to back up their stability and the fact that they don't cave in under a heavy ZFS workload. The mrsas driver is less well-tested but is getting considered "mostly okay" now in 2021 - I believe that's what the H730(P) in HBA mode uses.

What you want to avoid is any "HBA mode" that ends up still loading the mfi or ciss RAID drivers; the latter especially is used by the HP SmartArray P-series cards.
Those are very helpful details! Thank you @HoneyBadger!

Are there equivalent things I can look out for on TrueNAS SCALE (Debian-based)?
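On Linux the same question (which kernel driver has claimed the controller) can be checked with `lspci -k`, which prints a "Kernel driver in use:" line under each device. Below is a minimal sketch assuming pciutils is installed; `storage_drivers` is a hypothetical helper name, and the PCI class strings it matches are my assumption of the common ones. As far as I know, the Linux counterparts are `mpt3sas` for the LSI HBAs that FreeBSD drives with mps/mpr, `megaraid_sas` for the MegaRAID family (which includes the PERC H730), and `hpsa` for HP Smart Array, but treat that mapping as something to verify on your own system.

```shell
# Hypothetical helper: reads `lspci -k` output on stdin and prints
# "<driver>: <device line>" for storage controllers only.
# Assumption: controllers of interest carry one of the class names below.
storage_drivers() {
    awk '
        # Device header lines start with the PCI address (hex digits).
        /^[0-9a-fA-F]/ {
            storage = ($0 ~ /RAID bus controller|Serial Attached SCSI controller|SCSI storage controller/)
            dev = $0
        }
        # Indented detail line naming the driver bound to the current device.
        storage && /Kernel driver in use:/ {
            sub(/^.*Kernel driver in use:[ \t]*/, "")
            print $0 ": " dev
        }
    '
}
```

Usage would be `lspci -k | storage_drivers`. Seeing a RAID-family driver listed does not by itself prove the card is misbehaving in HBA mode; it only tells you which code path is in use.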
 

Diff

Dabbler
Joined
May 9, 2020
Messages
33
BTW, I decided to try loading TrueNAS Core 12.0-U5.1. At least in the installer it sees all disks (hmm, except the 2 x M600 SSDs on the BOSS card).

rpviewer-14.png


as well as
Code:
camcontrol devlist
returns
rpviewer-15.png


During installer boot it reports as mrsas, if I read this correctly:
rpviewer-16.png


Not sure whether that is a good sign or not; I will report what I see after the install.

UPDATE:

Installed TrueNAS Core 12.0-U5.1, and dmesg -a reports:

Code:
ses0: da0,pass0 in 'Drive Slot 0', SAS Slot: 2 phys at slot 0
ses0:  phy 0: SATA device
ses0:  phy 0: parent 500056b35663a5ff addr 500056b35663a5c0
ses0:  phy 1: SAS device type 0 phy 0
ses0:  phy 1: parent 0 addr 0
ses0: da1,pass1 in 'Drive Slot 3', SAS Slot: 2 phys at slot 3
ses0:  phy 0: SATA device
ses0:  phy 0: parent 500056b35663a5ff addr 500056b35663a5c3
ses0:  phy 1: SAS device type 0 phy 0
ses0:  phy 1: parent 0 addr 0
ses0: da2,pass2 in 'Drive Slot 6', SAS Slot: 2 phys at slot 6
ses0:  phy 0: SATA device
ses0:  phy 0: parent 500056b35663a5ff addr 500056b35663a5c6
da4 at umass-sim0 bus 0 scbus15 target 0 lun 0
da4: <IT1167B USB Flash Disk 0.00> Removable Direct Access SCSI-2 device
da4: Serial Number 0000000000000419
da4: 40.000MB/s transfers
da4: 30952MB (63389696 512 byte sectors)
da4: quirks=0x2<NO_6_BYTE>
ses0:  phy 1: SAS device type 0 phy 0
ses0:  phy 1: parent 0 addr 0
ses0: da3,pass3 in 'Drive Slot 9', SAS Slot: 2 phys at slot 9
ses0:  phy 0: SAS device type 1 phy 0 Target ( SSP )
ses0:  phy 0: parent 500056b35663a5ff addr 5000c500cbc4271d
ses0:  phy 1: SAS device type 0 phy 0
ses0:  phy 1: parent 0 addr 0
da3 at mrsas0 bus 1 scbus1 target 9 lun 0
da3: <SEAGATE ST1200MM0099 ST34> Fixed Direct Access SPC-4 SCSI device
da3: Serial Number WFK9G8DD
da3: 150.000MB/s transfers
da3: 1144641MB (2344225968 512 byte sectors, DIF type 2)
da0 at mrsas0 bus 1 scbus1 target 0 lun 0
da0: <ATA ST18000NM000J-2T SN01> Fixed Direct Access SPC-4 SCSI device
da0: Serial Number ZR5239E2
da0: 150.000MB/s transfers
da0: 17166336MB (35156656128 512 byte sectors)
da1 at mrsas0 bus 1 scbus1 target 3 lun 0
da1: <ATA ST18000NM000J-2T SN01> Fixed Direct Access SPC-4 SCSI device
da1: Serial Number ZR52F49C
da1: 150.000MB/s transfers
da1: 17166336MB (35156656128 512 byte sectors)
da2 at mrsas0 bus 1 scbus1 target 6 lun 0
da2: <ATA ST18000NM000J-2T SN01> Fixed Direct Access SPC-4 SCSI device
da2: Serial Number ZR51Y5S6
da2: 150.000MB/s transfers
da2: 17166336MB (35156656128 512 byte sectors)


Does this mean it uses the mrsas driver?

I am able to see SMART info:
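In the dmesg output above, the lines of the form `da3 at mrsas0 bus 1 ...` are the ones that name the driver each disk is attached through. A minimal sketch for pulling just those out follows; the helper name `disk_drivers` is made up, and it assumes the `<disk> at <driver><unit> bus ...` layout seen in the listing.

```shell
# Hypothetical helper: reads FreeBSD dmesg output on stdin and prints
# "<disk> <driver>" for every "daN at <driver>N bus ..." attach line.
disk_drivers() {
    awk '
        $2 == "at" && $4 == "bus" && $1 ~ /^a?da[0-9]+$/ {
            drv = $3
            sub(/[0-9]+$/, "", drv)   # strip the unit number: mrsas0 -> mrsas
            print $1, drv
        }
    ' | sort -u
}
```

Usage would be `dmesg -a | disk_drivers`. Fed the listing above, the four drive-bay disks come out as attached via mrsas, while da4 (the USB flash disk) shows up under umass-sim.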

Code:
root@truenas[~]# smartctl -a /dev/da3
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST1200MM0099
Revision:             ST34
Compliance:           SPC-4
User Capacity:        1,200,243,695,616 bytes [1.20 TB]
Logical block size:   512 bytes
Formatted with type 2 protection
8 bytes of protection information per logical block
LU is fully provisioned
Rotation Rate:        10000 rpm
Form Factor:          2.5 inches
Logical Unit id:      0x5000c500cbc4271f
Serial number:        WFK9G8DD
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Thu Sep 16 21:33:35 2021 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature:     29 C
Drive Trip Temperature:        60 C

Accumulated power on time, hours:minutes 182:14
Manufactured in week 09 of year 2021
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  58
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  219
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 733121568
  Blocks received from initiator = 2880043812
  Blocks read from cache and sent to initiator = 5980132
  Number of read and write commands whose size <= segment size = 361695
  Number of read and write commands whose size > segment size = 4

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 182.23
  number of minutes until next internal SMART test = 2

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   723474815        0         0  723474815          0        376.227           0
write:         0        0         0         0          0       1497.922           0
verify:  9567921        0         0   9567921          0          4.996           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                  96     180                 - [-   -    -]
# 2  Background short  Completed                  96      39                 - [-   -    -]
# 3  Background short  Completed                  96      39                 - [-   -    -]
# 4  Reserved(7)       Completed                  64      37                 - [-   -    -]
# 5  Background short  Completed                  96      35                 - [-   -    -]
# 6  Reserved(7)       Completed                  64      10                 - [-   -    -]
# 7  Background short  Completed                  96       9                 - [-   -    -]

Long (extended) Self-test duration: 6723 seconds [112.0 minutes]


TrueNAS shows all the SMART info in the web UI under Storage > Disks:
truenas-da3.png
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,176
I am still thinking about that, to be honest. It is still not clear to me whether the HBA330 works out of the box or still needs to be flashed into IT mode.
It's IT mode only.

Also, for some reason the [Dell spec] states it only supports 8 drives, and the R7515 chassis I have has 12 bays. Does that mean it will limit me to 8 drives?
One last concern: the R7515 I just got is still under warranty (bought refurbished for a reasonable price), so swapping the new H730P for an HBA330 from eBay could be "taking chances". Not a big concern, but still.
It'll work with many more disks (128?), but it needs an expander backplane. The same goes for the H730P, though. If you have a backplane that supports 12 SAS/SATA disks, you should be fine. However, some bays may be NVMe-only. For example, the R6515 with the 10-disk backplane only takes up to 8 SAS/SATA disks, to cut out the expensive expander; the last two bays are NVMe-only.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,176
Does this mean it uses the mrsas driver?

Looks like it's using mrsas, which should be nominally OK. It's hard to know for sure, though.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
I suspect that card is old enough that it doesn't have the enhanced JBOD functionality. Basically it has some RAM and a battery. In the newer cards they changed the JBOD disks to allow the controller to cache the writes, and it's that little trick that wrecks ZFS's awareness of the state of the I/O operation. Check whether you can switch each disk from "write-back" to "write-thru" or something like that.
 