11.1-U5 smartd Issues

Joined
Dec 29, 2014
Messages
1,135
I have two FreeNAS servers, both running 11.1-U5, and all the key features of FreeNAS work just fine. Both hosts are Cisco UCS C240-M3S servers. The primary has dual E5-2660 v2 CPUs and 128GB of ECC RAM, and the secondary has dual E5-2620 CPUs and 64GB of ECC RAM. Both servers have 24 internal 2.5" SAS/SATA bays and dual internal SD card slots. The internal drives are controlled by an LSI 9271 in JBOD mode, and each server also has an LSI 9207-8E for external connections. There are 4 x Intel i350 NICs on the motherboard, of which I am using two in an LACP LAG connected to an HP 2848 switch. Each server also has a Chelsio T520-CR 10G NIC facing the storage network in an LACP LAG connected to an HP/H3C 10G switch.

The problem is that the /usr/local/etc/smartd.conf generated by /etc/ix.rc.d/ix-smartd does not include the devices on the internal controller. I think it used to include them, but I honestly can't remember when it stopped.

camcontrol devlist -v from primary
Code:
scbus0 on mps0 bus 0:
<>								 at scbus0 target -1 lun ffffffff ()
scbus1 on mrsas0 bus 0:
<>								 at scbus1 target -1 lun ffffffff ()
scbus2 on mrsas0 bus 1:
<CISCO UCS 240 0809>			   at scbus2 target 26 lun 0 (ses0,pass0)
<ATA ST91000640NS CC03>			at scbus2 target 27 lun 0 (pass1,da0)
<ATA ST91000640NS CC03>			at scbus2 target 28 lun 0 (pass2,da1)
<ATA ST91000640NS CC03>			at scbus2 target 29 lun 0 (pass3,da2)
<ATA ST91000640NS CC03>			at scbus2 target 30 lun 0 (pass4,da3)
<ATA ST91000640NS CC03>			at scbus2 target 31 lun 0 (pass5,da4)
<ATA ST91000640NS CC03>			at scbus2 target 32 lun 0 (pass6,da5)
<ATA ST91000640NS CC03>			at scbus2 target 33 lun 0 (pass7,da6)
<ATA ST91000640NS CC03>			at scbus2 target 34 lun 0 (pass8,da7)
<ATA ST91000640NS CC02>			at scbus2 target 46 lun 0 (pass9,da8)
<ATA ST91000640NS CC02>			at scbus2 target 47 lun 0 (pass10,da9)
<ATA ST91000640NS CC03>			at scbus2 target 48 lun 0 (pass11,da10)
<ATA ST91000640NS BK03>			at scbus2 target 49 lun 0 (pass12,da11)
<ATA ST91000640NS BK03>			at scbus2 target 50 lun 0 (pass13,da12)
<ATA ST91000640NS BK03>			at scbus2 target 51 lun 0 (pass14,da13)
<ATA ST91000640NS BK03>			at scbus2 target 52 lun 0 (pass15,da14)
<ATA ST91000640NS BK03>			at scbus2 target 53 lun 0 (pass16,da15)
<ATA ST91000640NS BK03>			at scbus2 target 54 lun 0 (pass17,da16)
<TOSHIBA MBF2300RC 5704>		   at scbus2 target 55 lun 0 (pass18,da17)
<TOSHIBA AL13SEB300 5706>		  at scbus2 target 56 lun 0 (pass19,da18)
<>								 at scbus2 target -1 lun ffffffff ()
scbus3 on camsim0 bus 0:
<>								 at scbus3 target -1 lun ffffffff ()
scbus4 on umass-sim0 bus 0:
<HV Hypervisor_0 1.01>			 at scbus4 target 0 lun 0 (pass20,da19)
scbus-1 on xpt0 bus 0:
<>								 at scbus-1 target -1 lun ffffffff (xpt0)


camcontrol devlist -v from secondary
Code:
scbus0 on mps0 bus 0:
<HP DG0300BALVP HPD4>			  at scbus0 target 8 lun 0 (pass0,da0)
<HP DG0300BALVP HPD4>			  at scbus0 target 9 lun 0 (pass1,da1)
<HP DG0300BALVP HPD4>			  at scbus0 target 10 lun 0 (pass2,da2)
<HP EG0300FAWHV HPDF>			  at scbus0 target 11 lun 0 (pass3,da3)
<HP DG0300BALVP HPD4>			  at scbus0 target 12 lun 0 (pass4,da4)
<HP DG0300FAMWN HPDF>			  at scbus0 target 13 lun 0 (pass5,da5)
<HP DG0300BALVP HPD4>			  at scbus0 target 14 lun 0 (pass6,da6)
<HP DG0300BALVP HPD4>			  at scbus0 target 15 lun 0 (pass7,da7)
<HP DG0300BALVP HPD4>			  at scbus0 target 16 lun 0 (pass8,da8)
<HP DG0300BALVP HPD4>			  at scbus0 target 17 lun 0 (pass9,da9)
<HP EG0300FAWHV HPDE>			  at scbus0 target 18 lun 0 (pass10,da10)
<HP EG0300FAWHV HPDE>			  at scbus0 target 19 lun 0 (pass11,da11)
<HP DG0300BALVP HPD4>			  at scbus0 target 20 lun 0 (pass12,da12)
<HP DG0300BALVP HPD4>			  at scbus0 target 21 lun 0 (pass13,da13)
<HP EG0300FBDSP HPD6>			  at scbus0 target 22 lun 0 (pass14,da14)
<HP DG0300BALVP HPD4>			  at scbus0 target 23 lun 0 (pass15,da15)
<HP DG0300BALVP HPD4>			  at scbus0 target 24 lun 0 (pass16,da16)
<HP EG0300FAWHV HPDE>			  at scbus0 target 25 lun 0 (pass17,da17)
<HP DG0300BALVP HPD4>			  at scbus0 target 26 lun 0 (pass18,da18)
<HP DG0300BALVP HPD4>			  at scbus0 target 27 lun 0 (pass19,da19)
<HP DG0300FAMWN HPDF>			  at scbus0 target 28 lun 0 (pass20,da20)
<HP EG0300FBLSE HPD8>			  at scbus0 target 29 lun 0 (pass21,da21)
<HP DG0300BALVP HPD4>			  at scbus0 target 30 lun 0 (pass22,da22)
<HP EG0300FAWHV HPDF>			  at scbus0 target 31 lun 0 (pass23,da23)
<HP DG0300BALVP HPD4>			  at scbus0 target 32 lun 0 (pass24,da24)
<HP D2700 SAS AJ941A 0149>		 at scbus0 target 33 lun 0 (ses0,pass25)
<HP DG0300BALVP HPD4>			  at scbus0 target 34 lun 0 (pass26,da25)
<HP DG0300BALVP HPD4>			  at scbus0 target 35 lun 0 (pass27,da26)
<HP DG0300BALVP HPD4>			  at scbus0 target 36 lun 0 (pass28,da27)
<HP EG0300FAWHV HPDF>			  at scbus0 target 37 lun 0 (pass29,da28)
<HP DG0300BALVP HPD4>			  at scbus0 target 38 lun 0 (pass30,da29)
<HP DG0300FAMWN HPDF>			  at scbus0 target 39 lun 0 (pass31,da30)
<HP DG0300BALVP HPD4>			  at scbus0 target 40 lun 0 (pass32,da31)
<HP DG0300BALVP HPD4>			  at scbus0 target 41 lun 0 (pass33,da32)
<HP DG0300BALVP HPD4>			  at scbus0 target 42 lun 0 (pass34,da33)
<HP DG0300BALVP HPD4>			  at scbus0 target 43 lun 0 (pass35,da34)
<HP EG0300FAWHV HPDE>			  at scbus0 target 44 lun 0 (pass36,da35)
<HP EG0300FAWHV HPDE>			  at scbus0 target 45 lun 0 (pass37,da36)
<HP DG0300BALVP HPD4>			  at scbus0 target 46 lun 0 (pass38,da37)
<HP DG0300BALVP HPD4>			  at scbus0 target 47 lun 0 (pass39,da38)
<HP EG0300FBDSP HPD6>			  at scbus0 target 48 lun 0 (pass40,da39)
<HP DG0300BALVP HPD4>			  at scbus0 target 49 lun 0 (pass41,da40)
<HP DG0300BALVP HPD4>			  at scbus0 target 50 lun 0 (pass42,da41)
<HP EG0300FAWHV HPDE>			  at scbus0 target 51 lun 0 (pass43,da42)
<HP DG0300BALVP HPD4>			  at scbus0 target 52 lun 0 (pass44,da43)
<HP DG0300BALVP HPD4>			  at scbus0 target 53 lun 0 (pass45,da44)
<HP DG0300FAMWN HPDF>			  at scbus0 target 54 lun 0 (pass46,da45)
<HP EG0300FBLSE HPD8>			  at scbus0 target 55 lun 0 (pass47,da46)
<HP DG0300BALVP HPD4>			  at scbus0 target 56 lun 0 (pass48,da47)
<HP EG0300FAWHV HPDF>			  at scbus0 target 57 lun 0 (pass49,da48)
<HP DG0300BALVP HPD4>			  at scbus0 target 58 lun 0 (pass50,da49)
<HP D2700 SAS AJ941A 0149>		 at scbus0 target 59 lun 0 (ses1,pass51)
<>								 at scbus0 target -1 lun ffffffff ()
scbus1 on mrsas0 bus 0:
<>								 at scbus1 target -1 lun ffffffff ()
scbus2 on mrsas0 bus 1:
<CISCO UCS 240 0809>			   at scbus2 target 8 lun 0 (ses2,pass52)
<SEAGATE ST300MM0006 A005>		 at scbus2 target 11 lun 0 (pass53,da50)
<SEAGATE ST300MM0006 A005>		 at scbus2 target 12 lun 0 (pass54,da51)
<HITACHI HUSSL4010BSS600 A110>	 at scbus2 target 49 lun 0 (pass55,da52)
<HITACHI HUSSL4010BSS600 A110>	 at scbus2 target 50 lun 0 (pass56,da53)
<>								 at scbus2 target -1 lun ffffffff ()
scbus3 on camsim0 bus 0:
<>								 at scbus3 target -1 lun ffffffff ()
scbus4 on umass-sim0 bus 0:
<HV Hypervisor_0 1.01>			 at scbus4 target 0 lun 0 (pass57,da54)
scbus-1 on xpt0 bus 0:
<>								 at scbus-1 target -1 lun ffffffff (xpt0)


Smartd does start on the secondary, because it finds the devices in the external HP D2700 connected via the 9207-8E. I don't reboot these servers often, so I can hand-edit smartd.conf to get it going, but that is less than ideal. I don't know if this is related, but I have noticed that I now get the following messages during boot on both systems.

Code:
random: unblocking device.
run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config
run_interrupt_driven_hooks: still waiting after 120 seconds for xpt_config
mrsas0: Initiating Target RESET because of SCSI IO timeout!
mrsas0: Task management NOT SUPPORTED for CAM target:8
mrsas0: target reset FAIL!!
mrsas0: Initiaiting OCR because of TM FAILURE!
run_interrupt_driven_hooks: still waiting after 180 seconds for xpt_config


My issue sounds like it might be related to this one: https://redmine.ixsystems.com/issues/35998

The cards in question are RAID cards that use the mrsas driver. The drives themselves are defined as JBOD, and FreeNAS is able to use them without any issue. Smartd just seems to hate them. :-(
 
Joined
Dec 29, 2014
Messages
1,135
I know it is bad form to follow up on your own post, but I was surprised not to get any response to this. Should I assume this issue is being addressed? I have seen some other posts discussing issues with smartd on JBOD drives behind LSI mrsas controllers. Is a fix expected in 11.1-U6 or 11.2?
 
dlavigne

Guest
Not that I'm aware of. It is probably worth reporting at bugs.freenas.org.
 
Joined
Dec 29, 2014
Messages
1,135
Apparently mine is a duplicate of this one: https://redmine.ixsystems.com/issues/35998

It does appear that the moment you say the word "RAID", IX loses interest even if the card is in HBA mode. :-( I wonder what happens if you say the word "RAID" three times??
 
Joined
Dec 29, 2014
Messages
1,135
I removed the LSI 9271 RAID cards, and replaced them with LSI 9207-8i controllers. Now all seem to be happy.
 

short-stack

Explorer
Joined
Feb 28, 2017
Messages
80
I know this is several months old, but I just migrated my FreeNAS to newer hardware and I am seeing this same issue. I'm running on an HP ProLiant DL380p with a P420i controller in HBA mode. FreeNAS sees all of the disks fine, and sees that SMART is available and enabled on them, but smartd.conf remains empty.

I don't really want to replace the controller card, so I guess I'll just have to keep a manual copy of what should be in the conf and put the entries in there by hand after a reboot?
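For anyone in the same spot, here is a minimal sketch of regenerating the entries instead of keeping a manual copy. It is not what ix-smartd does. It only turns `camcontrol devlist` output into one smartd.conf line per daN disk. The function name `gen_smartd_entries` is made up for illustration, and the bare `-a` directive is an assumption; drives behind some controllers may need extra directives (for example a `-d` device type), so check smartd.conf(5) first.

```shell
#!/bin/sh
# Hypothetical helper: read `camcontrol devlist` output on stdin and
# print one smartd.conf device line per daN peripheral.
gen_smartd_entries() {
    # Keep only the parenthesised peripheral list, e.g. "pass1,da0",
    # split it on commas, and keep just the daN names.
    sed -n 's/.*(\(.*\))[[:space:]]*$/\1/p' |
        tr ',' '\n' |
        grep -E '^da[0-9]+$' |
        while read -r dev; do
            # "-a" (monitor all attributes) is an assumed default.
            printf '/dev/%s -a\n' "$dev"
        done
}
```

Running `camcontrol devlist | gen_smartd_entries` prints the candidate lines, which can then be reviewed and pasted into /usr/local/etc/smartd.conf after a reboot. It lists every daN device, including a USB boot device if there is one, so remove anything that should not be monitored.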
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
For best results, do not use a RAID card with ZFS. You will probably lose the pool if you keep using it. Good luck.
 