6 of 11 drives resilvering and then crashing with reboot

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
I ordered an LSI SAS controller with an IBM expander. I read that I could just hook my drives up to them and FreeNAS would see them with no issue. Once I booted up, 4 of the 11 drives began resilvering. This went to about 2% and then crashed the computer.

I removed the drives from the new controller/expander and put them back on the motherboard SATA ports and the Asmedia controller. The thing is, I don't remember the exact order that each specific drive went to each specific SATA port (I was under the impression it wasn't supposed to make a difference). Now when booting up, 6 of the 11 drives are resilvering. It crashed around 1%, and that's when I turned it off and kept it off. Now I'm writing this post.

I just ordered a SuperMicro server motherboard, ECC RAM, and an Intel Xeon CPU. I didn't have time to record any info, but I did take the attached picture. Hardware info should be in my signature. I was running 11.2-U3, but that started crashing when I started doing transfers from macOS (for no reason I can explain). I rolled back to 11.1-U5 and it was running stable until I added the SAS controller/expander. Please help.
 

Attachments

  • A5BE809C-6DDC-4437-8C26-A1BF7E8AA028.jpeg

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
When I installed 11.1-U5 - I did it from scratch and did not upload a configuration file. I never got around to adding back any tunables. You can see I joined in Dec 2013. I have been running FreeNAS stable/successfully for more than 5 years on this same hardware. What the heck is going on?
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
Has the SAS controller firmware been flashed to IT mode?
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
Yes, the SAS controller was flashed to latest BIOS/firmware before I did anything.
 

2nd-in-charge

Explorer
Joined
Jan 10, 2017
Messages
94
Yes, the SAS controller was flashed to latest BIOS/firmware before I did anything.
What controller and what firmware?
I would disconnect the drives and run memtest. Note that all resilvering drives have checksum errors.
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
LSI SAS 9207-8i SATA/SAS 6Gb/s PCI-E 3.0 Host Bus Adapter IT Mode SAS9207-8i US and
IBM 46M0997 ServeRAID Expansion Adapter 16-Port SAS Expander
Latest firmware from online- it was version 20 from 2015/2016.
Running a memory test is the one thing I haven't done yet, and I can, but my new (used) SuperMicro server motherboard is being shipped now. I already have ECC memory and two Intel Xeon CPUs waiting for it. I'm thinking I should just wait for the motherboard to arrive.

I feel that my particular setup wants the drives in a certain order and that involves connecting them to each port in the right order. I say this because I have changed HDDs in the past with this setup. I used to have 4 TB drives and I upgraded them all to 8 TB. I remember having to change failing drives, and if I put the drives back out of order then it would resilver. Back then it was okay because it was only one drive at a time. This time it's scary because it's trying to do 6 at the same time. I want to figure out which 5 drives are okay right now, tag them, and then try putting the other 6 in a different order (on different SATA ports). I'm not touching anything though until I hear from more people.
 

2nd-in-charge

Explorer
Joined
Jan 10, 2017
Messages
94
IBM 46M0997 ServeRAID Expansion Adapter 16-Port SAS Expander
Could be a problem with this? Or the breakout cables to SATA drives?
Updating firmware on those is no walk in the park.
https://forums.servethehome.com/ind...ders-60-ibm-lsi-chip-intel-alternative.11365/

I feel that my particular setup wants the drives in a certain order and that involves connecting them to each port in the right order.
This just should not be happening with direct attached drives.
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
I did not update BIOS/firmware on the IBM Expander card. I'll look into this now.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
Latest firmware from online- it was version 20 from 2015/2016.
You still haven't answered the original question. There are 2 types of firmware available, IR and IT. Which version did you flash your card with?
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
IT
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
This pool was probably originally created with FreeNAS version 9 and upgraded through the years. I don't think I created the original pool via the GUI, so is it possible it's not retaining drive information/location unless the drives are connected to the SATA ports in a specific order?
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
This pool was probably originally created with FreeNAS version 9 and upgraded through the years. I don't think I created the original pool via the GUI, so is it possible it's not retaining drive information/location unless the drives are connected to the SATA ports in a specific order?
That shouldn't matter. With so many checksum errors I'd look to the cabling before I went any further.
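
If you can save the `zpool status` text before the next crash, a short script can pull out which members are reporting checksum errors, which may help decide which cables or ports to check first. A minimal Python sketch follows. The sample output and the pool/device names in it are made up for illustration (they are not taken from the attached picture), and the parser assumes the usual NAME/STATE/READ/WRITE/CKSUM column layout.

```python
import re

# Hypothetical sample text, NOT the poster's real output.
SAMPLE_STATUS = """\
  pool: tank
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            ada0p2  ONLINE       0     0     0
            ada1p2  ONLINE       0     0    41  (resilvering)
            ada2p2  ONLINE       0     0     3  (resilvering)

errors: No known data errors
"""

# One config row: name, state, then the READ / WRITE / CKSUM counters.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)"
)


def devices_with_checksum_errors(status_text):
    """Return {name: cksum_count} for every row whose CKSUM column is not 0."""
    hits = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match and match.group(5) != "0":
            hits[match.group(1)] = match.group(5)
    return hits


if __name__ == "__main__":
    print(devices_with_checksum_errors(SAMPLE_STATUS))
```

With the made-up sample above this reports `ada1p2` and `ada2p2`. If the flagged members all hang off the same cable or controller, that would point toward the cabling rather than the drives themselves.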
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
I just flashed the firmware on the IBM Expander card- it's on 634a now. The version I had was old and known to cause problems. A problem I read about is that you won't be able to see certain drives. I never had that problem however, so I'm not sure if that was the root of the problem. I'm going to wait to hear some more suggestions before I turn on the FreeNAS.
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
I'm ordering new cables even though the cables I'm using are brand new. I'll order from a different vendor.
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
I'm going to have the new hardware tomorrow. Can someone please suggest what to do once it's installed? I don't want to just turn it on with no plan.
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
@Chris Moore Chris- do you mind giving me some advice or perhaps you could reach out to someone that could be helpful here?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Latest firmware from online- it was version 20 from 2015/2016.
When you say that you updated the firmware, are you talking about the firmware in the SAS expander?

I really don't understand what went wrong with your configuration. There must have been something strange with the previous configuration before you moved to the SAS expander. It should have been seamless. I moved my pool from a pair of SAS controllers with six drives on each controller to a SAS expander configuration and there was no trouble with that.
What kind of controller were your drives connected to before?
 

DCswitch

Explorer
Joined
Dec 20, 2013
Messages
58
@Chris Moore I don't know what went wrong either, but it got scary for me. I thought I had a backup, but it turns out I didn't. I am now building a 2nd FreeNAS as my backup, and I'll be keeping that offline in another location to protect against fire/theft/ransomware/etc. so I never have to feel stress like this again. I need to make sure I do everything right from here on to give myself the best chance of not losing a lifetime of data.

The previous configuration used the 6 SATA connections on my motherboard and Asmedia controllers for the other 5 drives. It worked great for many years. I knew I was using non-ECC memory and cheap SATA controllers, so I finally decided to do it right. Moving over to the SAS controller is where all the trouble started. When I made the change I updated the firmware only on the LSI controller, not on the IBM expander (I didn't know that was possible), and that is when all the problems started. As of now, I have updated the IBM expander as well, to the latest, 634a. I know many people said they had lots of problems from not upgrading the expander firmware, so that alone could fix this. However, I remember moving a drive to the wrong SATA port in the past and then having FreeNAS do a long resilver as if the drive was foreign to it. It makes me suspect that my build needs to have the drives in a specific port order.

Moving forward, I don't want to turn it on until I have all the hardware upgraded and a plan in place. I have already received one of the SuperMicro motherboards and I'm waiting on the 2nd one to arrive within a few days. The new cables will be here Friday, so I can begin testing all the new (used) hardware thoroughly before I hook up any of the drives. Do you know if SuperMicro has a diagnostic disk for their motherboards? I really appreciate your input. Thank you.
 

pro lamer

Guru
Joined
Feb 16, 2018
Messages
626
makes me suspect that my build needs to have the drives in a specific port order.
The screenshot you provided suggests it too.

Edit: three of the drives that don't use gptid are affected (but more of the drives may need to be kept in a particular order).
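
To make the gptid point concrete: pool members listed by a raw device name (such as `ada3p2`) rather than a `gptid/...` label are the ones to look at. Here is a rough Python sketch that picks those out of saved `zpool status` text. The sample output, pool name, and device names are invented for illustration and are not read from the screenshot. The leaf detection (a row is a disk if the next row is not indented deeper) is a simplifying assumption.

```python
import re

# Hypothetical sample text, NOT the poster's real output.
SAMPLE_STATUS = """\
  pool: tank
 state: DEGRADED
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            DEGRADED     0     0     0
          raidz3-0                                      DEGRADED     0     0     0
            gptid/11111111-aaaa-11e3-9c2b-001122334455  ONLINE       0     0     0
            ada3p2                                      ONLINE       0     0    12  (resilvering)
            da1p2                                       ONLINE       0     0     7  (resilvering)

errors: No known data errors
"""

ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+\S+\s+\S+\s+\S+"
)


def config_rows(status_text):
    """Return (indent, name) for each row of the config table."""
    rows = []
    in_table = False
    for line in status_text.splitlines():
        if line.strip().startswith("NAME"):
            in_table = True          # header row marks the start of the table
            continue
        if not in_table:
            continue
        if not line.strip():
            in_table = False         # a blank line ends the table
            continue
        match = ROW.match(line)
        if match:
            indent = len(line) - len(line.lstrip())
            rows.append((indent, match.group(1)))
    return rows


def non_gptid_members(status_text):
    """Return leaf devices whose name does not start with 'gptid/'."""
    rows = config_rows(status_text)
    result = []
    for i, (indent, name) in enumerate(rows):
        next_indent = rows[i + 1][0] if i + 1 < len(rows) else -1
        is_leaf = next_indent <= indent   # pool/vdev rows have deeper children
        if is_leaf and not name.startswith("gptid/"):
            result.append(name)
    return result


if __name__ == "__main__":
    print(non_gptid_members(SAMPLE_STATUS))
```

On the invented sample this lists `ada3p2` and `da1p2`. On FreeNAS, `glabel status` can then be used to see which gptid label, if any, belongs to each of those partitions.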

There are ways to escape from this but I'm not experienced enough to know whether it's a good moment to do this...

Sent from my phone
 