Upgrade to Corral almost ended up as a disaster

Status
Not open for further replies.

Fritzolio

Explorer
Joined
Oct 12, 2015
Messages
63
OK, maybe I tried to do too much at one time. The box in question is a Supermicro SM846 with an X8DTE-F MB, two L5630s, and 96GB of RAM (ECC RDIMM). The box has been up and running for about 2 years without any problems. I decided to make some major changes to make the box a bit less power hungry. I replaced the MB with a Supermicro X9SCM-F, the CPU with an E3-1230v2, and the RAM with 8GB of DDR3 ECC UDIMMs (2 x 4GB). I also replaced the boot drive with a 64GB SATADOM. After the transplant I did a fresh install of FreeNAS 9.10 and loaded my config back in. All seemed well, and once I was satisfied that it was, I proceeded to upgrade to Corral.

All went well, but after logging in I noticed Corral was reporting a S.M.A.R.T. error on one of the drives in my main vdev (a 10-disk RAIDZ2 using 3TB drives). I thought it was strange that this would pop up right after the upgrade, but stranger things have happened. I did have a spare in the box, so I proceeded to offline the "bad" drive and replace it with the spare. Corral would not let me do this, something about there being something on the drive, yada yada yada. This put me in panic mode. Not that I was afraid of losing the data, but everything was up to date and well organized, and it would take me a long time to get back to square one if I lost it all. So in panic mode I undid the changes to the box and put the old HW back in. The old boot drive was intact, so all I had to do was boot her up. 9.10 reported the drive I offlined as missing, as expected (I also removed the "bad" drive while I was at it). 9.10 allowed me to replace it with the spare and the resilvering went without issue, so now I'm back where I began.

I spent all day testing the MB, the SATADOM, the RAM and the "bad" HD with every tool I have available, and no problems have surfaced so far. At this point I'm kinda paranoid about upgrading to Corral again, since I don't know what caused the above problem. I do know there are zero problems in the S.M.A.R.T. logs for the drive.

Also, the LSI HBA was moved to the new MB. I've never had a problem with it and have no reason to think it is suspect.

In retrospect, maybe I should have upgraded to Corral first and then, after extensive testing, made the HW changes. At this point I'm not sure how I'm going to proceed, but I will take my time this time.

One outstanding thing about FreeNAS is that it's easy to handle situations like this. Kudos to the team for this.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
I think your biggest issue was going from 96GB of RAM with a 30TB pool to 8GB of RAM with a 30TB pool.
 

Fritzolio

Explorer
Joined
Oct 12, 2015
Messages
63
Thank you, sir. That would indeed explain it. A little forethought on my part could have averted this crisis.
 