DEGRADED pool - too many errors

Status
Not open for further replies.

Osiris

Contributor
Joined
Aug 15, 2013
Messages
148
I have two RAIDZ3 pools on FN11, each with 11 disks (so 3 parity disks) plus 1 hot spare. Each disk is a 4TB WD Red.
All of a sudden, I'm getting this on 1 of my pools.
Apparently the hotspare was automatically used.
Only 1 pool is in error; the other pool, with the exact same setup, is fine. Access to this second pool is waaaay less, tbh.
I'm suspecting memory issues (non-ECC) and I'm going to change my hardware setup to Intel Xeon + tons of ECC RAM.
The question I have is ... Can I import a degraded pool into a new system and redo the scrubs there?
Another thing: is smartctl data gone after a reboot? I'm getting no output for any disk, although regular short & long SMART tests are scheduled.

upload_2017-9-12_11-28-25.png

+ a list of 10 files in error

This is my second pool:

upload_2017-9-12_11-44-41.png


That, or my RAID controller (Adaptec RAID 6805T/6805TQ) + Intel RAID expander is failing.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Oh dear. Good news is you do have a lot of redundancy. Bad news is you might need it.

Yes. You should be able to resume a scrub on new hardware.

I'd suggest shutting down, running memtest... and then retrying.

Yes, you can shut down during a resilver.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

Osiris

Contributor
Joined
Aug 15, 2013
Messages
148
Adaptec RAID controllers are explicitly discouraged? How, when, where, why?
It's been doing its job for 3 years now without hiccups.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
The hardware recommendations guide, for starters.
 
Joined
Apr 9, 2015
Messages
1,258
Yep, hardware RAID can futz with a lot of things, including but not limited to SMART reporting.

That is the reason hardware RAID has been discouraged for quite a while. My system has been up and running since 5-2015; I was reading up on a lot of this before then, and it was already in the guide well before that point. So was ECC RAM. On top of that, the AMD CPU shown for that system in your signature is not recommended either.


Not saying things won't work, but when they don't, it could cause more issues that you will need to ferret out.
 

Osiris

Contributor
Joined
Aug 15, 2013
Messages
148
I'm not using hardware raid.
There is no hardware raid.
It doesn't exist.
Hardware raid is just software raid hardcoded on a chip which they sell to suckers.

I'm just using a raid controller to access 24 sas disks.
I'm using RAIDZ3.

I've just read the recommendations and indeed, you're promoting LSI material.
All that hardware besides the disks was recovered from old systems.
I wanted to test FN for a while, but stuck to it 'coz I loved it.
I'll now replace the mainboard, CPU and memory with proper stuff.

However, I wanted to keep the raid controller.
I don't feel like spending another 500 euros/dollars.
If that's not possible, coz it's a piece of crap, I'll replace it anyway.

Thanks for all the assistance and recommendations, but let's just assume for a minute that I don't have a money-crapping donkey and that I want to recover my data.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
It's not just a matter of hardware RAID, it's the drivers. Adaptec has a very poor record of delivering reliable FreeBSD drivers.

don't feel like spending another 500 euros/dollars.
Sell your controller, buy an LSI SAS 9211, pocket 300 bucks. It's win-win (assuming your price is realistic).
 

rs225

Guru
Joined
Jun 28, 2014
Messages
878
Shut down and run a memtest. Re-seat all cables. The odds are very high that this is your expander, adapter, or a shared cable. Bad memory usually doesn't give checksum errors across every disk.

Do not add any new spares or attempt to replace any drives. You can move the pool to other hardware if you have it.
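
A rough way to check the "errors across every disk" point is to count how many leaf devices in the `zpool status` output show a non-zero CKSUM value. The sketch below is only an illustration in Python: the function name `checksum_error_spread`, the sample text, and the pool and device names (`tank`, `da0`, ...) are made up, and it assumes the usual NAME/STATE/READ/WRITE/CKSUM column layout with plain integer counters.

```python
# Prefixes of grouping vdevs that are not physical disks.
VDEV_PREFIXES = ("raidz", "mirror", "spare-", "replacing-")


def checksum_error_spread(status_text):
    """Return (disks_with_cksum_errors, total_disks) from `zpool status` text."""
    seen_header = False
    pool_row_skipped = False
    total = 0
    with_errors = 0
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            seen_header = True
            continue
        if not seen_header or len(fields) < 5:
            continue
        name, cksum = fields[0], fields[4]
        if not cksum.isdigit():
            # Not a device row (e.g. a spare listed as "INUSE currently in use").
            continue
        if not pool_row_skipped:
            # The first row after the header is the pool itself.
            pool_row_skipped = True
            continue
        if name.startswith(VDEV_PREFIXES):
            continue
        total += 1
        if int(cksum) > 0:
            with_errors += 1
    return with_errors, total


# Made-up sample, NOT taken from the screenshots in this thread.
SAMPLE = """
  pool: tank
 state: DEGRADED
config:

    NAME        STATE     READ WRITE CKSUM
    tank        DEGRADED     0     0     0
      raidz3-0  DEGRADED     0     0     0
        da0     ONLINE       0     0     5
        da1     ONLINE       0     0     7
        da2     ONLINE       0     0     0
"""

print(checksum_error_spread(SAMPLE))
```

If nearly every disk shows checksum errors, a shared component (controller, expander, cable) is a more likely suspect than any single drive. The function only reads text, so it changes nothing on the pool.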
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
You're also running an awful lot of storage for 16GB of RAM.
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
You should be able to resume a scrub on new hardware

Assuming the RAID controller hasn't done something funky to the drives.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
assume for a minute that I don't have a money-crapping donkey
There's a very widespread, but curious and usually false, assumption that proper hardware for FreeNAS is expensive. I wonder where it comes from. A suitable disk controller costs around US$100.
 

Osiris

Contributor
Joined
Aug 15, 2013
Messages
148
There's a very widespread, but curious and usually false, assumption that proper hardware for FreeNAS is expensive. I wonder where it comes from. A suitable disk controller costs around US$100.
Not one that can support 24 drives, imho.

Anyway, thanks for all the advice.
The Adaptec RAID controller, the Intel RAID expander, the cables, memory, ... Lots of things to check.
Luckily, up till now, I can access my data like nothing happened.

On the only-16-GB remark: yes. It was the maximum for the mainboard.
Like I said, it's old junk. I had it in a PC for a couple of years, then 5 years in a Linux server, and now it's in its 4th year in a FreeNAS system.
I must say, that AMD system has performed great until now (if you don't take power consumption into account).
And chances are that the issue has nothing to do with mainboard, memory or cpu.
For that reason I already bought - just last week - a Xeon/AsRock/64G-ECC combo to replace the whole shebang.

I'm just quite reluctant to lose my data.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Not one that can support 24 drives, imho.
Your HO is incorrect; with a SAS expander (which you already have), a single LSI 2008/2308/3008 card will support an effectively-unlimited number of disks.
 

Osiris

Contributor
Joined
Aug 15, 2013
Messages
148
Does FreeNAS support all of the LSI RAID controllers, when it comes to drivers?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Yes, but you want an HBA, not a RAID controller.

Look for anything with an SAS2008, SAS2308, or SAS3008 controller.
 

Osiris

Contributor
Joined
Aug 15, 2013
Messages
148
Would an HBA support my Intel RAID expander?
How up to date is this FN hardware guide?
I'm looking at the SAS 9207-8i Host Bus Adapter, since my norco case has SFF8087 connections built-in. Would this be FN compliant?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Is it an LSI 2008/2308/3008-based card
The SAS 9207 uses the SAS2308, so it's a good choice.

How up to date is this FN hardware guide?
I don't know, but the hardware recommendations guide in the resources section is up to date (it doesn't cover Xeon Scalable yet, but it's a platform of dubious value to most users). Link is also in my sig.

my norco case has SFF8087 connections built-in.
One for every four drives? That's just an electrical connection, with no logic, which is fine (since you have the separate expander).
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504