0x4161726f6e
Dabbler
Joined: Jul 3, 2016
Messages: 19
Wanting more storage (and I don't need IOPS), I migrated from a zpool of striped mirrors (RAID 1+0) with 6 HDDs to a RAIDZ2 with 7 HDDs. About a week after doing so (while I was on vacation), 2 HDDs came up as failed and all drives showed read and write errors. When I VPNed in, the system told me something along the lines of not being able to communicate with the drives. In a panic I shut down and prayed to the storage gods that everything would be fine. The shutdown had to be forced.
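I didn't save the exact error text; the sort of thing I'd be grepping the console log for is below (a rough sketch, and the patterns are just my guesses at CAM/timeout noise):

# look for CAM errors, timeouts, or drives dropping off the bus around the time of the alerts
grep -iE 'cam|timeout|da[0-9]' /var/log/messages | tail -n 100
dmesg | grep -iE 'da[0-9]'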
I get home from vacation, boot the server up, and all drives register. There are some checksum errors, but all read/write error counts are 0. I scrub: 2 HDDs show checksum errors and come up "failed". I note which drives (da2 & da4), clear the errors, and scrub again; everything comes back clean with 0 read/write errors. Maybe it was a false positive, so I trust the drives to be OK.
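For anyone who wants the specifics, the clear-and-scrub cycle boils down to these commands (a sketch; "tank" stands in for my pool name):

zpool status -v tank    # per-disk READ/WRITE/CKSUM counters and which vdevs are degraded
zpool clear tank        # reset the error counters on every device in the pool
zpool scrub tank        # start another scrub
zpool status tank       # the "scan:" line shows scrub progress and results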
The next day (about 12-14 hours later), a third HDD (da3) shows 9 read errors, 96 write errors, 0 checksum errors, and "failed"; maybe the drives are OK but something else is wrong. I try turning off spindown by setting advanced power management to 128 for all drives, even though spindown was never a problem before. I scrub again because I can't do much else remotely; it comes back clean, but almost instantly the same HDD fails again with 3 read errors.
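For the record, the shell equivalent of that APM change would be something like this (a sketch; level 128 is the lowest APM setting that doesn't allow standby/spindown, and the device names are the ones mentioned above):

camcontrol apm da2 -l 128       # disable spindown via APM (lasts until reboot)
camcontrol apm da3 -l 128
camcontrol apm da4 -l 128
smartctl -s apm,128 /dev/da3    # same thing via smartmontools, if you prefer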
TL;DR: 3 of the 5 new drives appear to be failing, and the server was rock solid before I changed/added drives and changed the zpool configuration.
System:
FreeNAS-11.1-U4
Intel Xeon E3-1220 v3 @ 3.1GHz
ASRock E3C224D4I-14S
32 GB DDR3
Old Zpool: stripe across 3 mirrors
Mirror of 2× 2 TB WD Reds
Mirror of 2× 3 TB WD Reds
Mirror of 2× 3 TB WD Greens
New Zpool:
RAIDZ2 with 5× 4 TB WD Reds (all new drives) and 2× 3 TB WD Reds from the old pool
Of the failures noted above, only new drives have been affected.
Are my new drives failing or is something else wrong? What should I be checking?
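If it helps, these are the checks I know how to run remotely and can post output from (a sketch; device names as FreeNAS reports them):

smartctl -a /dev/da2             # full SMART attributes plus the drive's error log
smartctl -t long /dev/da2        # queue an extended (long) self-test
smartctl -l selftest /dev/da2    # read the self-test results once it finishes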