0x4161726f6e
Dabbler
- Joined
- Jul 3, 2016
- Messages
- 19
Wanting more storage (and not needing the IOPS), I migrated from a striped-mirror (RAID 1+0) zpool with 6 HDDs to a RAIDZ2 with 7 HDDs. About a week later, while I was on vacation, 2 HDDs came up as failed and all drives showed read and write errors. When I VPNed in, the system reported something along the lines of not being able to communicate with the drives. In a panic I shut down and prayed to the storage gods that everything would be fine. The shutdown had to be forced.
I get home from vacation, boot the server up, and all drives register. There are some checksum errors, but all read/write error counts are 0. I scrub; 2 HDDs show checksum errors and come up as “failed”. I note which drives (da2 & da4), clear the errors, and scrub again; everything comes back clean with 0 read/write errors. Maybe it was a false positive, so I trust the drives to be OK.
Next day (about 12–14 hours later), a third HDD (da3) shows 9 read errors, 96 write errors, 0 checksum errors, and “failed”; maybe the drives are OK and something else is wrong. I try disabling spindown by setting Advanced Power Management to 128 for all drives, even though spindown was never a problem before. I scrub again because I can’t do much remotely; it comes back clean. But immediately afterward the same HDD fails with 3 read errors.
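For reference, here is roughly what that APM change looks like from the FreeBSD shell (a sketch, not exactly what the FreeNAS GUI runs; da3 is one of the drives named above, and whether `camcontrol`'s ATA passthrough works depends on the HBA the drives sit behind):

```shell
# Dump the drive's ATA identify data; the output includes the current
# Advanced Power Management level, if the drive supports APM.
camcontrol identify da3

# Set APM to level 128: the lowest power-saving level that still does
# NOT permit standby/spindown (levels 1-127 allow spindown, 128-254 don't).
camcontrol apm da3 -l 128
```

Note that on this board's LSI SAS HBA the drives show up as `daX` rather than `adaX`, and some HBA firmware does not pass ATA power-management commands through cleanly, which is worth ruling out as a variable here.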
TL;DR: 3 of the 5 new drives are failing, and the server was rock solid before adding/changing drives and changing the zpool configuration.
System:
FreeNAS-11.1-U4
Intel Xeon E3-1220 v3 @ 3.1GHz
ASRock E3C224D4I-14S
32 GB DDR3
Old zpool: stripe across 3 mirrors
- Mirror of 2× 2 TB WD Reds
- Mirror of 2× 3 TB WD Reds
- Mirror of 2× 3 TB WD Greens
New zpool:
- RAIDZ2 with 5× 4 TB WD Reds (all new drives) and 2× 3 TB WD Reds from the old pool
All of the noted failures have been on new drives.
Are my new drives failing or is something else wrong? What should I be checking?
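For anyone following along, the checks I know to run from the shell are roughly the following (a sketch; the pool name `tank` is a placeholder, substitute your actual pool and device names):

```shell
# Pool-level view: which devices are accumulating errors, and on which
# counter (read / write / checksum), plus any files with permanent errors.
zpool status -v tank

# Per-drive SMART data: watch Reallocated_Sector_Ct, Current_Pending_Sector,
# and UDMA_CRC_Error_Count (CRC errors usually point at cabling/backplane,
# not the disk itself).
smartctl -a /dev/da3

# Kernel messages: CAM timeouts and bus resets here suggest the HBA,
# cabling, or power supply rather than the drives.
dmesg | grep -i 'da[0-9]'

# Kick off a long SMART self-test on a suspect drive (runs in the
# background; read the result later with smartctl -a).
smartctl -t long /dev/da3
```

Given that three brand-new drives "failed" at once and then came back clean, a shared cause (power, cabling, backplane, or HBA) seems at least as likely as three simultaneous drive failures, which is why the CRC counters and kernel log are worth checking first.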