morxy49
So a couple of days ago I woke up to an email saying "The volume pandora_vol1 (ZFS) state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state."
Then another mail arrived immediately afterwards, saying "The volume pandora_vol1 (ZFS) state is UNAVAIL: One or more devices are faulted in response to IO failures."
Having other things to do that day, I hoped it wasn't terribly serious and left it for a couple of days. Today I started investigating, and it looked really easy to fix: according to the documentation, I was just supposed to run "zpool clear pandora_vol1" and that should solve it. At first it actually seemed to work. The pool got mounted again, and all the files were there. But when I tried to access some files (via my Windows PC), I just got an error message saying "The request could not be performed because of an I/O device error."

So I checked further and saw the following shit storm...
How the hell did this happen?
Code:
[root@PandorasBox] ~# zpool status -v
  pool: freenas-boot
 state: ONLINE
  scan: scrub repaired 0 in 0h1m with 0 errors on Tue Aug 15 04:01:53 2017
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2       ONLINE       0     0     0

errors: No known data errors

  pool: pandora_vol0
 state: ONLINE
  scan: scrub repaired 0 in 20h48m with 0 errors on Sat Jul 8 20:49:04 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        pandora_vol0                                    ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/b013111e-637d-11e5-b80f-d050990a6173  ONLINE       0     0     0
            gptid/12b762b6-adfd-11e3-ac90-d050990a6173  ONLINE       0     0     0
            gptid/1393a459-adfd-11e3-ac90-d050990a6173  ONLINE       0     0     0
            gptid/1473b851-adfd-11e3-ac90-d050990a6173  ONLINE       0     0     0
            gptid/155235f6-adfd-11e3-ac90-d050990a6173  ONLINE       0     0     0
            gptid/163182c3-adfd-11e3-ac90-d050990a6173  ONLINE       0     0     0
            gptid/31336141-0801-11e6-a659-bc5ff4fda91c  ONLINE       0     0     0
            gptid/fedbc3dd-0710-11e6-9e29-bc5ff4fda91c  ONLINE       0     0     0

errors: No known data errors

  pool: pandora_vol1
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://illumos.org/msg/ZFS-8000-JQ
  scan: resilvered 0 in 0h0m with 0 errors on Tue Aug 15 17:35:52 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        pandora_vol1                                    UNAVAIL      0     0     0
          raidz1-0                                      UNAVAIL      0     0     4
            1092007741518804051                         REMOVED      0     0     0  was /dev/gptid/384c974d-f5d8-11e6-ab82-bc5ff4fda91c
            gptid/397c7a4a-f5d8-11e6-ab82-bc5ff4fda91c  ONLINE       0     0     0
            gptid/3a7a6ddf-f5d8-11e6-ab82-bc5ff4fda91c  ONLINE       0     0     0
            10045439086884384952                        REMOVED      0     0     0  was /dev/gptid/3b6925bd-f5d8-11e6-ab82-bc5ff4fda91c

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <metadata>:<0x1>
        <metadata>:<0x1b>
        <metadata>:<0x7c>
        pandora_vol1/Media:<0x2000>
        pandora_vol1/Media:<0x3000>
        pandora_vol1/Media:<0x3400>
        pandora_vol1/Media:<0x3700>
        pandora_vol1/Media:<0x3401>
        pandora_vol1/Media:<0x3901>
        pandora_vol1/Media:<0x3804>
        pandora_vol1/Media:<0x3205>
        pandora_vol1/Media:<0x2907>
        pandora_vol1/Media:<0xd08>
        pandora_vol1/Media:<0x1108>
        pandora_vol1/Media:<0x2108>
        pandora_vol1/Media:<0x3009>
        pandora_vol1/Media:<0x80a>
        pandora_vol1/Media:<0x370a>
        pandora_vol1/Media:<0xe0c>
        pandora_vol1/Media:<0x130d>
        pandora_vol1/Media:<0x1f12>
        pandora_vol1/Media:<0x3012>
        pandora_vol1/Media:<0x2e13>
        pandora_vol1/Media:<0x914>
        pandora_vol1/Media:<0x1e14>
        pandora_vol1/Media:<0x2016>
        pandora_vol1/Media:<0x2716>
        pandora_vol1/Media:<0x2e17>
        pandora_vol1/Media:<0x3517>
        pandora_vol1/Media:<0x718>
        pandora_vol1/Media:<0xa18>
        pandora_vol1/Media:<0x419>
        pandora_vol1/Media:<0xc1a>
        pandora_vol1/Media:<0xa1b>
        pandora_vol1/Media:<0x351c>
        pandora_vol1/Media:<0x241d>
        pandora_vol1/Media:<0x371e>
        pandora_vol1/Media:<0xe20>
        pandora_vol1/Media:<0x3920>
        pandora_vol1/Media:<0x1224>
        pandora_vol1/Media:<0x1025>
        pandora_vol1/Media:<0x3a26>
        pandora_vol1/Media:<0x2729>
        pandora_vol1/Media:<0x3629>
        pandora_vol1/Media:<0x322a>
        pandora_vol1/Media:<0x392b>
        pandora_vol1/Media:<0x2e2f>
        pandora_vol1/Media:<0x1531>
        pandora_vol1/Media:<0x2e31>
        pandora_vol1/Media:<0x432>
        pandora_vol1/Media:<0x832>
        pandora_vol1/Media:<0x3833>
        pandora_vol1/Media:<0x1d35>
        pandora_vol1/Media:<0x3935>
        pandora_vol1/Media:<0x637>
        pandora_vol1/Media:<0x1038>
        pandora_vol1/Media:<0x1a3e>
        pandora_vol1/Media:<0x183f>
        pandora_vol1/Media:<0x343f>
        pandora_vol1/Media:<0x3640>
        pandora_vol1/Media:<0x2e42>
        pandora_vol1/Media:<0x1343>
        pandora_vol1/Media:<0x1f45>
        pandora_vol1/Media:<0x3847>
        pandora_vol1/Media:<0x1e49>
        pandora_vol1/Media:<0x2e4a>
        pandora_vol1/Media:<0x2f4a>
        pandora_vol1/Media:<0x384b>
        pandora_vol1/Media:<0x364c>
        pandora_vol1/Media:<0x384c>
        pandora_vol1/Media:<0x384d>
        pandora_vol1/Media:<0x1b4e>
        pandora_vol1/Media:<0x2f4e>
        pandora_vol1/Media:<0x324e>
        pandora_vol1/Media:<0x2e50>
        pandora_vol1/Media:<0x3851>
        pandora_vol1/Media:<0x3951>
        pandora_vol1/Media:<0x2e52>
        pandora_vol1/Media:<0xf53>
        pandora_vol1/Media:<0x2e57>
        pandora_vol1/Media:<0x3657>
        pandora_vol1/Media:<0x3857>
        pandora_vol1/Media:<0x365b>
        pandora_vol1/Media:<0x385b>
        pandora_vol1/Media:<0x2f5d>
        pandora_vol1/Media:<0x155f>
        pandora_vol1/Media:<0x345f>
        pandora_vol1/Media:<0x460>
        pandora_vol1/Media:<0x2160>
        pandora_vol1/Media:<0x3462>
        pandora_vol1/Media:<0x3962>
        pandora_vol1/Media:<0x169>
        pandora_vol1/Media:<0x669>
        pandora_vol1/Media:<0x2b69>
        pandora_vol1/Media:<0x3469>
        pandora_vol1/Media:<0x3869>
        pandora_vol1/Media:<0x16a>
        pandora_vol1/Media:<0x316a>
        pandora_vol1/Media:<0x166c>
        pandora_vol1/Media:<0x376d>
        pandora_vol1/Media:<0x386d>
        pandora_vol1/Media:<0x386f>
        pandora_vol1/Media:<0x975>
        pandora_vol1/Media:<0x3675>
        pandora_vol1/Media:<0x3775>
        pandora_vol1/Media:<0x3975>
        pandora_vol1/Media:<0x2c76>
        pandora_vol1/Media:<0x2077>
        pandora_vol1/Media:<0x3877>
        pandora_vol1/Media:<0x2578>
        pandora_vol1/Media:<0x2678>
        pandora_vol1/Media:<0x397a>
        pandora_vol1/Media:<0x1f7c>
        pandora_vol1/Media:<0x2e7c>
        pandora_vol1/Media:<0x1a7d>
        pandora_vol1/Media:<0x37f>
        pandora_vol1/Media:<0x2d7f>
        pandora_vol1/Media:<0x2e7f>
        pandora_vol1/Media:<0x397f>
        pandora_vol1/Media:<0x3681>
        pandora_vol1/Media:<0x1482>
        pandora_vol1/Media:<0xe83>
        pandora_vol1/Media:<0x384>
        pandora_vol1/Media:<0x3685>
        pandora_vol1/Media:<0x3985>
        pandora_vol1/Media:<0xa86>
        pandora_vol1/Media:<0xf87>
        pandora_vol1/Media:<0x2d87>
        pandora_vol1/Media:<0x2e88>
        pandora_vol1/Media:<0x389>
        pandora_vol1/Media:<0x989>
        pandora_vol1/Media:<0x1789>
        pandora_vol1/Media:<0x3889>
        pandora_vol1/Media:<0x2b8a>
        pandora_vol1/Media:<0x2a8b>
        pandora_vol1/Media:<0x88d>
        pandora_vol1/Media:<0x1b8d>
        pandora_vol1/Media:<0x368d>
        pandora_vol1/Media:<0x38e>
        pandora_vol1/Media:<0x1a8f>
        pandora_vol1/Media:<0x2d8f>
        pandora_vol1/Media:<0xc90>
        pandora_vol1/Media:<0x3691>
        pandora_vol1/Media:<0x3991>
        pandora_vol1/Media:<0x2f94>
        pandora_vol1/Media:<0x395>
        pandora_vol1/Media:<0x3695>
        pandora_vol1/Media:<0x3995>
        pandora_vol1/Media:<0x2e96>
        pandora_vol1/Media:<0x398>
        pandora_vol1/Media:<0x3499>
        pandora_vol1/Media:<0x2e9a>
        pandora_vol1/Media:<0x369a>
        pandora_vol1/Media:<0x39b>
        pandora_vol1/Media:<0x249b>
        pandora_vol1/Media:<0x189c>
        pandora_vol1/Media:<0x349d>
        pandora_vol1/Media:<0x39e>
        pandora_vol1/Media:<0x179e>
        pandora_vol1/Media:<0x369f>
        pandora_vol1/Media:<0x15a1>
        pandora_vol1/Media:<0x3a2>
        pandora_vol1/Media:<0x2ea2>
        pandora_vol1/Media:<0x39a2>
        pandora_vol1/Media:<0x5a3>
        pandora_vol1/Media:<0x2ea5>
        pandora_vol1/Media:<0x3a6>
        pandora_vol1/Media:<0x21a8>
        pandora_vol1/Media:<0x30a8>
        pandora_vol1/Media:<0x13a9>
        pandora_vol1/Media:<0x38ab>
        pandora_vol1/Media:<0x16ac>
        pandora_vol1/Media:<0x1cac>
        pandora_vol1/Media:<0x39ac>
        pandora_vol1/Media:<0xfad>
        pandora_vol1/Media:<0x3af>
        pandora_vol1/Media:<0x19af>
        pandora_vol1/Media:<0x38af>
        pandora_vol1/Media:<0x3b0>
        pandora_vol1/Media:<0x1fb0>
        pandora_vol1/Media:<0x30b0>
        pandora_vol1/Media:<0x1db2>
        pandora_vol1/Media:<0x3b4>
        pandora_vol1/Media:<0x2db4>
        pandora_vol1/Media:<0x2fb6>
        pandora_vol1/Media:<0xbb7>
        pandora_vol1/Media:<0x6b9>
        pandora_vol1/Media:<0x3ba>
        pandora_vol1/Media:<0x5ba>
        pandora_vol1/Media:<0x2dbc>
        pandora_vol1/Media:<0x30bd>
        pandora_vol1/Media:<0x3be>
        pandora_vol1/Media:<0x1bbe>
        pandora_vol1/Media:<0x38bf>
        pandora_vol1/Media:<0x2fc0>
        pandora_vol1/Media:<0x19c1>
        pandora_vol1/Media:<0x1ac3>
        pandora_vol1/Media:<0x3c4>
        pandora_vol1/Media:<0x39c4>
        pandora_vol1/Media:<0x38c8>
        pandora_vol1/Media:<0x3c9>
        pandora_vol1/Media:<0x2dc9>
        pandora_vol1/Media:<0x2fc9>
        pandora_vol1/Media:<0x15cb>
        pandora_vol1/Media:<0x39cb>
        pandora_vol1/Media:<0x3cf>
        pandora_vol1/Media:<0x13cf>
        pandora_vol1/Media:<0x1bd0>
        pandora_vol1/Media:<0x3d1>
        pandora_vol1/Media:<0x2dd1>
        pandora_vol1/Media:<0x20d5>
        pandora_vol1/Media:<0x3d6>
        pandora_vol1/Media:<0xdd7>
        pandora_vol1/Media:<0x8d8>
        pandora_vol1/Media:<0x30d8>
        pandora_vol1/Media:<0x36d8>
        pandora_vol1/Media:<0xfd9>
        pandora_vol1/Media:<0x38d9>
        pandora_vol1/Media:<0x3da>
        pandora_vol1/Media:<0x2dda>
        pandora_vol1/Media:<0x11db>
        pandora_vol1/Media:<0x2bdc>
        pandora_vol1/Media:<0x34dd>
        pandora_vol1/Media:<0x3de>
        pandora_vol1/Media:<0x38de>
        pandora_vol1/Media:<0x4e1>
        pandora_vol1/Media:<0x2de1>
        pandora_vol1/Media:<0x34e2>
        pandora_vol1/Media:<0x33e3>
        pandora_vol1/Media:<0x6e4>
        pandora_vol1/Media:<0x3e6>
        pandora_vol1/Media:<0x3e7>
        pandora_vol1/Media:<0x16e7>
        pandora_vol1/Media:<0x20e7>
        pandora_vol1/Media:<0x29e7>
        pandora_vol1/Media:<0x2de7>
        pandora_vol1/Media:<0x38e7>
        pandora_vol1/Media:<0x33e9>
        pandora_vol1/Media:<0x2deb>
        pandora_vol1/Media:<0x3ec>
        pandora_vol1/Media:<0x1ded>
        pandora_vol1/Media:<0x9f0>
        pandora_vol1/Media:<0x26f5>
        pandora_vol1/Media:<0x2ff8>
        pandora_vol1/Media:<0x38f8>
        pandora_vol1/Media:<0x33fc>
That is a shitload of errors!
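To get a feel for how the errors are distributed, the entries can be tallied by dataset. This is a minimal sketch, assuming each entry has the `dataset:<0xobject>` shape seen in the output above. The function name `tally_errors` and the short `SAMPLE` text are made up for illustration and only cover the first few entries, not the full list.

```python
from collections import Counter

# First few entries copied from the "Permanent errors" list above.
SAMPLE = """\
<metadata>:<0x0>
<metadata>:<0x1>
<metadata>:<0x1b>
<metadata>:<0x7c>
pandora_vol1/Media:<0x2000>
pandora_vol1/Media:<0x3000>
pandora_vol1/Media:<0x3400>
"""

def tally_errors(text):
    """Count error entries per dataset in whitespace-separated status text."""
    counts = Counter()
    for entry in text.split():
        # Split on the last colon: left side is the dataset, right side the object id.
        dataset, sep, obj = entry.rpartition(":")
        if sep and obj.startswith("<0x") and obj.endswith(">"):
            counts[dataset] += 1
    return counts

for dataset, count in tally_errors(SAMPLE).items():
    print(f"{dataset}: {count}")
```

Feeding it the whole error list instead of `SAMPLE` would show how many entries are pool metadata and how many belong to the Media dataset.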
It also looks like two (!) of the drives have failed!? It is very strange for two drives to fail at the same time. I've never had any issues with this pool before, ever.
The pool in question has 4x4TB drives in RAIDz1, so if two drives have actually failed, then I'm pretty screwed...
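The redundancy arithmetic behind that worry can be sketched as follows: a raidz vdev with parity level p keeps working only while at most p of its members are missing. This is a rough illustration, not a ZFS or FreeNAS tool. The helper name `summarize_raidz` is hypothetical, and `CONFIG` is a whitespace-trimmed copy of the pandora_vol1 config rows from the status output.

```python
import re

# Config rows for pandora_vol1, copied from the status output (columns trimmed).
CONFIG = """\
pandora_vol1 UNAVAIL 0 0 0
raidz1-0 UNAVAIL 0 0 4
1092007741518804051 REMOVED 0 0 0 was /dev/gptid/384c974d-f5d8-11e6-ab82-bc5ff4fda91c
gptid/397c7a4a-f5d8-11e6-ab82-bc5ff4fda91c ONLINE 0 0 0
gptid/3a7a6ddf-f5d8-11e6-ab82-bc5ff4fda91c ONLINE 0 0 0
10045439086884384952 REMOVED 0 0 0 was /dev/gptid/3b6925bd-f5d8-11e6-ab82-bc5ff4fda91c
"""

def summarize_raidz(config_text):
    """Return (parity, members, missing) for the first raidz vdev in the text."""
    parity = None
    members = 0
    missing = 0
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        name, state = fields[0], fields[1]
        match = re.fullmatch(r"raidz(\d)-\d+", name)
        if match:
            # The digit in the vdev name is the parity level (raidz1 -> 1).
            parity = int(match.group(1))
            continue
        if parity is not None:
            # Every row after the vdev row is treated as one of its members.
            members += 1
            if state != "ONLINE":
                missing += 1
    if parity is None:
        raise ValueError("no raidz vdev found")
    return parity, members, missing

parity, members, missing = summarize_raidz(CONFIG)
survives = missing <= parity
print(f"raidz{parity}: {members} members, {missing} not ONLINE, survives={survives}")
```

With two of four members not ONLINE and a parity level of 1, the check comes out false, which matches the UNAVAIL state. It only restates the arithmetic and says nothing about why the devices show as REMOVED.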
I've looked at some previous threads, and they all ask whether there's ECC RAM. Yes, I do have ECC RAM, so that should not be an issue.
- FreeNAS-9.10.1-U2 (f045a8b)
- AsRock C2550D4I
- 4x Seagate Desktop HDD ST4000DM000 64MB 4TB in RAIDz1
- 2x Crucial DDR3 PC12800/1600MHz ECC 8GB (CT2KIT102472BD160B)
- 2x Kingston ValueRAM DDR3 PC10600/1333MHz ECC CL9 8GB (KVR1333D3E9S/8G)
How do I proceed with this? The data is not super sensitive and is technically replaceable, but replacing it would take a hell of a lot more work than it's worth.