Stripe Disk Replaced - System Down!!! (.. encountered an uncorrectable I/O failure)

SimonMdot

Dabbler
Joined
May 17, 2018
Messages
15
Hi all,

At home I am running a TrueNAS server with two pools:
Pool 1 consists of 2 x 2TB disks as a mirror for important data and backups.
Pool 2 consists of 1 x 8TB and 1 x 2TB (10TB) in a stripe configuration as a media server for Plex.

Two days ago the 2TB disk of pool 2 reported defective sectors and the pool stopped running. I shut down the system and installed a replacement disk. Using the TrueNAS replacement process, I then replaced the broken disk.

Resilvering the pool took the whole night and reported around 1,400,000 errors, which already made me suspicious. But at the end the system reported success and I took out the broken disk.

The problem: after the restart the defective pool was available for a short time. But as soon as I try to access the pool, the whole system freezes and the UI stops responding. Even after another restart, the system and UI remain unavailable, without any access to the data at all.

When hooking up the NAS to a screen it shows:

WARNING: Pool 'MEDIA' has encountered an uncorrectable I/O failure and has been suspended.

Does anyone have any hints on how to proceed and what to do?
I assume the data on the pool will be lost.

Thanks and greetings,

Simon
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Two days ago the 2TB disk of pool 2

Error here: either they are the 2 drives from Pool 1 or it was the two other drives...

Pool 2 consists of 1 x 8TB and 1 x 2TB (10TB) in a stripe configuration as a media server for Plex.

BIG no-no here! Never put a single-drive vDev in any pool. You will lose it all in no time (if it is not too late already...)

Please post the output of zpool status here.

Does anyone have any hints on how to proceed and what to do?

There may not be anything to do... Your pool and data may already be lost...

Do you have backups? If you do, that pool has to be destroyed and re-created anyway, so better to go straight to that step.

I assume the data on the pool will be lost.

Highly probable... Good that you are already preparing yourself for that...
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,599
Complete loss of a disk (that is, a vDev consisting of a single disk) in a striped pool will cause loss of that pool.

The slight exceptions to that are:
- Just a few bad blocks, but the disk still has internal spares. So, spare them out and restore the affected file(s) from backup.
- No more spare blocks, but you can do an on-line replacement of a failing disk. See below.

For example, let us say you had several bad blocks on the 2TB disk in Pool 2. So you add another 2TB disk to the server (but NOT to Pool 2). Then you use the replace option, which reads all the usable data from the source 2TB disk and writes it to the new 2TB disk. When complete, the source 2TB disk is automatically removed from Pool 2 (but not yet physically removed). Any bad files will have to be restored manually since there is no redundancy.
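
To make the replace step concrete, here is a toy sketch in Python. It is only a model of the idea, not how ZFS actually implements a replace: the function name `replace_disk` and the two dictionaries are hypothetical, and an unreadable block is simply represented as `None`. Every readable block is copied to the new disk, and every file that owns an unreadable block ends up on a "restore from backup" list.

```python
def replace_disk(source_blocks, block_owner):
    """Toy model of replacing a disk that has no redundancy.

    source_blocks: block id -> bytes, or None if the block is unreadable
    block_owner:   block id -> name of the file the block belongs to
    Returns (blocks written to the new disk, files needing a restore).
    """
    new_disk = {}
    files_to_restore = set()
    for block_id, data in source_blocks.items():
        if data is None:
            # Nothing to copy and no second copy to rebuild from:
            # the whole file has to come back from backup.
            files_to_restore.add(block_owner[block_id])
        else:
            new_disk[block_id] = data
    return new_disk, files_to_restore


if __name__ == "__main__":
    # Hypothetical example data: block 1 sits on a bad sector.
    source = {0: b"aaaa", 1: None, 2: b"cccc"}
    owner = {0: "movie1.mkv", 1: "movie2.mkv", 2: "movie2.mkv"}
    copied, damaged = replace_disk(source, owner)
    print(sorted(copied), sorted(damaged))
```

On a real system the list of damaged files comes from ZFS itself: `zpool status -v` names the files with permanent errors.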

My media server uses ZFS (but it's Gentoo Linux, not TrueNAS). And the media pool does not have any redundancy. So, every year or so I have to restore a media file from backups. (Of which I have several...) Nothing too difficult. Upon complete loss of one of the striped disks (1 x 2TB 2.5" HDD and 1 x mSATA SSD), I will have to restore the pool from scratch.

One weird thing about striped pools. My media server got an error on that media pool, except it automatically recovered. I scratched my head over that for a while, then figured out the bad block(s) was in metadata. On ZFS, by default all metadata (directory info, etc...) is redundant regardless of the type of vDevs used (striped, mirrored or RAID-Zx). Critical metadata even has 3 copies.
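
Why the metadata error healed itself while a bad data block would not can be shown with another toy sketch in Python. Again, this is an illustration and not ZFS code: `read_block` is a hypothetical helper, a damaged copy is represented as `None`, and the example assumes two copies for ordinary metadata and a single copy for file data on a striped pool.

```python
def read_block(copies):
    """Return the first readable copy of a block.

    copies: list of bytes or None, one entry per stored copy.
    Raises OSError when every copy is damaged.
    """
    for data in copies:
        if data is not None:
            return data
    raise OSError("uncorrectable: every copy of this block is damaged")


if __name__ == "__main__":
    # Assumption for illustration: metadata is stored twice, file data once.
    metadata_copies = [None, b"directory entry"]  # first copy is bad
    file_data_copies = [None]                     # the only copy is bad

    print(read_block(metadata_copies))  # served from the second copy
    try:
        read_block(file_data_copies)
    except OSError as err:
        print("restore from backup:", err)
```

With a second copy available the read succeeds and the bad copy can be rewritten. With only one copy there is nothing to fall back on, so the file has to be restored.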
 