Pool remains DEGRADED

angelus249

Dabbler
Joined
Dec 19, 2014
Messages
41
Hi,

I run the system as shown in my signature. I run scrubs/long SMART tests on all drives biweekly, as recommended.

Now a short explanation of what happened on the 5x4TB pool (four of the drives are 8 years old, one is 6 years old):

1) One drive started showing SMART errors, indicating failure in the near future. So I ordered a new drive, but wasn't home for a few days to replace it.
2) In the meantime a 2nd drive started throwing SMART errors.
3) I started resilvering, expecting the pool to die because of the 2nd damaged disk, but it finished somewhat "successfully". During the resilver, however, a third disk (the 6 year old one), which hadn't shown any errors before, threw heavy errors, and it died about 6 hours after the resilver finished. So the resilver finished, but with errors.
4) I backed up/copied everything and eventually found that 3 files on the vdev were damaged/unreadable. Even the console showed "input/output error" when I tried to move/copy them.
5) Long story short, I resilvered twice more, a total of 3 times and all my data is still there (except for the 3 files). At this point there are 3 new drives and 2 old drives in the pool. But all drives and the pool remain DEGRADED.
6) So I tried to scrub it, but very early in the process it showed some errors, and as soon as the read errors occurred, I got an email about the pool still being degraded. So I deleted the 3 files I knew were damaged and scrubbed again; this time the scrub finished with 0 errors. I also restarted the whole box.

After the scrub finished, however, I got yet another new error message:
New alerts:
* Pool MyMedia02 state is DEGRADED: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
The following devices are not healthy:
  • Disk ST4000VN000-2AH166 *** is DEGRADED
  • Disk TOSHIBA MD04ACA400 *** is DEGRADED
  • Disk TOSHIBA MD04ACA400 *** is DEGRADED
  • Disk TOSHIBA MD04ACA400 *** is DEGRADED
  • Disk TOSHIBA MD04ACA400 *** is DEGRADED
However, those drives don't even exist anymore in that setup. Two more of those Toshiba drives have been replaced with Seagate drives. Something's not right here.
All drives are OK now, all data is accessible/readable, but the pool is still degraded. How do I fix that without rebuilding the whole pool?
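A quick way to see the mismatch is to tally the disk models named in the alert and compare them with what is actually plugged in. This is only a rough sketch: `tally_alert_disks` and the regex are made up for illustration and are not anything TrueNAS provides, and the alert text is simply pasted from the email above (serials already masked as `***`).

```python
import re
from collections import Counter

# Alert text pasted from the email above; serial numbers are masked as ***.
ALERT = """\
* Pool MyMedia02 state is DEGRADED: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
The following devices are not healthy:
  • Disk ST4000VN000-2AH166 *** is DEGRADED
  • Disk TOSHIBA MD04ACA400 *** is DEGRADED
  • Disk TOSHIBA MD04ACA400 *** is DEGRADED
  • Disk TOSHIBA MD04ACA400 *** is DEGRADED
  • Disk TOSHIBA MD04ACA400 *** is DEGRADED
"""

# Hypothetical pattern for lines of the form "Disk <model> <serial> is <STATE>".
DISK_LINE = re.compile(r"Disk\s+(?P<model>.+?)\s+\S+\s+is\s+(?P<state>[A-Z]+)\s*$")


def tally_alert_disks(alert_text):
    """Count how often each disk model appears in the alert's device list."""
    models = Counter()
    for line in alert_text.splitlines():
        match = DISK_LINE.search(line)
        if match:
            models[match.group("model")] += 1
    return models


if __name__ == "__main__":
    for model, count in tally_alert_disks(ALERT).items():
        print(f"{count} x {model}")
```

The alert names four TOSHIBA MD04ACA400 disks, yet only 2 old drives are still in the pool, so the list cannot describe the current layout.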
1665330588064.png

1665330716778.png


Any ideas?

Thanks a lot.
 

Glorious1

Guru
Joined
Nov 23, 2014
Messages
1,211
Did you offline those drives before replacing? I don't know what the proper course would be now.
 

angelus249

Glorious1 said: Did you offline those drives before replacing? I don't know what the proper course would be now.
Yes, I did. At least for the two which "just" had SMART errors. The third drive, which died unexpectedly, was automatically set to status "FAULTED", and pressing the "offline" button didn't have any effect. So I just unplugged the faulted drive and inserted the new one, then replaced/resilvered.

I have also rolled back to 12.0-U8.1 now, because I had the weirdest SMB errors/behaviour. Those are gone now. But even after the rollback, the new error message still says that drives which are not plugged in anymore are degraded. That's just wrong and weird :(
1665410902557.png
 

angelus249

I can't really wait much longer, so I've given up trying to solve this. I destroyed the pool and am now rebuilding it from scratch, restoring everything from backups.

Thanks anyway.

But there's certainly something not ok with TrueNAS. Disks not even connected shouldn't be mentioned in error messages.
 

Glorious1

angelus249 said: Disks not even connected shouldn't be mentioned in error messages.
Well in this case perhaps they shouldn't, but you can understand that TrueNAS has to keep a record of the drives in a pool, and if someone just yanks one or more of them, TrueNAS should tell you they are not found or not working.
 