Failing drive?

FlyingPersian · May 21, 2023

Hello,

Last night at almost 4am (May 22nd), I received an email from my NAS stating that one of the drives had been removed and was offline:

New alerts:
* Pool Data state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
The following devices are not healthy:

Disk WDC WD80EFAX-68KNBN0 xxxxxxx is REMOVED

I went on to check the status of my zpool:

About 5 hours before that email, my pool resilvered about 2.2 GB. For some reason, it lists one drive as ada3p2; I'm not exactly sure why. The drive in question is ada5, so I ran a short SMART test:

I'm not sure how to interpret this. TrueNAS shows the pool as online, not in a degraded mode. Is my drive failing?

I have been meaning to replace the drives with bigger ones anyway. If I replaced my Western Digital drives with IronWolfs, would it be a problem to run drives of different types and sizes for a while until every drive has been replaced?

Thanks in advance.

Whattteva · May 21, 2023

FlyingPersian said:
About 5 hours before that email, my pool resilvered about 2.2 GB. For some reason, it lists one drive as ada3p2; I'm not exactly sure why. The drive in question is ada5, so I ran a short SMART test:

In my experience, when a drive's gptid isn't displayed, it's because it's offline. You could possibly be just seeing a bug though.

FlyingPersian said:
I have been meaning to replace the drives with bigger ones anyway. If I replaced my Western Digital drives with IronWolfs, would it be a problem to run drives of different types and sizes for a while until every drive has been replaced?

I think it's generally not an issue, but you won't get your extra space until all of the drives have been replaced. Since you have a 6-drive array, depending on how big and how full the drives are, the replacement process could take a very long time even if you're not doing anything else with the pool.
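To illustrate why the extra space only shows up after the last swap, here is a rough sketch. It assumes a single RAIDZ2 vdev (the thread doesn't state the actual layout), 8 TB drives being replaced one by one with 14 TB drives, and it ignores ZFS overhead. `raidz_usable_tb` is a hypothetical helper for illustration, not a ZFS tool.

```python
def raidz_usable_tb(drive_sizes_tb, parity=2):
    """Approximate usable capacity of one RAIDZ vdev, ignoring overhead.

    Every member contributes only as much space as the smallest drive.
    parity=2 (RAIDZ2) is an assumption, not taken from the thread.
    """
    data_drives = len(drive_sizes_tb) - parity
    return min(drive_sizes_tb) * data_drives


drives = [8] * 6  # assumed starting point: six 8 TB drives
for i in range(len(drives)):
    drives[i] = 14  # swap one drive for an assumed 14 TB model
    print(f"{i + 1} replaced: ~{raidz_usable_tb(drives)} TB usable")
```

Under these assumptions the estimate stays the same for the first five swaps and only grows once the sixth drive has been replaced.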

FlyingPersian · May 21, 2023

Whattteva said:
In my experience, when a drive's gptid isn't displayed, it's because it's offline. You could possibly be just seeing a bug though.

I think it's a bug because even under pool>pool status it shows all drives as online. I also saw that the resilvering actually finished at the same time I received the emails.

Whattteva said:
I think it's generally not an issue, but you won't get your extra space until all of the drives have been replaced. Since you have a 6-drive array, depending on how big and how full the drives are, the replacement process could take a very long time even if you're not doing anything else with the pool.

Yeah, I know, I went through this a while ago. I've just never upgraded to an entirely different model of drive. I might stick with the WD Red Plus drives though.

sretalla · May 22, 2023

The smartctl output you're showing displays 1 CRC error on the drive.

Those are usually related to cabling or the SATA controller. Sometimes they can indicate issues with the disk's onboard controller.
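For anyone who wants to track that counter over time, here is a minimal sketch that pulls the raw CRC error count out of a SMART attribute table. The sample text is made up for illustration (it is not the output from this thread), and the parsing assumes the usual ten-column layout printed by `smartctl -A`. `crc_error_count` is a hypothetical helper name.

```python
def crc_error_count(smart_attributes):
    """Return the raw UDMA_CRC_Error_Count value, or None if absent.

    Assumes the ten-column attribute table layout of `smartctl -A`,
    where the raw value is the last (tenth) column.
    """
    for line in smart_attributes.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "UDMA_CRC_Error_Count":
            return int(fields[9])
    return None


# Illustrative sample only, not the poster's actual output.
sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0012   096   096   000    Old_age   Always       -       30000
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       1
"""

print(crc_error_count(sample))
```

A count that stays put after reseating or replacing the cables points to a one-off event; a count that keeps climbing suggests the cabling or controller still has a problem.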

FlyingPersian · May 22, 2023

sretalla said:
The smartctl output you're showing displays 1 CRC error on the drive.

Those are usually related to cabling or the SATA controller. Sometimes they can indicate issues with the disk's onboard controller.

What would be my next step then? Check the cabling? Just leave it as is and see if I have another error?

sretalla · May 22, 2023

FlyingPersian said:
Check the cabling? Just leave it as is and see if I have another error?

Either/both of those would be OK.

Keep an eye on it.

Davvo · May 22, 2023

Check the cabling (power and data), then run zpool clear Data and zpool scrub Data.
If you get errors again, the issue is deeper.

Also don't rely on short tests only: regularly run long ones.
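The sequence above could be sketched as follows. The pool name Data and the device ada5 come from this thread; the `/dev/ada5` path, the added `zpool status -v` check, and the helper names are assumptions. The script only prints the commands as a dry run; `run_all` is defined but never called here, and would need root.

```python
import shlex
import subprocess


def recovery_commands(pool, device):
    """Return the suggested commands, in order, as argv lists."""
    return [
        ["zpool", "clear", pool],            # reset the pool's error counters
        ["zpool", "scrub", pool],            # start a scrub; it runs in the background
        ["zpool", "status", "-v", pool],     # check scrub progress and any new errors
        ["smartctl", "-t", "long", device],  # queue a long SMART self-test
    ]


def run_all(commands):
    """Execute each command, stopping at the first failure (needs root)."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Dry run: only print what would be executed.
for argv in recovery_commands("Data", "/dev/ada5"):
    print(shlex.join(argv))
```

Since the scrub keeps running after the command returns, it may be preferable to wait for it to finish before starting the long test.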

FlyingPersian · May 22, 2023

I actually used to have regular long tests a long time ago, not sure why I stopped them...

I'm running a scrub now. I'll do the long test afterwards and report back. It will take roughly 14h.

When I'm expanding to 6x 14TB of storage, I fear that my 32GB of RAM won't be enough. The most power hungry application I'm running is Plex and maybe the NFS server for my RPI cluster to store data on. The other services are just moving small amounts of data (<20GB) around once in a while. The 32GB probably isn't enough, right? My motherboard can only handle up to 32GB (Supermicro X10SLM-F). Would it be better for me to upgrade my MB as well to have more RAM?

Davvo · May 22, 2023

I don't think you will need to increase your RAM; 32GB is fine if you don't need high performance. IMHO it's not worth the MB upgrade for your usage needs.

FlyingPersian · May 22, 2023

Davvo said:
I don't think you will need to increase your RAM; 32GB is fine if you don't need high performance. IMHO it's not worth the MB upgrade for your usage needs.

Perfect, thank you. I'm happy with its performance atm. I was afraid that increasing capacity meant more RAM would be needed. The rule I remember from a few years ago was 1GB of RAM per 1TB of storage. Not sure how accurate that is now.

Davvo · May 22, 2023

FlyingPersian said:
Perfect, thank you. I'm happy with its performance atm. I was afraid that increasing capacity meant more RAM would be needed. The rule I remember from a few years ago was 1GB of RAM per 1TB of storage. Not sure how accurate that is now.

It's true if you have performance or professional/enterprise requirements (i.e. block storage), but for Plex and "simple" data storage you are fine with 32GB.
That doesn't mean you wouldn't benefit from an upgrade, but it's unlikely to be worth it economically or environmentally.
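For reference, the old rule of thumb works out as follows for the planned 6x 14TB pool. This is plain arithmetic on raw capacity; `rule_of_thumb_ram_gb` is a hypothetical helper, and the 1GB-per-1TB ratio is the guideline quoted above, not a hard requirement.

```python
def rule_of_thumb_ram_gb(drive_count, drive_size_tb, gb_per_tb=1.0):
    """RAM suggested by the '1GB of RAM per 1TB of storage' guideline."""
    return drive_count * drive_size_tb * gb_per_tb


suggested = rule_of_thumb_ram_gb(6, 14)  # planned pool: 6x 14TB, raw capacity
installed = 32                           # maximum for the Supermicro X10SLM-F

print(f"Rule of thumb: {suggested:.0f}GB, installed: {installed}GB")
```

The guideline lands well above the 32GB the board supports, which is why it matters that it mainly applies to performance-sensitive or enterprise workloads rather than Plex and simple file storage.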
