Failing drive?

FlyingPersian

Patron
Joined
Jan 27, 2014
Messages
237
Hello,

Last night at almost 4am (May 22nd) I received an email from my NAS stating that one of the drives had been removed and was offline:

New alerts:
* Pool Data state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
The following devices are not healthy:
  • Disk WDC WD80EFAX-68KNBN0 xxxxxxx is REMOVED

I went on to check the status of my zpool:

Screenshot_20230522_092711_Termius.jpg


About 5h before that mail, my pool resilvered about 2.2 GB. For some reason, it lists one drive as ada3p2, and I'm not exactly sure why. The drive in question is ada5, so I ran a short SMART test:

Screenshot_20230522_092357_Termius.jpg



I'm not sure how to interpret this. TrueNAS shows the pool as online, not in a degraded mode. Is my drive failing?

I have been meaning to replace the drives with bigger ones anyway. If I were to replace my Western Digital drives with IronWolf drives, would it be an issue to run drives of different types and sizes for a while until I've replaced every drive?

Thanks in advance.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
FlyingPersian said: "About 5h before that mail, my pool resilvered about 2.2 GB. For some reason, it lists one drive as ada3p2, and I'm not exactly sure why. The drive in question is ada5, so I ran a short SMART test:"
In my experience, when a drive's gptid isn't displayed, it's because it's offline. You could possibly be just seeing a bug though.

FlyingPersian said: "I have been meaning to replace the drives with bigger ones anyway. If I were to replace my Western Digital drives with IronWolf drives, would it be an issue to run drives of different types and sizes for a while until I've replaced every drive?"
I think it's generally not an issue, but you won't get the extra space until all of the drives have been replaced. Since you have a 6-drive array, depending on how big and how full the drives are, the replacement process could take a very long time even if you're not doing anything else with the pool.
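
To illustrate why the extra space only shows up at the very end: a RAIDZ vdev treats every member as if it were as large as its smallest disk. Here is a rough Python sketch of that, assuming a single 6-wide RAIDZ2 vdev (the thread doesn't state the actual layout) going from 8 TB to 14 TB drives and ignoring ZFS overhead and padding. The function name `usable_tb` is made up for illustration.

```python
def usable_tb(drive_sizes_tb, parity=2):
    """Approximate usable space of one RAIDZ vdev, ignoring ZFS overhead.

    Every member counts as being as large as the smallest drive.
    parity=2 (RAIDZ2) is an assumption; the thread doesn't state the layout.
    """
    data_drives = len(drive_sizes_tb) - parity
    return min(drive_sizes_tb) * data_drives


drives = [8] * 6  # six 8 TB drives to start with
print(f"start: {usable_tb(drives)} TB usable")
for i in range(len(drives)):
    drives[i] = 14  # swap one drive for a 14 TB one and let it resilver
    print(f"after replacement {i + 1}: {usable_tb(drives)} TB usable")
```

Under those assumptions the figure stays at 32 TB through the first five swaps and only jumps to 56 TB once the sixth drive has been replaced.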
 

FlyingPersian

Whattteva said: "In my experience, when a drive's gptid isn't displayed, it's because it's offline. You could possibly be just seeing a bug though."
I think it's a bug because even under pool>pool status it shows all drives as online. I also saw that the resilvering actually finished at the same time I received the emails.

Whattteva said: "I think it's generally not an issue, but you won't get the extra space until all of the drives have been replaced. Since you have a 6-drive array, depending on how big and how full the drives are, the replacement process could take a very long time even if you're not doing anything else with the pool."

Yeah, I know, I went through this a while ago. I've just never upgraded to an entirely different model of drive. I might stick with the WD Red Plus drives though.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
The smartctl output you're showing displays 1 CRC error on the drive.

Those are usually related to cabling or the SATA controller. Sometimes they can indicate issues with the disk's onboard controller.
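
If you'd rather track that counter from a script than read it off a screenshot, here is a rough Python sketch. It assumes the usual `smartctl -A` attribute table layout (attribute name in the second column, raw value in the tenth). The sample text is made up for illustration and is not the output from the screenshot, and the helper names are hypothetical.

```python
# Illustrative sample only, NOT the poster's real smartctl output.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       1
"""


def raw_values(smartctl_output):
    """Map attribute name -> first token of its raw value."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            values[fields[1]] = fields[9]
    return values


def crc_errors(smartctl_output):
    """Return the UDMA CRC error count, or 0 if the attribute is absent."""
    return int(raw_values(smartctl_output).get("UDMA_CRC_Error_Count", "0"))


print("CRC errors:", crc_errors(SAMPLE))
```

In practice you would feed it the text from something like `smartctl -A /dev/ada5`. The raw count never resets, so what matters is whether it keeps climbing after the cables have been reseated.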
 

FlyingPersian

sretalla said: "The smartctl output you're showing displays 1 CRC error on the drive. Those are usually related to cabling or the SATA controller. Sometimes they can indicate issues with the disk's onboard controller."
What would be my next step then? Check the cabling? Just leave it as is and see if I have another error?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Check the cabling (both power and data), then run zpool clear Data followed by zpool scrub Data.
If you get errors again, the issue is deeper.

Also don't rely on short tests only: regularly run long ones.
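
Once the cabling has been checked, the software side of those steps could be scripted roughly like this. It's only a sketch: the pool name `Data` and the disk `/dev/ada5` come from this thread, the helper names are made up, and it defaults to a dry run that just prints the commands.

```python
import shlex
import subprocess


def pool_steps(pool="Data"):
    """Commands to reset the error counters, scrub, and check the result."""
    return [
        ["zpool", "clear", pool],         # reset the pool's error counters
        ["zpool", "scrub", pool],         # starts the scrub in the background
        ["zpool", "status", "-v", pool],  # shows scrub progress and any errors
    ]


def long_test_step(disk="/dev/ada5"):
    """Command to start a long SMART self-test; best run after the scrub."""
    return ["smartctl", "-t", "long", disk]


def run(commands, dry_run=True):
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


run(pool_steps())          # dry run: only prints the commands
run([long_test_step()])    # dry run: only prints the command
```

Both the scrub and the long test return immediately and carry on in the background, so the script doesn't wait for either of them. Check zpool status (and later smartctl -a on the disk) for the results.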
 

FlyingPersian

I actually used to run regular long tests a long time ago, not sure why I stopped them...

I'm running a scrub now. I'll do the long test afterwards and report back. It will take roughly 14h.

When I expand to 6x 14TB of storage, I fear that my 32GB of RAM won't be enough. The most resource-hungry applications I'm running are Plex and maybe the NFS server that my RPi cluster stores data on. The other services just move small amounts of data (<20GB) around once in a while. The 32GB probably isn't enough, right? My motherboard can only handle up to 32GB (Supermicro X10SLM-F). Would it be better for me to upgrade my MB as well to have more RAM?
 

Davvo

I don't think you will need to increase your RAM. 32GB is fine if you don't need high performance; IMHO a motherboard upgrade isn't worth it for your usage.
 

FlyingPersian

Davvo said: "I don't think you will need to increase your RAM. 32GB is fine if you don't need high performance; IMHO a motherboard upgrade isn't worth it for your usage."

Perfect, thank you. I'm happy with its performance atm. I was afraid that increasing capacity meant more RAM would be needed. The rule I remember from a few years ago was 1GB of RAM per 1TB of storage. Not sure how accurate that is now.
 

Davvo

FlyingPersian said: "Perfect, thank you. I'm happy with its performance atm. I was afraid that increasing capacity meant more RAM would be needed. The rule I remember from a few years ago was 1GB of RAM per 1TB of storage. Not sure how accurate that is now."
It's true if you have performance or professional/enterprise requirements (i.e. block storage), but for Plex and "simple" data storage you are fine with 32GB.
It doesn't mean you wouldn't benefit from an upgrade, but it's unlikely to be worth it economically or environmentally.
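
For a sense of scale, here is the old rule of thumb applied to the planned pool as a quick Python calculation. It uses only the numbers mentioned in this thread and counts raw capacity, which is an assumption about how the rule is meant to be applied.

```python
drive_count = 6
drive_size_tb = 14
installed_ram_gb = 32  # the maximum the X10SLM-F supports, per the thread

raw_tb = drive_count * drive_size_tb  # raw capacity, before parity
rule_of_thumb_gb = raw_tb * 1         # "1GB of RAM per 1TB of storage"

print(f"raw capacity: {raw_tb} TB")
print(f"rule of thumb: {rule_of_thumb_gb} GB, installed: {installed_ram_gb} GB")
```

Read literally, the rule asks for far more than the board can take, which is why it makes more sense as a guideline for performance-sensitive workloads than as a hard requirement.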
 