HELP! Not sure how to proceed. Scrub shows degraded and faulted drive(s)

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
Can you post the zpool status output?
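
From a shell, this prints the state of every pool along with the per-disk read/write/checksum error counters:
Code:
# zpool status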
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
Keep this card in the server, for now. Where did you get the disks from, a datacenter? Can you destroy the pool, take all the disks out, and zero-format one with 512B sectors?
Code:
# sg_scan -i                                # list SCSI generic devices and identify the disk
# sg_format --format --size=512 /dev/sg0    # low-level format to 512-byte sectors (destructive; can take hours)

Reboot the server and see if the disk can be added to a pool with no errors. I've seen issues in the past with OEM-branded disks formatted at the factory with a non-standard sector size of 520B or 528B.
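
If you want to double-check the sector size before and after the format, something like this should work (the /dev/sg0 and /dev/sda names are just examples, match them to your sg_scan output):
Code:
# sg_readcap --long /dev/sg0    # "Logical block length" shows the current sector size
# smartctl -i /dev/sda          # the "Sector Sizes" line shows logical/physical sizes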
 

Demonlinx

Explorer
Joined
Apr 11, 2022
Messages
53
Can you post the zpool status output?
[screenshot: zpool status output]
 

Demonlinx

Explorer
Joined
Apr 11, 2022
Messages
53
Keep this card in the server, for now. Where did you get the disks from, a datacenter? Can you destroy the pool, take all the disks out, and zero-format one with 512B sectors?
Yes, I can do this. I have no issues with destroying the pool if that makes getting to a solution easier.
We just purchased the drives off of Amazon. I don't believe they are anything special.

Output of sg_scan -i:
[screenshot: sg_scan -i output]
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
We just purchased the drives off of Amazon.
That means the vendor probably had them properly formatted, if they were ever used in a datacenter. I thought maybe the disks came with the server as a package.

I'm sure the disks are fine, but let's try to format one. I know it's a pain, but if we test only one disk, we eliminate the disks from the equation. To give you an idea, I spent one week troubleshooting the issue with my HBA PCIe cards.

Your issue is very common: a card that is not seated properly, and it can be fixed easily. But in your case, everything we try throws us off. :)
 

Demonlinx

Explorer
Joined
Apr 11, 2022
Messages
53
That means the vendor probably had them properly formatted, if they were ever used in a datacenter. I thought maybe the disks came with the server as a package.
The HP drives did come with the server as part of the purchase. It was purchased off of eBay. Not sure if it was used in a datacenter or not.
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
The HP drives did come with the server as part of the purchase. It was purchased off of eBay. Not sure if it was used in a datacenter or not.
I see, they are probably formatted with a non-standard sector size; most HP OEM disks are. Let's try the command I posted earlier on an HP drive that came with the server. Make sure you only have one disk installed.

You will probably have a hard time destroying the pool, like I did.
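
If you end up doing it from the command line, this is roughly what I mean (tank and /dev/sdX1 are placeholders; double-check the names, since this is destructive):
Code:
# zpool destroy tank             # irreversible, wipes the pool configuration
# zpool labelclear -f /dev/sdX1  # clear leftover ZFS labels from a disk that refuses to join a new pool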
 

Demonlinx

Explorer
Joined
Apr 11, 2022
Messages
53
I see, they are probably formatted with a non-standard sector size; most HP OEM disks are. Let's try the command I posted earlier on an HP drive that came with the server. Make sure you only have one disk installed.

You will probably have a hard time destroying the pool, like I did.
Do I attempt to destroy the pool and then remove all the other disks? Or should I remove all but one of the HP disks and then perform that command?
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
That's an awfully wide Z1! I use Z1 often, but generally it's 3 drives per stripe (better speeds for my uses). I wouldn't recommend 11 drives in a single Z1 stripe. I've seen this discussed on the ZFS subreddit (certain large-drive systems), but those were generally FreeBSD. One theory is that it's a driver bug. I'm not sure.

So, DEGRADED refers to checksum errors, but the drives likely still work; I see the resilver reported zero errors. Have you ever done a zpool clear? The clear should change the status if all is well.
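
For example (replace tank with your pool name):
Code:
# zpool clear tank
# zpool status tank    # state should go back to ONLINE if the errors were transient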
 

Demonlinx

Explorer
Joined
Apr 11, 2022
Messages
53
That's an awfully wide Z1! I use Z1 often, but generally it's 3 drives per stripe (better speeds for my uses). I wouldn't recommend 11 drives in a single Z1 stripe. I've seen this discussed on the ZFS subreddit (certain large-drive systems), but those were generally FreeBSD. One theory is that it's a driver bug. I'm not sure.
Should I recreate the pool with z2 or z3? We went with z1 more for available storage in the pool. We should have purchased larger drives. We just assumed z1 would be good enough.
So, DEGRADED refers to checksum errors, but the drives likely still work; I see the resilver reported zero errors. Have you ever done a zpool clear? The clear should change the status if all is well.
I just did a zpool clear. It looks alright now. Does this mean the drives are OK?
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
Might be. You may have solved the issue with one of the previous steps. ZFS would not report no errors on the resilver if things were not OK (presuming this was the only resilver; if not, it depends on whether the first resilver also finished with no errors corrected). If it doesn't reoccur, problem solved. When drives show DEGRADED but everything appears to be working and the resilver was clean, you run zpool clear to change the state back to OK. You don't just do that randomly if you have real errors, though. Your OP shows 1 write error and the rest reads. The various steps in this thread were good attempts to resolve the issue, but they were missing the clear to get rid of the DEGRADED state.

I don't know the SCALE UI yet, so I'm not sure if it has a clear function, having never had a failure yet. It probably does, I've just never looked.

So, watch it and see if any errors come back. If it ever happens again, try to fix something, resilver, and if all is OK, clear. Without the clear, there's no way to know which step might have helped.
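
Roughly the sequence I mean, assuming a pool named tank and placeholder disk names:
Code:
# zpool replace tank old-disk new-disk   # starts the resilver
# zpool status tank                      # wait for the resilver to finish with 0 errors
# zpool clear tank                       # only then clear the DEGRADED state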

As far as RaidZ goes, I would recommend a Z2 with that many drives. The old claim that RAID5 is obsolete and that you will never finish a rebuild without hitting a URE is a mostly untrue idea that someone started. First, a URE doesn't necessarily stop a RaidZ resilver like it did with old RAID. Second, you likely do weekly scrubs, which check the drives in advance, whereas people often didn't scrub old RAID5 arrays, so those could have had errors preventing rebuilds for months or even years. ZFS is a lot more fault tolerant. That being said, that many drives on a RaidZ1 is more risk than I would want. Still, RAID is never a backup, so if you have good backups, it's more about uptime than anything else.

As an aside, as a longtime system admin, I've probably done around 25 RAID5 rebuilds over the last 20 years. All 25 succeeded. If you believe that original article, I should have had basically a 0% chance of all of them working (OK, some number of zeroes after the decimal point followed by a 1). But we did scrub. Still, hardware RAID5 was much, much riskier than ZFS. ZFS <> RAID.

Before TrueNAS, I had a box with 10 drives, 3 stripes of 3-drive RaidZ (I wanted the sequential read performance for lots of video access). I could have done 5 mirrors, but that lost too much space. All the drives were used server pulls. I lost about 4 drives over 5 years, and every resilver worked fine, just as it should with ZFS. But if you truly had serious drive failures, two of them at once (more likely with your single large RaidZ1 set), it is much less fault tolerant. The general rule in ZFS that most people accept is that the largest set for a RaidZ1 is 3 drives, especially if you have larger drives.
 

Demonlinx

Explorer
Joined
Apr 11, 2022
Messages
53
Might be. You may have solved the issue with one of the previous steps. ZFS would not report no errors on the resilver if things were not OK (presuming this was the only resilver; if not, it depends on whether the first resilver also finished with no errors corrected). If it doesn't reoccur, problem solved. When drives show DEGRADED but everything appears to be working and the resilver was clean, you run zpool clear to change the state back to OK. You don't just do that randomly if you have real errors, though. Your OP shows 1 write error and the rest reads. The various steps in this thread were good attempts to resolve the issue, but they were missing the clear to get rid of the DEGRADED state.
I'll keep an eye on things and see what happens. I'm probably going to end up re-creating this pool anyways.
Before TrueNAS, I had a box with 10 drives, 3 stripes of 3-drive RaidZ (I wanted the sequential read performance for lots of video access). I could have done 5 mirrors, but that lost too much space. All the drives were used server pulls. I lost about 4 drives over 5 years, and every resilver worked fine, just as it should with ZFS. But if you truly had serious drive failures, two of them at once (more likely with your single large RaidZ1 set), it is much less fault tolerant. The general rule in ZFS that most people accept is that the largest set for a RaidZ1 is 3 drives, especially if you have larger drives.
Is there any way to change the RaidZ(x) of a pool once it's been created? I have an 8-drive pool that is also a Z1. Will this be an issue in the future?
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
We went with z1 more for available storage in the pool.
It depends on how many disks you use; for 6-12 disks I strongly recommend RaidZ2. You will not be happy when you lose all your data just because you felt like gaining a few terabytes. It's common to have another disk fail while resilvering. It happened to me once, and it was pretty stressful until the first disk finished; if yet another disk had failed during that resilver, bye bye all data. I always keep 3 spare disks on hand for situations like that, and I always buy disks from different vendors to avoid getting the same batch.
Is there any way to change the RaidZ(x) of a pool once it's been created?
No. Also, you cannot just remove disks; you need to destroy the pool first, unless you take a disk offline to replace it. Once you create the pool, it lives and dies with that number of disks, until you destroy it. You can replace the disks to increase the pool storage size, but you cannot expand or shrink the number of disks. That's why it's common to create small vdevs of 6 disks or so. I always create 12-disk pools and vdevs, which is the borderline for CMR disks when it comes time to resilver a failed disk. :smile:
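
Just as a rough sketch of what a RaidZ2 layout looks like from the shell (pool and disk names are placeholders; in TrueNAS you would normally build this in the UI):
Code:
# zpool create tank raidz2 sda sdb sdc sdd sde sdf
# zpool status tank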

Back to our discussion: after you format the disk, when you create a new pool with only one disk, it will give you only one option, so you cannot make mistakes.

BTW, did I miss something, or did you solve the issue?
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
Is there any way to change the RaidZ(x) of a pool once it's been created? I have an 8-drive pool that is also a Z1. Will this be an issue in the future?

You cannot change it; you have to destroy it and make a new pool.

But to emphasize again: so many people think ZFS = old RAID, and it's not that way. What can happen in an error situation, if data cannot be rebuilt due to multiple drive errors (which rarely happens, especially with weekly scrubbing), is that ZFS will tell you which files are suspect and may have errors. Often, you can still read all the data off except for perhaps a few possibly corrupt files. Assuming decent enterprise drives with TLER etc., a single URE does not abort the resilver and will not result in the loss of a pool. It is so different from the old RAID days. There are many, many examples if you follow the ZFS subreddit, and very few instances of unrecoverable pools. And like anything, you will of course hear about the failures far more than the successes.
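
For example, this lists any files ZFS flags as suspect after a scrub or resilver (the pool name is a placeholder):
Code:
# zpool status -v tank    # the "errors:" section lists files with permanent errors, if any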

Sure, if a drive totally fails and won't even power on, that's a different case. The odds of 2 at the same time are not very high. But the advice not to use Z1 with this many drives is solid.

Glad to see it's (seemingly) working. Would love to hear back in a week or so if all is well.
 

indivision

Guru
Joined
Jan 4, 2013
Messages
806
Any of the changes made could have fixed the issue in theory. But, it doesn't seem to me that this has been verified.

I would rebuild the pool with Z2 and run tests/scrubs to see if any of those errors come back before relying on the system.
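
For example, once the new pool is built (the pool name is a placeholder):
Code:
# zpool scrub tank
# zpool status tank    # check progress, then the error counters once it completes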
 

Demonlinx

Explorer
Joined
Apr 11, 2022
Messages
53
It depends on how many disks you use; for 6-12 disks I strongly recommend RaidZ2. You will not be happy when you lose all your data just because you felt like gaining a few terabytes. It's common to have another disk fail while resilvering. It happened to me once, and it was pretty stressful until the first disk finished; if yet another disk had failed during that resilver, bye bye all data. I always keep 3 spare disks on hand for situations like that, and I always buy disks from different vendors to avoid getting the same batch.

No. Also, you cannot just remove disks; you need to destroy the pool first, unless you take a disk offline to replace it. Once you create the pool, it lives and dies with that number of disks, until you destroy it. You can replace the disks to increase the pool storage size, but you cannot expand or shrink the number of disks. That's why it's common to create small vdevs of 6 disks or so. I always create 12-disk pools and vdevs, which is the borderline for CMR disks when it comes time to resilver a failed disk. :smile:

Back to our discussion: after you format the disk, when you create a new pool with only one disk, it will give you only one option, so you cannot make mistakes.

BTW, did I miss something, or did you solve the issue?
After talking with my supervisor, our plan is to create the pool from scratch with new, larger drives (4TB) so that we can do a Z2 and get similar available space.

What burn-in tests would you recommend I do on the new pool?

Should I still do the isolation test on the new drive that I put in, assuming I test with a single drive pool?
 

Demonlinx

Explorer
Joined
Apr 11, 2022
Messages
53
Any of the changes made could have fixed the issue in theory. But, it doesn't seem to me that this has been verified.

I would rebuild the pool with Z2 and run tests/scrubs to see if any of those errors come back before relying on the system.
As stated in my previous reply, we're going to rebuild the pool with larger drives and go with Z2. What tests/scrubs should I perform on these disks? Should these tests be done prior to building them into a pool?
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
I always do a long smartctl test for any drive before I use it in any manner.
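
For example (the device name is a placeholder; repeat for each drive):
Code:
# smartctl -t long /dev/sda       # start a long self-test; can take many hours on large drives
# smartctl -l selftest /dev/sda   # check the result once the test has finished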
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Of course. All SMART tests are non-destructive, and should be scheduled as regular tasks in TrueNAS to monitor the drives.
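
For a quick manual check from the shell (the device name is a placeholder):
Code:
# smartctl -H /dev/sda   # overall health assessment
# smartctl -a /dev/sda   # full SMART attributes plus the self-test log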
 