[SOLVED] Change fail-y HDD in a Mirror (move 2-way to 3-way then to 2-way again)

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
you might consider pausing the resilver process, removing the faulty disk, and letting it resilver only to the new one.

@HolyK
All right - would you mind explaining the procedure precisely? I don't want to mess anything up... TIA
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
Just a note, resilvering "from" the first one is not the way resilvering works. ZFS walks all its blocks and will use any valid copy of a block it is able to find. It is almost certainly copying valid data from the fail-y drive to the new drive along the way.
Am I reading the output wrong? To me it seems like the second drive has damaged data as well, so it is trying to get the proper data from disk one. That second disk is (almost) dead anyway, so why bother?

You have a valid point regarding "reading data from the first and second disk in order to resilver the third one", BUT what happens when it tries to read something FROM the second/faulty drive? Will it "give up" quickly, or will it spend a significant amount of time before it "gives up" because that sector is unreadable/slow-as-hell? Where is the moment ZFS says "that's enough" and kicks the whole drive out of the pool?

@StefLag67 What is the remaining time?
 
Last edited:

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
@HolyK about 30 hours or so.
Sometimes the progress is rather fast, sometimes it's super slow and the remaining time increases.
About 33% completed.
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
@StefLag67 In that case I would let it run. 30 hrs is acceptable, I guess.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Am I reading the output wrong? To me it seems like the second drive has damaged data as well, so it is trying to get the proper data from disk one. That second disk is (almost) dead anyway, so why bother?

You have a valid point regarding "reading data from the first and second disk in order to resilver the third one", BUT what happens when it tries to read something FROM the second/faulty drive? Will it "give up" quickly, or will it spend a significant amount of time before it "gives up" because that sector is unreadable/slow-as-hell? ... That was my point...

My point is that you were describing what I suspect was how you would imagine or maybe prefer it to work, but not how it actually works.

ZFS will read both (actually "all") sides of a mirror and verify that the checksums are correct. ZFS has very little control over what the drive chooses to do. See this article:

https://www.truenas.com/community/threads/checking-for-tler-erc-etc-support-on-a-drive.27126/

So if ZFS is reading a block and one of the component drives stalls to try a minute-long recovery process, then, yes, you are waiting the minute.

Now if ZFS reads a block and does not get what it expects, then ZFS will issue a write to the block with the correct data. This could also take time, and again, it is drive-dependent.

In a hardware RAID array, because there is redundancy, the drives provided by the vendor typically have TLER/ERC tuned to "give up" quickly so that everyone can get on with their lives. With ZFS and FreeNAS, it is up to you to understand these things.
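
For reference, SCT Error Recovery Control support can be queried (and, where the drive supports it, set) with smartctl. A minimal sketch, with /dev/ada1 as a placeholder device name:

# Query the drive's SCT ERC (TLER) read/write recovery timeouts
smartctl -l scterc /dev/ada1

# If the drive supports it, cap recovery at 7.0 seconds
# (values are in tenths of a second)
smartctl -l scterc,70,70 /dev/ada1

Many consumer drives simply report that SCT ERC is unsupported, in which case the drive's long internal recovery cannot be capped this way.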

In the first paragraph I quoted, you seem to be assuming that ZFS keeps some sort of scorecard on the health of the drives and will "learn" to ignore the second drive. It has a grasp of their health, which is reflected in the error stats, but what happens when a block on the first drive is bad? What ZFS is actually doing is seeing if it can repair the "bad" disk. It doesn't know or care that we are ultimately in a situation where that disk will be pulled and discarded; there's no ZFS function for that. The closest thing would be to pull the disk BEFORE beginning the rebuild, but then you lose redundancy, which is bad. The way it is being done, data is less likely to be lost.
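
To make the trade-off concrete, here is a minimal sketch of the two approaches, using the pool name from this thread (the gptid arguments are placeholders):

# Option A (what this thread does): attach the new disk as a third mirror
# member; the faulty disk stays in place, so some redundancy is kept
# throughout the resilver. Arguments: pool, an existing member, the new disk.
zpool attach Mirror3TB gptid/<existing-member> gptid/<new-disk>

# Option B ("pull first"): remove the faulty disk before rebuilding.
# The mirror runs with no redundancy until the resilver completes.
zpool offline Mirror3TB gptid/<faulty-disk>
zpool replace Mirror3TB gptid/<faulty-disk> gptid/<new-disk>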
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
@jgreco Point taken, and thanks for the details. I did not know how long it takes. And if it is a minute, then well... if it waits a minute a few times, that is OK, but if it waits a minute every second time it wants to read something... where is the limit on how long the end user waits? It depends on several things, but personally, if I saw "it will take like a month or a decade", I would just rip the faulty HDD out and hope that the other one survives a full resilver to the new disk. Yes, there could be a bad block on the first disk, and maybe the same data might still be intact on the other (failing) disk, but hey... my patience is limited. And those 7000 days from the post on the first page... well, I am wondering what could be so valuable that someone would wait that long rather than have a backup of it...

Anyway, for the second part... What I meant by "Disk1" and "Disk2" was related to the actually corrupted/broken data on the faulty drive. So if I have the same data on two physical devices (and only two, not three or four) and the data gets corrupted on one of them (HDD2), then from where does ZFS read the "correct" data if not from the "other" disk? That's what I meant. OK, I did not know that ZFS does not know/care about the physical disk it reads the data from (thanks for the explanation), but in this particular case it does not matter much. If I know I have two HDDs in a mirror and one of them has corrupted data, then logically ZFS has to read the proper ones and zeroes from the other one, right (even if it does not know/care about that)? And here I am coming back to the previous question... will I wait, or will I just remove the faulty HDD (I don't need to fix the blocks there, as I will scrap it anyway)?

Anyway, yes, I thought there was some "line" where ZFS says "screw this, I am not reading this garbage anymore, there is no hope...". Not necessarily the TLER way, but something "smarter". Apparently there is no such thing unless the whole disk dies. So it is up to "us" to decide how long we let ZFS try-hard to fix whatever is broken. Thanks for the clarification :)
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
Thank you both for the great insights. Amazing to see such community support.

Progress is slow but real :)

root@truenas[~]# zpool status -v Mirror3TB
  pool: Mirror3TB
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Mar 22 09:36:16 2021
        1.16T scanned at 13.2M/s, 1.16T issued at 13.1M/s, 2.02T total
        1.16T resilvered, 57.20% done, 19:13:03 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Mirror3TB                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d  ONLINE       0     0     0
            gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d  ONLINE       3     0     4  (resilvering)
            gptid/1c5bcd47-8ae8-11eb-82b9-c86000c2238d  ONLINE       0     0     0  (resilvering)

errors: No known data errors
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Anyway, yes, I thought there was some "line" where ZFS says "screw this, I am not reading this garbage anymore, there is no hope...". Not necessarily the TLER way, but something "smarter". Apparently there is no such thing unless the whole disk dies. So it is up to "us" to decide how long we let ZFS try-hard to fix whatever is broken. Thanks for the clarification :)
ZFS basically trusts the disks to do whatever is appropriate to read data. NAS/enterprise drives will only try for so long before reporting a fault to their controller, so ZFS would work as you expect and get valid data from the other drives. Consumer drives will try their utmost to provide the user with their (presumably important and non-redundant) data and can take minutes on every faulty sector to salvage whatever can be salvaged. All of this is perfectly reasonable under its own assumptions.
The issue lies in using consumer drives with an enterprise-minded storage system. Hopefully, at this point, the estimated time to complete is a worst case.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
ZFS basically trusts the disks to do whatever is appropriate to read data. NAS/enterprise drives will only try for so long before reporting a fault to their controller, so ZFS would work as you expect and get valid data from the other drives. Consumer drives will try their utmost to provide the user with their (presumably important and non-redundant) data and can take minutes on every faulty sector to salvage whatever can be salvaged. All of this is perfectly reasonable under its own assumptions.
The issue lies in using consumer drives with an enterprise-minded storage system. Hopefully, at this point, the estimated time to complete is a worst case.

^^^^^ That. Nicely stated. :smile:
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
Hi there,
Quick update with rather good news.
After almost two days, the resilver completed successfully this morning, without any errors! :smile:

I then detached the faulty drive, removed it physically, and put the new 6TB in its place.
I've extended the pool with it, and the resilvering has started - obviously much faster this time:
root@truenas[~]# zpool status -v Mirror3TB
  pool: Mirror3TB
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Mar 24 13:43:45 2021
        345G scanned at 728M/s, 60.9G issued at 129M/s, 2.02T total
        60.9G resilvered, 2.94% done, 04:26:48 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Mirror3TB                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d  ONLINE       0     0     0
            gptid/1c5bcd47-8ae8-11eb-82b9-c86000c2238d  ONLINE       0     0     0
            gptid/91c01e19-8c9e-11eb-9d42-c86000c2238d  ONLINE       0     0     0  (resilvering)

errors: No known data errors
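
For reference, the detach-and-extend steps above correspond roughly to these zpool commands (a sketch only; the gptid labels are taken from the two status outputs, and the TrueNAS UI additionally handles partitioning the new disk and creating its gptid):

# Detach the faulty 3TB member (the one showing READ/CKSUM errors earlier)
zpool detach Mirror3TB gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d

# Attach the new 6TB disk against an existing member;
# the resilver starts automatically
zpool attach Mirror3TB gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d gptid/91c01e19-8c9e-11eb-9d42-c86000c2238d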

Thanks again to everyone, you guys rock!
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
Once again, add that to the 3TB mirror, let it resilver. Tell ZFS to remove the final 3TB drive, and suddenly the pool will become a 6TB pool.

I think you can handle it from there.

AWESOME! Worked like a charm!
So happy :)
Thanks a million!
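
For anyone repeating this, the final step sketched as commands (hedged: the detached gptid is the remaining 3TB member from the outputs above, and the mirror can only grow once every member is 6TB):

# Remove the last 3TB member, leaving a two-way mirror of 6TB disks
zpool detach Mirror3TB gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d

# Let the pool grow into the larger disks: either enable autoexpand,
# or explicitly expand an online device
zpool set autoexpand=on Mirror3TB
zpool online -e Mirror3TB gptid/91c01e19-8c9e-11eb-9d42-c86000c2238d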
 