[SOLVED] Change fail-y HDD in a Mirror (move 2-way to 3-way then to 2-way again)

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
Hi there - beware, I'm a total newbie so please excuse stupid questions :)

Ok, so here's the deal:
I have a PC with 7 hard drives currently installed; the motherboard has a total of 8 SATA connectors.
3x 1TB in a ZFS pool (somewhat important data). These 3 disks are pretty old.
2x 3TB in a mirror pool (important data)
2x 4TB in an 8TB striped pool (not really important data)

I have also 2x 6TB brand new drives I just purchased.

The main issue right now is the 3TB mirror, where one of the drives has been showing bad sectors for a few days, and I guess it will die soon. I have shut down the server completely for now until I find (hopefully here) a good approach.
Ideally, I'd like to build a 6TB mirror and copy onto it the contents of both the 3TB mirror and the 3x 1TB pool, so that in the end I have the 6TB mirror (important data) and the 8TB stripe (non-important data).

What would you guys advise to do?

Many thanks!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Hi there - beware, I'm a total newbie so please excuse stupid questions :)

Well, the truly stupid question is the one you don't ask that ends up ruining your data because you didn't ask.

Ok, so here's the deal:
I have a PC with 7 hard drives currently installed; the motherboard has a total of 8 SATA connectors.
3x 1TB in a ZFS pool (somewhat important data). These 3 disks are pretty old.
2x 3TB in a mirror pool (important data)
2x 4TB in an 8TB striped pool (not really important data)

I have also 2x 6TB brand new drives I just purchased.

The main issue right now is the 3TB mirror, where one of the drives has been showing bad sectors for a few days, and I guess it will die soon. I have shut down the server completely for now until I find (hopefully here) a good approach.
Ideally, I'd like to build a 6TB mirror and copy onto it the contents of both the 3TB mirror and the 3x 1TB pool, so that in the end I have the 6TB mirror (important data) and the 8TB stripe (non-important data).

What would you guys advise to do?

Many thanks!

Okay, this isn't horrible. I'm taking "currently 7" and "total of 8" to mean you can put one more drive in.

Add one of your brand new 6TB drives. After you make *SURE* it is not an SMR drive, that is.

Widen your 2x3TB mirror pool onto it, making it a three-way mirror. This means that ZFS will copy all the data from your 3TB mirror onto the 6TB drive, making it part of the mirror. If there's a bad sector on one disk, it will get it from the other disk. No data loss.

Once that is resilvered and happy, tell ZFS to remove the fail-y 3TB drive from the mirror. Once it is happy with that, power down, remove the drive, and then put your other 6TB in that slot.

Once again, add that to the 3TB mirror, let it resilver. Tell ZFS to remove the final 3TB drive, and suddenly the pool will become a 6TB pool.
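
For reference, the CLI equivalent of that sequence looks roughly like this (a sketch only; the names in angle brackets are placeholders, not real device names, and the FreeNAS GUI does the same thing while handling partitioning and gptids for you):

Code:
# Sketch only -- <...> names are placeholders.
# 1) Widen the 2-way mirror to a 3-way mirror with the first new 6TB drive:
zpool attach Mirror3TB <existing-3TB-member> <new-6TB-drive-1>
zpool status Mirror3TB        # wait for the resilver to finish

# 2) Drop the fail-y 3TB drive, power down, physically swap in the second 6TB:
zpool detach Mirror3TB <faulty-3TB-member>

# 3) Attach the second 6TB, let it resilver, then drop the last 3TB drive:
zpool attach Mirror3TB <new-6TB-drive-1> <new-6TB-drive-2>
zpool detach Mirror3TB <remaining-3TB-member>

# The pool should grow to ~6TB once only the 6TB drives remain; if it does not,
# autoexpand may need to be enabled: zpool set autoexpand=on Mirror3TB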

I think you can handle it from there.
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
Widen your 2x3TB mirror pool onto it, making it a three-way mirror

Ok, want to make sure I do the right thing.
I clicked on "Pool operations" for my "Mirror3TB" pool, and then selected "Add Vdevs". But at this stage it tells me that I can't do that:
"WARNING: Adding data vdevs with different numbers of disks is not recommended. First vdev has 2 disks, new vdev has 1."

what am I doing wrong?
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
all right, using this
and this

it seems I would have to do this:
zpool attach Mirror3TB /dev/gptid/[gptid_of_the_2nd_existing_disk] /dev/gptid/[gptid_of_the_new_disk]

to find the GPTID of the 2nd existing disk, I would do
zpool status Mirror3TB
and look at the gptid of the disk at the bottom

to find the GPTID of the new disk,
glabel status

am I all good before I move on?
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
The steps would be correct for doing it on the CLI, but it is preferred to do it in the GUI. You're not adding a vdev, just widening the existing 2-way mirror vdev to 3-way.

Go to Storage > Pool, from the gear select "Status", and then "Extend" (or "Replace") from the 3-dot menu. FreeNAS will handle the gptids for you.
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
[Attached screenshot: Screenshot 2021-03-22 093946.jpg — resilver progress showing a very long estimated time remaining]


Holy sh**!! See you guys in 19 years
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
@StefLag67 Erwwwww ... I smell SMR drives? Can you give us the HDD model numbers?
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
Hi HolyK, I don't think so, it's a WDC WD60EFRX-68L0BN1.

My 3TB drives are WDC WD30EZRX-00DC0B0 and WDC WD30EZRX-00D8PB0, though.
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
@StefLag67 OK, so the WD60EFRX-??L0BN? is CMR and the WD30EZRX are actually WD Greens (so CMR as well).

What is the current time remaining? If it is still very long, it could be that the faulty HDD is simply dying and the resilver process is desperately trying to read data from it. In a situation like this the speed can be veeeeery slow.

So ideally give us the output of smartctl -a /dev/xxxx for both (all three?) drives, as well as zpool status <poolname> and maybe also the verbose output zpool status -v <poolname>.
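
If it helps, the attributes that matter most can be pulled straight out of the full smartctl dump, something like this (a sketch; replace adaX with each drive's actual device name):

Code:
# Sketch: show only the sector-health attributes (the red flags for a dying drive).
# Repeat for each suspect drive, substituting its real adaX device name.
smartctl -a /dev/adaX | grep -E 'Reallocated_Sector|Current_Pending_Sector|Offline_Uncorrectable'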
 
Last edited:

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
@StefLag67 OK, so the WD60EFRX-??L0BN? is CMR and the WD30EZRX are actually WD Greens (so CMR as well).

What is the current time remaining? If it is still very long, it could be that the faulty HDD is simply dying and the resilver process is desperately trying to read data from it. In a situation like this the speed can be veeeeery slow.
Well, it got better, so to speak, as the remaining time is "only" 1 day and 8 hours. Interestingly, from time to time it goes much faster, but it is indeed very slow.

So ideally give us the output of smartctl -a /dev/xxxx for both (all three?) drives, as well as zpool status <poolname> and maybe also the verbose output zpool status -v <poolname>.

Here is the pool status:

Code:
root@truenas[~]# zpool status -v Mirror3TB
  pool: Mirror3TB
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Mar 22 09:36:16 2021
        606G scanned at 21.1M/s, 408G issued at 14.2M/s, 2.02T total
        409G resilvered, 19.70% done, 1 days 09:18:54 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Mirror3TB                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d  ONLINE       0     0     0
            gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d  ONLINE       3     0     4  (resilvering)

For the HDDs, see the attached file.

Many thanks!
 

Attachments

  • 3TB drives.txt
    14.9 KB · Views: 191

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
Ok so the "WD30EZRX-00D8PB0" is wasted that is clear. Can you give output of glabel status

Also did you used "extend" or "replace" in the GUI when adding the 6TB disk?
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
@StefLag67 Uhm ... plaintext pleaseeeeee (uhm why do people do this ... posting screenshots of pure text :/)

One more output of this please : camcontrol devlist
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
@StefLag67 Uhm ... plaintext pleaseeeeee (uhm why do people do this ... posting screenshots of pure text :/)

One more output of this please : camcontrol devlist

didn't manage to copy the whole plain text, apologies... here you go:

Code:
root@truenas[~]# glabel status
Name  Status  Components
gptid/584648eb-6333-11eb-be68-c86000c2238d     N/A  ada0p2
gptid/58726038-6333-11eb-be68-c86000c2238d     N/A  ada1p2
gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d     N/A  ada2p2
gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d     N/A  ada3p2
gptid/acd4121e-5a7b-11eb-a2e1-c86000c2238d     N/A  ada4p2
gptid/ac5a7d31-5a7b-11eb-a2e1-c86000c2238d     N/A  ada6p2
gptid/2d820b9c-5b24-11eb-97b8-c86000c2238d     N/A  ada7p2
gptid/84437f12-5a71-11eb-b817-c86000c2238d     N/A  da0p1
gptid/84a9e152-5a71-11eb-b817-c86000c2238d     N/A  da0p2
gptid/1c5bcd47-8ae8-11eb-82b9-c86000c2238d     N/A  ada5p2
gptid/583668a7-6333-11eb-be68-c86000c2238d     N/A  ada1p1
gptid/581693c8-6333-11eb-be68-c86000c2238d     N/A  ada0p1

root@truenas[~]# camcontrol devlist
<ST4000VN008-2DR166 SC60>          at scbus2 target 0 lun 0 (pass0,ada0)
<ST4000VN008-2DR166 SC60>          at scbus3 target 0 lun 0 (pass1,ada1)
<WDC WD30EZRX-00DC0B0 80.00A80>    at scbus4 target 0 lun 0 (pass2,ada2)
<WDC WD30EZRX-00D8PB0 80.00A80>    at scbus5 target 0 lun 0 (pass3,ada3)
<ST31000340NS SN03>                at scbus6 target 0 lun 0 (pass4,ada4)
<WDC WD60EFRX-68L0BN1 82.00A82>    at scbus7 target 0 lun 0 (pass5,ada5)
<Hitachi HDS721010CLA332 JP4OA3GH>  at scbus8 target 0 lun 0 (pass6,ada6)
<WDC WD10EZEX-60WN4A0 01.01A01>    at scbus9 target 0 lun 0 (pass7,ada7)
<AHCI SGPIO Enclosure 2.00 0001>   at scbus10 target 0 lun 0 (pass8,ses0)
<Generic- SD/MMC 1.00>             at scbus12 target 0 lun 0 (da0,pass9)
 
Last edited by a moderator:

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
Thanks ... I am a bit confused by the zpool output.

The new 6TB is ada5 (WD60EFRX-68L0BN1)...

And as per your zpool status
gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d ONLINE 0 0 0
// This one is ada2 (WD30EZRX-00DC0B0) which has OK SMART

gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d ONLINE 3 0 4 (resilvering)
// And this is the faulty ada3 (WD30EZRX-00D8PB0) with bad SMART stats

So I wonder why ada3 shows as resilvering? I don't remember exactly what the zpool output looks like while a resilver is in progress, so it could be that it refreshes after it is done, but your output is confusing. It looks like it is actually resilvering ada3 from ada2 :oops:
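
For reference, the gptids in the pool can be matched back to devices from the glabel output above; a quick sketch (the short strings are just the first segment of each gptid):

Code:
# Sketch: map pool-member gptids back to ada devices via glabel
#   9419f4c7... -> ada2p2 (WD30EZRX-00DC0B0)
#   942e4f7e... -> ada3p2 (WD30EZRX-00D8PB0, the failing one)
#   1c5bcd47... -> ada5p2 (WD60EFRX-68L0BN1, the new 6TB)
glabel status | grep -E '9419f4c7|942e4f7e|1c5bcd47'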
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
You scare me
I hope nothing will go wrong with the data

Code:
root@truenas[~]# zpool status -v Mirror3TB
  pool: Mirror3TB
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Mar 22 09:36:16 2021
        646G scanned at 14.0M/s, 606G issued at 13.1M/s, 2.02T total
        608G resilvered, 29.29% done, 1 days 07:41:51 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Mirror3TB                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d  ONLINE       0     0     0
            gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d  ONLINE       3     0     4  (resilvering)
            gptid/1c5bcd47-8ae8-11eb-82b9-c86000c2238d  ONLINE       0     0     0  (resilvering)

errors: No known data errors

Sounds like 2 of them are resilvering.
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
OK, now it looks "better". The third disk was not in your previous post. Anyway, now it looks like the faulty disk AND the new one are both resilvering from the first one. Well, looking at the numbers of pending/uncorrectable sectors (1168/1191), that disk is basically ready for silicon heaven. And it might actually be slowing down the whole process.

What is the remaining time? If it is still crazy high, you might consider pausing the resilver process, removing the faulty disk, and letting it resilver only onto the new one. Yes, it will put you in a temporary single-point-of-failure scenario, but the second disk is dead anyway, so if the first one dies as well (hope not) I am not sure how much data you would be able to recover anyway. On the other hand, the first disk is under heavy load right now, so finishing the resilver sooner is better than letting it run under that load for days and risking another failure...
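
If you go that route, it would look roughly like this on the CLI (a sketch, using the faulty drive's gptid from the zpool status output above):

Code:
# Sketch: take the failing WD30EZRX-00D8PB0 (ada3) out of the mirror.
# "offline" stops all I/O to it; "detach" removes it from the vdev entirely.
zpool offline Mirror3TB gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d
# or, to drop it for good:
zpool detach Mirror3TB gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d

zpool status -v Mirror3TB   # confirm only the healthy 3TB and the new 6TB remain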
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Just a note, resilvering "from" the first one is not the way resilvering works. ZFS walks all its blocks and will use any valid copy of a block it is able to find. It is almost certainly copying valid data from the fail-y drive to the new drive along the way.
 