[SOLVED] Change fail-y HDD in a Mirror (move 2-way to 3-way then to 2-way again)

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
Hi there - beware, I'm a total newbie so please excuse stupid questions :)

Ok, so here's the deal:
I have a PC with 7 hard drives currently installed; the motherboard has a total of 8 SATA connectors.
3x 1TB in a ZFS pool (somewhat important data). These 3 disks are pretty old.
2x 3TB in a mirror pool (important data)
2x 4TB in an 8TB striped pool (not really important data)

I have also 2x 6TB brand new drives I just purchased.

The main issue right now is the 3TB mirror, where one of the drives has been showing bad sectors for a few days, and I guess it will die soon. I have shut down the server completely for now until I find (hopefully here) a good approach.
Ideally, I'd like to build a 6TB mirror and copy onto it the contents of both the 3TB mirror and the 3x 1TB pool, so that in the end I have the 6TB mirror (important data) and the 8TB stripe (non-important data).

What would you guys advise to do?

Many thanks!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Hi there - beware, I'm a total newbie so please excuse stupid questions :)

Well, the truly stupid question is the one you don't ask that ends up ruining your data because you didn't ask.

Ok, so here's the deal:
I have a PC with 7 hard drives currently installed; the motherboard has a total of 8 SATA connectors.
3x 1TB in a ZFS pool (somewhat important data). These 3 disks are pretty old.
2x 3TB in a mirror pool (important data)
2x 4TB in an 8TB striped pool (not really important data)

I have also 2x 6TB brand new drives I just purchased.

The main issue right now is the 3TB mirror, where one of the drives has been showing bad sectors for a few days, and I guess it will die soon. I have shut down the server completely for now until I find (hopefully here) a good approach.
Ideally, I'd like to build a 6TB mirror and copy onto it the contents of both the 3TB mirror and the 3x 1TB pool, so that in the end I have the 6TB mirror (important data) and the 8TB stripe (non-important data).

What would you guys advise to do?

Many thanks!

Okay, this isn't horrible. I'm taking "currently 7" and "total of 8" to mean you can put one more drive in.

Add one of your brand new 6TB drives. After you make *SURE* it is not an SMR drive, that is.

Widen your 2x3TB mirror pool onto it, making it a three-way mirror. This means that ZFS will copy all the data from your 3TB mirror onto the 6TB drive, making it part of the mirror. If there's a bad sector on one disk, it will get it from the other disk. No data loss.

Once that is resilvered and happy, tell ZFS to remove the fail-y 3TB drive from the mirror. Once it is happy with that, power down, remove the drive, and then put your other 6TB in that slot.

Once again, add that to the 3TB mirror, let it resilver. Tell ZFS to remove the final 3TB drive, and suddenly the pool will become a 6TB pool.
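
For reference, the CLI equivalent of that sequence looks roughly like this (a sketch only; the names in angle brackets are placeholders, not real device names, and the FreeNAS GUI does the same thing while handling partitioning and gptids for you):

Code:
# Sketch only -- <...> names are placeholders.
# 1) Widen the 2-way mirror to a 3-way mirror with the first new 6TB drive:
zpool attach Mirror3TB <existing-3TB-member> <new-6TB-drive-1>
zpool status Mirror3TB        # wait for the resilver to finish

# 2) Drop the fail-y 3TB drive, power down, physically swap in the second 6TB:
zpool detach Mirror3TB <faulty-3TB-member>

# 3) Attach the second 6TB, let it resilver, then drop the last 3TB drive:
zpool attach Mirror3TB <new-6TB-drive-1> <new-6TB-drive-2>
zpool detach Mirror3TB <remaining-3TB-member>

# The pool should grow to ~6TB once only the 6TB drives remain; if it does not,
# autoexpand may need to be enabled: zpool set autoexpand=on Mirror3TB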

I think you can handle it from there.
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
Widen your 2x3TB mirror pool onto it, making it a three-way mirror

Ok, want to make sure I do the right thing.
I clicked on "Pool operations" for my "Mirror3TB" pool, and then selected "Add Vdevs". But at this stage it tells me that I can't do that:
"WARNING: Adding data vdevs with different numbers of disks is not recommended. First vdev has 2 disks, new vdev has 1."

what am I doing wrong?
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
all right, using this
and this

it seems I would have to do this:
zpool attach Mirror3TB /dev/gptid/[gptid_of_the_2nd_existing_disk] /dev/gptid/[gptid_of_the_new_disk]

to find the GPTID of the 2nd existing disk, I would do
zpool status Mirror3TB
and look at the gptid of the disk at the bottom

to find the GPTID of the new disk,
glabel status

am I all good before I move on?
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
The steps would be correct for doing it on the CLI, but it is preferred to do it in the GUI. You're not adding a vdev, just widening the existing 2-way mirror vdev to 3-way.

Go to Storage > Pool, from the gear select "Status", and then "Extend" (or "Replace") from the 3-dot menu. FreeNAS will handle the gptids for you.
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
[Attached screenshot: Screenshot 2021-03-22 093946.jpg — resilver progress showing a very long estimated time remaining]


Holy sh**!! See you guys in 19 years
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
@StefLag67 Erwwwww ... I smell SMR drives? Can you give us the HDD model numbers?
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
Hi HolyK, I don't think so, it's a WDC WD60EFRX-68L0BN1.

My 3TB drives are WDC WD30EZRX-00DC0B0 and WDC WD30EZRX-00D8PB0, though.
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
@StefLag67 OK, so the WD60EFRX-??L0BN? is CMR and the WD30EZRX are actually WD Greens (so CMR as well).

What is the current time remaining? If it is still very long, it could be that the faulty HDD is simply dying and the resilver process is desperately trying to read data from it. In a situation like this the speed can be veeeeery slow.

So ideally give us the output of smartctl -a /dev/xxxx for both (all three?) drives, as well as zpool status <poolname> and maybe also the verbose output zpool status -v <poolname>.
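
If it helps, the attributes that matter most can be pulled straight out of the full smartctl dump, something like this (a sketch; replace adaX with each drive's actual device name):

Code:
# Sketch: show only the sector-health attributes (the red flags for a dying drive).
# Repeat for each suspect drive, substituting its real adaX device name.
smartctl -a /dev/adaX | grep -E 'Reallocated_Sector|Current_Pending_Sector|Offline_Uncorrectable'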
 
Last edited:

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
@StefLag67 OK, so the WD60EFRX-??L0BN? is CMR and the WD30EZRX are actually WD Greens (so CMR as well).

What is the current time remaining? If it is still very long, it could be that the faulty HDD is simply dying and the resilver process is desperately trying to read data from it. In a situation like this the speed can be veeeeery slow.
Well, it got better, so to speak, as the remaining time is "only" 1 day and 8 hours. Interestingly, from time to time it goes much faster, but it is indeed very slow.

So ideally give us the output of smartctl -a /dev/xxxx for both (all three?) drives, as well as zpool status <poolname> and maybe also the verbose output zpool status -v <poolname>.

Here is the pool status:

Code:
root@truenas[~]# zpool status -v Mirror3TB
  pool: Mirror3TB
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Mar 22 09:36:16 2021
        606G scanned at 21.1M/s, 408G issued at 14.2M/s, 2.02T total
        409G resilvered, 19.70% done, 1 days 09:18:54 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Mirror3TB                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d  ONLINE       0     0     0
            gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d  ONLINE       3     0     4  (resilvering)

For the HDDs, see the attached file.

Many thanks!
 

Attachments

  • 3TB drives.txt
    14.9 KB · Views: 191

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
Ok so the "WD30EZRX-00D8PB0" is wasted that is clear. Can you give output of glabel status

Also did you used "extend" or "replace" in the GUI when adding the 6TB disk?
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
@StefLag67 Uhm ... plaintext pleaseeeeee (uhm why do people do this ... posting screenshots of pure text :/)

One more output of this please : camcontrol devlist
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
@StefLag67 Uhm ... plaintext pleaseeeeee (uhm why do people do this ... posting screenshots of pure text :/)

One more output of this please : camcontrol devlist

didn't manage to copy the whole plain text, apologies... here you go:

Code:
root@truenas[~]# glabel status
Name  Status  Components
gptid/584648eb-6333-11eb-be68-c86000c2238d     N/A  ada0p2
gptid/58726038-6333-11eb-be68-c86000c2238d     N/A  ada1p2
gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d     N/A  ada2p2
gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d     N/A  ada3p2
gptid/acd4121e-5a7b-11eb-a2e1-c86000c2238d     N/A  ada4p2
gptid/ac5a7d31-5a7b-11eb-a2e1-c86000c2238d     N/A  ada6p2
gptid/2d820b9c-5b24-11eb-97b8-c86000c2238d     N/A  ada7p2
gptid/84437f12-5a71-11eb-b817-c86000c2238d     N/A  da0p1
gptid/84a9e152-5a71-11eb-b817-c86000c2238d     N/A  da0p2
gptid/1c5bcd47-8ae8-11eb-82b9-c86000c2238d     N/A  ada5p2
gptid/583668a7-6333-11eb-be68-c86000c2238d     N/A  ada1p1
gptid/581693c8-6333-11eb-be68-c86000c2238d     N/A  ada0p1

root@truenas[~]# camcontrol devlist
<ST4000VN008-2DR166 SC60>          at scbus2 target 0 lun 0 (pass0,ada0)
<ST4000VN008-2DR166 SC60>          at scbus3 target 0 lun 0 (pass1,ada1)
<WDC WD30EZRX-00DC0B0 80.00A80>    at scbus4 target 0 lun 0 (pass2,ada2)
<WDC WD30EZRX-00D8PB0 80.00A80>    at scbus5 target 0 lun 0 (pass3,ada3)
<ST31000340NS SN03>                at scbus6 target 0 lun 0 (pass4,ada4)
<WDC WD60EFRX-68L0BN1 82.00A82>    at scbus7 target 0 lun 0 (pass5,ada5)
<Hitachi HDS721010CLA332 JP4OA3GH>  at scbus8 target 0 lun 0 (pass6,ada6)
<WDC WD10EZEX-60WN4A0 01.01A01>    at scbus9 target 0 lun 0 (pass7,ada7)
<AHCI SGPIO Enclosure 2.00 0001>   at scbus10 target 0 lun 0 (pass8,ses0)
<Generic- SD/MMC 1.00>             at scbus12 target 0 lun 0 (da0,pass9)
 
Last edited by a moderator:

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
Thanks ... I am a bit confused by the zpool output.

The new 6TB is ada5 (WD60EFRX-68L0BN1)...

And as per your zpool status
gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d ONLINE 0 0 0
// This one is ada2 (WD30EZRX-00DC0B0) which has OK SMART

gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d ONLINE 3 0 4 (resilvering)
// And this is the faulty ada3 (WD30EZRX-00D8PB0) with bad SMART stats

So I wonder why ada3 shows as resilvering? I don't remember exactly what the zpool output looks like while a resilver is in progress, so it could be that it refreshes after it is done, but your output is confusing. It looks like it is actually resilvering ada3 from ada2 :oops:
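
For reference, the gptids in the pool can be matched back to devices from the glabel output above; a quick sketch (the short strings are just the first segment of each gptid):

Code:
# Sketch: map pool-member gptids back to ada devices via glabel
#   9419f4c7... -> ada2p2 (WD30EZRX-00DC0B0)
#   942e4f7e... -> ada3p2 (WD30EZRX-00D8PB0, the failing one)
#   1c5bcd47... -> ada5p2 (WD60EFRX-68L0BN1, the new 6TB)
glabel status | grep -E '9419f4c7|942e4f7e|1c5bcd47'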
 

StefLag67

Dabbler
Joined
Mar 21, 2021
Messages
16
You scare me
I hope nothing will go wrong with the data

Code:
root@truenas[~]# zpool status -v Mirror3TB
  pool: Mirror3TB
 state: ONLINE
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Mar 22 09:36:16 2021
        646G scanned at 14.0M/s, 606G issued at 13.1M/s, 2.02T total
        608G resilvered, 29.29% done, 1 days 07:41:51 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        Mirror3TB                                       ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/9419f4c7-5e82-11eb-9e20-c86000c2238d  ONLINE       0     0     0
            gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d  ONLINE       3     0     4  (resilvering)
            gptid/1c5bcd47-8ae8-11eb-82b9-c86000c2238d  ONLINE       0     0     0  (resilvering)

errors: No known data errors

Sounds like 2 of them are resilvering.
 

HolyK

Ninja Turtle
Moderator
Joined
May 26, 2011
Messages
654
OK, now it looks "better". The third disk was not in your previous post. Anyway, now it looks like the faulty disk AND the new one are both resilvering from the first one. Well, looking at the numbers of pending/uncorrectable sectors (1168/1191), that disk is basically ready for silicon heaven. And it might actually be slowing down the whole process.

What is the remaining time? If it is still crazy high, you might consider pausing the resilver process, removing the faulty disk, and letting it resilver only onto the new one. Yes, it will put you in a temporary single-point-of-failure scenario, but the second disk is dead anyway, so if the first one dies as well (hope not) I am not sure how much data you would be able to recover anyway. On the other hand, the first disk is under heavy load right now, so finishing the resilver sooner is better than letting it run under that load for days and risking another failure...
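
If you go that route, it would look roughly like this on the CLI (a sketch, using the faulty drive's gptid from the zpool status output above):

Code:
# Sketch: take the failing WD30EZRX-00D8PB0 (ada3) out of the mirror.
# "offline" stops all I/O to it; "detach" removes it from the vdev entirely.
zpool offline Mirror3TB gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d
# or, to drop it for good:
zpool detach Mirror3TB gptid/942e4f7e-5e82-11eb-9e20-c86000c2238d

zpool status -v Mirror3TB   # confirm only the healthy 3TB and the new 6TB remain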
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Just a note, resilvering "from" the first one is not the way resilvering works. ZFS walks all its blocks and will use any valid copy of a block it is able to find. It is almost certainly copying valid data from the fail-y drive to the new drive along the way.
 