How do I manually replace a disk from the CLI in TrueNAS SCALE?

Evan Richardson

Explorer
Joined
Dec 11, 2015
Messages
76
I had a minor scare a few minutes ago. I had shut down my TrueNAS SCALE box in order to replace the HBA (SAS2 -> SAS3). With everything connected back up, I turned the box back on... to find a vdev with 2 faulted drives in it. I have 24 drives in 4x 6-drive RAIDZ2 vdevs, so any further failure in that vdev would have been bad. The issue was that some of my drives show up as "/dev/sdX" and some show up as GPTIDs (this is not my doing: when I did disk replacements in TrueNAS while migrating from 8TB -> 14TB, some got swapped to sdX names instead of GPTIDs).

Anyway, the failure message said that the 2 drives USED to be /dev/sdf2 and /dev/sdj2. I want to prevent this from happening in the future, so I'd like to switch all my messed-up labels to GPTIDs... but I don't know how.

I've run blkid to get the partition UUIDs I need; however, when I try offlining a disk, say /dev/sda2, and then run

Code:
zpool replace tank /dev/sda2 gptid/abcd1234efgh9876


I get a message saying

cannot open 'gptid/75a78de5-a710-11ec-ac8c-90e2ba266fca': no such device in /dev
must be a full path or shorthand device name
I can see all the IDs in /dev/disk/by-partuuid, so should I change gptid to partuuid/xyz?
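
For reference, a minimal sketch of how a kernel name such as sda2 can be mapped to its partition UUID by reading the symlinks in /dev/disk/by-partuuid. The function name `partuuid_for` and its optional directory argument are made up for illustration, and it assumes `readlink -f` is available (as in GNU coreutils).

```shell
# partuuid_for DEVNAME [DIR]
# Print the PARTUUID whose symlink in DIR (default /dev/disk/by-partuuid)
# resolves to a device node named DEVNAME (e.g. sda2).
partuuid_for() {
    dev_name=$1
    dir=${2:-/dev/disk/by-partuuid}
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        target=$(readlink -f "$link")
        if [ "$(basename "$target")" = "$dev_name" ]; then
            basename "$link"
            return 0
        fi
    done
    # No symlink pointed at DEVNAME.
    return 1
}
```

Calling `partuuid_for sda2` on the box itself would print the UUID for that partition, or return non-zero if nothing matches.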

This is what my array currently looks like:

Code:
          raidz2-0                                ONLINE       0     0     0
            sda2                                  ONLINE       0     0     0  (awaiting resilver)
            3c871a17-b77c-11ec-ac8c-90e2ba266fca  ONLINE       0     0     0
            b52e2217-ee76-11ec-8426-90e2ba266fca  ONLINE       0     0     0
            64c43df0-ef36-11ec-8426-90e2ba266fca  ONLINE       0     0     0
            f5305f13-e82f-11ec-8426-90e2ba266fca  ONLINE       0     0     0
            57481e93-e69a-11ec-8426-90e2ba266fca  ONLINE       0     0     0
          raidz2-1                                ONLINE       0     0     0
            b77b5a07-f62d-486a-bc6d-7560f5205bac  ONLINE       0     0     0  (resilvering)
            c3848061-ef38-11ec-8426-90e2ba266fca  ONLINE       0     0     0
            5cda7b92-ee77-11ec-8426-90e2ba266fca  ONLINE       0     0     0
            67090be1-473a-4007-bd4e-535dfb7bcedb  ONLINE       0     0     0  (resilvering)
            96db2c9e-e830-11ec-8426-90e2ba266fca  ONLINE       0     0     0
            cb58e481-e6ab-11ec-8426-90e2ba266fca  ONLINE       0     0     0
          raidz2-2                                ONLINE       0     0     0
            a781c58d-f75c-11ec-8426-90e2ba266fca  ONLINE       0     0     0
            f6d8c5be-01a5-11ed-877c-90e2ba266fca  ONLINE       0     0     0
            ea14bc9f-0203-11ed-877c-90e2ba266fca  ONLINE       0     0     0
            4441e149-025e-11ed-9a62-90e2ba266fca  ONLINE       0     0     0
            8a72c42a-f7d2-11ec-8426-90e2ba266fca  ONLINE       0     0     0
            4d3243a6-0153-11ed-877c-90e2ba266fca  ONLINE       0     0     0
          raidz2-5                                ONLINE       0     0     0
            sds2                                  ONLINE       0     0     0
            0fa198ca-591a-11ec-99c7-90e2ba266fca  ONLINE       0     0     0
            sdw2                                  ONLINE       0     0     0
            sdv2                                  ONLINE       0     0     0
            10f313e6-591a-11ec-99c7-90e2ba266fca  ONLINE       0     0     0
            sdx2                                  ONLINE       0     0     0
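
To pick out just the members that still show up under kernel names, status output like the above can be filtered. A minimal sketch: the function name `sd_members` is made up for illustration, and the pattern assumes sdX-style names followed by a partition number, so other naming schemes (NVMe, for example) would not match.

```shell
# sd_members: read `zpool status` output on stdin and print the vdev
# members whose name looks like sdX<partition> (e.g. sda2, sdaa2).
sd_members() {
    awk '$1 ~ /^sd[a-z]+[0-9]+$/ { print $1 }'
}
```

It would be used as `zpool status | sd_members`; members already listed by UUID are skipped.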


Thanks!
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Which version of SCALE are you running?
 

Evan Richardson

Explorer
Joined
Dec 11, 2015
Messages
76
OK, with enough googling you can find anything. I think I got it working. Doing an inline replace after offlining the drive didn't do anything; I got a message about sdX being part of an active pool. The way I got around this was to do the following:

Code:
zpool labelclear -f sdX2
zpool replace sirius sdX2 115b2ec3-591a-11ec-99c7-90e2ba266fca


So if you're trying to replace sda2 with its partition UUID of 12345-abcd-7890, you'd do:

Code:
zpool offline pool sda2
zpool labelclear -f sda2
zpool replace pool sda2 12345-abcd-7890


This will replace the drive in place, although it requires a resilver. I have not found a way to replace the drive by UUID without requiring a full resilver (the labelclear basically removes the ZFS info from the drive, forcing a full resilver, which for me takes ~13-14 hours per drive).
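
The three-command sequence above can be wrapped in a small helper. This is only a sketch: the function name `relabel_member` and the `DRY_RUN` variable are made up for illustration, and it prints the commands by default instead of running them, since labelclear is destructive and the vdev has reduced redundancy until the resilver finishes.

```shell
# relabel_member POOL OLD_NAME PARTUUID
# Print (or, with DRY_RUN=0, run) the offline / labelclear / replace
# sequence from the post above for one pool member.
relabel_member() {
    if [ "$#" -ne 3 ]; then
        echo "usage: relabel_member POOL OLD_NAME PARTUUID" >&2
        return 2
    fi
    pool=$1
    old=$2
    uuid=$3
    # DRY_RUN defaults to 1 so nothing touches the pool by accident.
    if [ "${DRY_RUN:-1}" = 1 ]; then
        run=echo
    else
        run=
    fi
    # Stop at the first step that fails.
    $run zpool offline "$pool" "$old" &&
    $run zpool labelclear -f "$old" &&
    $run zpool replace "$pool" "$old" "$uuid"
}
```

Running `relabel_member pool sda2 12345-abcd-7890` prints the commands for review; setting `DRY_RUN=0` would execute them, one member at a time, waiting for each resilver before starting the next.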
 