RAID-0 proactively replacing a failing drive

ericsmith881

Dabbler
Joined
Mar 26, 2021
Messages
29
This is a hypothetical case I'm exploring. Let's say I have a RAID-0 ZFS pool of 12 8TB drives. I use this pool for fast access to data that I don't mind losing, because it can be re-created (albeit that would take a while). After running the array for a bit, I notice one of the drives is showing bad S.M.A.R.T. stats. It hasn't failed yet, but I want to replace it before it does. The pool is about 60% full.

How would I go about doing this without losing data? Is this even possible? Backing up and restoring is not a viable option as it would take longer to back up the data than to re-create it.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Hypothetically speaking.

You would attach the new "spare" drive to the failing drive's vdev (I believe the UI allows this, check the three dots/cogwheel option beside the failing vdev) to create a mirror. ZFS copies all of the data over. Then you detach the failing drive.

Non-hypothetically, you weigh the cost of your time rebuilding/repairing the array if a single drive fails, and also consider that you have zero redundancy at the file/record level, so ZFS can detect corruption but not correct it.

You also make sure none of those drives are SMR, while you're here.
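If you want to check both of those from the shell, something along these lines works; da0 is just an example device name, so repeat for each member of the pool:

Code:
# print the drive identity; look up the model number to confirm it's CMR rather than SMR
smartctl -i /dev/da0

# dump the current SMART attributes (reallocated/pending sectors, etc.)
smartctl -A /dev/da0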
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Please use the proper terminology for ZFS. From your OP, I assume your pool is constructed as a stripe of 12x 8 TB drives. It is possible to swap out a VDEV in a stripe pool, but you'll likely need to do it from the command line, and there's a non-trivial chance you'll lose your pool, and thus your data. If that happens, rebuilding from backup is your only option.

For argument's sake, let's say your pool is constructed like so:

Code:
# zpool status tank

pool: tank
 state: ONLINE
  scan:
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          gptid/<disk 0's GUID>                         ONLINE       0     0     0
          gptid/<disk 1's GUID>                         ONLINE       0     0     0
          gptid/<disk 2's GUID>                         ONLINE       0     0     0
          gptid/<disk 3's GUID>                         ONLINE       0     0     0
          gptid/<disk 4's GUID>                         ONLINE       0     0     0
          gptid/<disk 5's GUID>                         ONLINE       0     0     0
          gptid/<disk 6's GUID>                         ONLINE       0     0     0
          gptid/<disk 7's GUID>                         ONLINE       0     0     0
          gptid/<disk 8's GUID>                         ONLINE       0     0     0
          gptid/<disk 9's GUID>                         ONLINE       0     0     0
          gptid/<disk 10's GUID>                        ONLINE       0     0     0
          gptid/<disk 11's GUID>                        ONLINE       0     0     0

errors: No known data errors


Let's say drive da0 is the one with SMART errors. Connect a new drive to your controller, so it comes up as da12. For that stripe, create a mirror via zpool attach tank gptid/<disk 0's GUID> da12. This will create a mirror VDEV at that stripe position over both da0 and da12.

After the resilver completes, detach da0 from the pool via zpool detach tank gptid/<disk 0's GUID> to convert the mirror VDEV back to a singleton.

You can then physically pull da0.
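Putting the whole sequence together, it looks something like this (da12 is a stand-in for whatever your new disk actually enumerates as):

Code:
# attach the new disk to the failing member, turning that stripe position into a mirror
zpool attach tank gptid/<disk 0's GUID> da12

# watch the resilver until it completes
zpool status tank

# once resilvered, detach the failing disk; the vdev goes back to a single drive
zpool detach tank gptid/<disk 0's GUID>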
 

ericsmith881

Dabbler
Joined
Mar 26, 2021
Messages
29
Definitely a no on the SMR stuff. Avoiding those like the plague.

As for attaching a spare, I kinda figured that would be the case but wasn't sure. Also, if the enclosure is full and adding a spare is not an option, I presume there is no way to "downsize" the pool (there is, after all, plenty of free space) to exclude the failing drive?

As for the cost of a rebuild, it's a weird case. The data would take less than a week to re-create. Backing up and restoring 40TB would take more time than that (it's an older LTO drive). RAID-0 is preferred for speed, with redundancy being unnecessary since the data can be recovered. Loss of data is more of an inconvenience in this case.
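For a rough sanity check on that (purely illustrative numbers, assuming something like LTO-5 at ~140 MB/s native): 40 TB / 140 MB/s ≈ 286,000 seconds, or about 3.3 days of pure streaming each way, so a backup plus a restore is already pushing a week before tape changes and real-world throughput losses are counted.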
 
Last edited:

ericsmith881

Dabbler
Joined
Mar 26, 2021
Messages
29
Sorry about the terminology. Old habits die hard. Yes, it's a stripe of 12x 8TB. Your solution is kinda what I figured. The main problem is that the enclosure only holds 12 drives, so adding a 13th wouldn't be an option. I suppose I could rig up some other way to attach a drive if I had to, but it would probably be more work than it's worth. I assume downsizing the pool to exclude the failing drive isn't an option? If there were a way to tell TrueNAS "hey, start moving everything off da6", I could replace the failing drive once that finished. I can live with 11x 8TB until a convenient time to wipe the whole array and re-create it as 12x 8TB, as I know there's no way to "upsize" an array without recreating it from scratch.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
As for attaching a spare, I kinda figured that would be the case but wasn't sure. Also, if the enclosure is full and adding a spare is not an option, I presume there is no way to "downsize" the pool (there is, after all, plenty of free space) to exclude the failing drive?

In a stripe pool there is. All devices are top-level vdevs, so you could zpool remove the failing member disk, physically replace it, and then extend the pool with a new vdev.

You end up with a little bit of memory usage for the "redirection pointers" to the data that was on that drive, but it does work here.
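As a rough sketch, with placeholder device names (grab the failing disk's actual gptid from zpool status first):

Code:
# evacuate the failing top-level vdev; its data gets copied onto the remaining disks
zpool remove tank gptid/<failing disk's GUID>

# watch the evacuation progress
zpool status tank

# after physically swapping the drive, extend the pool with the replacement
# (or do this step from the GUI, which handles the partitioning for you)
zpool add tank da12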
 

ericsmith881

Dabbler
Joined
Mar 26, 2021
Messages
29
In a stripe pool there is. All devices are top-level vdevs, so you could zpool remove the failing member disk, physically replace it, and then extend the pool with a new vdev.

You end up with a little bit of memory usage for "redirection pointers" of the data that was on that drive, but it does work here.
Hold up. You mean doing a zpool remove will work without data loss?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Hold up. You mean doing a zpool remove will work without data loss?

Yep. It's a somewhat recent addition to the OpenZFS toolbox, but controlled removal of certain devices (basically anything where RAIDZ isn't present) is possible.


From the zpool-remove man page:

Removes the specified device from the pool. This command supports removing hot spare, cache, log, and both mirrored and non-redundant primary top-level vdevs, including dedup and special vdevs.
Top-level vdevs can only be removed if the primary pool storage does not contain a top-level raidz vdev, all top-level vdevs have the same sector size, and the keys for all encrypted datasets are loaded.
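If you want to sanity-check those conditions first, something like this should do it (the zdb invocation is a common idiom, though on TrueNAS you may need to point it at the pool cachefile with -U):

Code:
# no raidz top-level vdevs should appear here
zpool status tank

# every top-level vdev should report the same ashift
zdb -C tank | grep ashift

# if native encryption is in use, keys should show as "available"
zfs get -r keystatus tank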
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
I assume since this is a relatively new thing there is no way to do it from the GUI and it's CLI-only, right?
No, you can do it from the GUI in certain cases--but I think it's only available there when you're dealing with mirrors.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Uh, why the mirroring step?

Simply:
zpool replace tank BAD_DISK NEW_DISK
Once the NEW_DISK is synced up, the BAD_DISK is automatically removed from the pool. No need for a two-step process. I did this a while back; it works.
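Spelled out with placeholder names (the failing disk's gptid comes from zpool status, and da12 stands in for the new disk):

Code:
# one-step swap: ZFS resilvers onto the new disk and drops the failing one when done
zpool replace tank gptid/<failing disk's GUID> da12

# monitor the resilver
zpool status tank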

Of course, there is other work (installing & partitioning the NEW_DISK the same as the others in the pool), and then removing the correct BAD_DISK.
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Of course, there is other work (installing & partitioning the NEW_DISK the same as the others in the pool), and then removing the correct BAD_DISK.
...or just do it through the GUI, which handles the partitioning and formatting. But you still need to install/remove the relevant hardware.
 