Drive Upgrade and Archiving

Patrick Ryan

Dabbler
Joined
Dec 18, 2014
Messages
25
By way of background, I have two iXsystems Mini 2s configured as office file servers, running very old versions of FreeNAS (9.10). They were nearing full capacity, so I recently built a new one from scratch using a spare chassis and four new 8TB WD Red drives in a RAIDZ2 configuration. (I know the Mini 2 is EOL, but since I had a spare one, I figured I could keep using them until at least one failed and I cycled in my spare.) The replacement is now up and running with the latest version of TrueNAS, so all good. I still have a spare (just a different chassis), so I'll go ahead and repeat the process with the other old unit still in service. I don't intend to let the versions lapse this time, so I'm hoping this kind of bare-metal rebuild won't be necessary for future upgrades. If we're nearing full capacity again and I still haven't had to replace a permanently failed chassis, I'd like to just swap out the HDDs - which leads me to my questions:
1. Is that architecture truly hot-swappable? That is, can I pop out the existing drives one by one with no power cycling?
2. Should I offline each drive first? Or just yank the drive with no warning to the system?
3. Assuming I let every new drive fully resilver before moving to the next in line, but have the server shares disabled so no new data can be written, what condition are the (working) removed drives left in? Could I theoretically stuff them into a different chassis and have a working RAID array? I like the idea of having an archival copy of the existing array as a hedge against future data loss, so this would be nice.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Some answers:
  1. I don't know, sorry. Perhaps someone else will know. But you can always offline the disks and shut down the server to perform a cold swap of the disks.
  2. The documentation is supposed to be pretty clear (I've not read that section). See the Documentation link at the top of every forum page.
  3. No. The only way to get a working archive from the old disks in a pool is if the pool is made entirely of Mirrors.
    And no, you DON'T have to stop any services or shares. They can continue to work as normal. Perhaps a bit slower...
For #3, let us say you have 2 vDevs of 2-disk Mirrors. You can break off 1 disk from each vDev and create a new pool, then perhaps add new disks to either pool for re-Mirroring. See the zpool split command's manual page for details.
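As a rough sketch of that mirror-split approach (the pool and new-pool names here are just placeholders, not anything from your system):

Code:
# Split one disk out of each mirror vDev into a new, independent pool
zpool split tank tank_archive
# Import the split-off pool (on this or another machine) to verify it
zpool import tank_archive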

However, removing the disks one at a time from a RAID-Zx vDev does not leave those disks "clean". Once a disk is removed, the pool keeps aging in ZFS transaction terms, so the next disk removed will be at a different point in the ZFS transaction history. That means the removed disks would be inconsistent with each other and could not be assembled into a usable pool. Perhaps a data recovery service might make sense of the mess...
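To see the point concretely (the device path below is just an example; point it at the disk's ZFS data partition): every disk's ZFS label records the transaction group it was last updated at, and disks pulled at different times will show different values.

Code:
# Print the ZFS label of a removed disk; the txg field shows where the
# pool was, transaction-wise, when this disk was last written
zdb -l /dev/ada1p2 | grep -w txg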
 

Patrick Ryan

Dabbler
Joined
Dec 18, 2014
Messages
25
Some answers:
  1. I don't know, sorry. Perhaps someone else will know. But you can always offline the disks and shut down the server to perform a cold swap of the disks.
  2. The documentation is supposed to be pretty clear (I've not read that section). See the Documentation link at the top of every forum page.
All the documentation around replacing disks seems to assume a failed (or failing) disk. I can't find any documentation that deals specifically with increasing pool size by swapping in larger disks, or whether there's a preferred technique for doing that. (For that matter, is my current technique of preparing a new chassis and drives and rsyncing over the contents a safer approach than relying on resilvering?)

However, removing the disks one at a time from a RAID-Zx vDev does not leave those disks "clean". Once a disk is removed, the pool keeps aging in ZFS transaction terms, so the next disk removed will be at a different point in the ZFS transaction history. That means the removed disks would be inconsistent with each other and could not be assembled into a usable pool. Perhaps a data recovery service might make sense of the mess...
I wondered whether the combination of not offlining the drives and preventing new data from being written to the pool (by disabling shares) would limit or eliminate ZFS activity to the point where the pool would still exist in a usable state on the removed drives. If it can't, no big deal - at least I know what scenario to plan for, hopefully several years from now. It certainly pushes me towards thinking that the best approach might be to build out new drives in a separate chassis and copy the contents over from the in-use server.

Thanks!
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
All the documentation around replacing disks seems to assume a failed (or failing) disk. I can't find any documentation that deals specifically with increasing pool size by swapping in larger disks, or whether there's a preferred technique for doing that. (For that matter, is my current technique of preparing a new chassis and drives and rsyncing over the contents a safer approach than relying on resilvering?)
...
It depends on what you want to accomplish. RSyncing won't necessarily copy over MS-Windows access control information.

A better method might be:
  1. Make recursive snapshot on the old pool
  2. Make an empty pool on the new server
  3. Use zfs send | ssh NEW_SERVER zfs receive to copy the pool from the old server to the new
  4. Delete the recursive snapshots on both old and new pools
There are multiple ways to increase a ZFS pool's size:
  • Replace existing drives with larger ones
  • Add new vDev to the pool
  • Use RSync to copy over the pool's data to a newer, larger pool
  • Use ZFS Send & Receive to copy over the old pool's data to a larger pool
The simplest is to replace the existing drives with larger ones.
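A rough sketch of that drive-replacement route from the shell (pool and device names are placeholders; each replacement has to finish resilvering before you start the next):

Code:
# Let the pool grow automatically once every disk in the vDev is larger
zpool set autoexpand=on tank
# Replace one old disk with a new, larger one (repeat for each disk)
zpool replace tank ada0 ada4
# Watch the resilver and wait for it to complete before the next replace
zpool status tank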
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
All the documentation around replacing disks seems to assume a failed (or failing) disk. I can't find any documentation that deals specifically with increasing pool size by swapping in larger disks, or whether there's a preferred technique for doing that.
The process is the same in any case: Put in the new drive, and replace (Storage>Pool>(gear)>Status>(drive)>(…)>Replace).
If you have a spare SATA port and the old drive is (partially) working, the preferred procedure is to do that with the old drive still in place (no loss of redundancy).
Otherwise, offline the old drive, take it out and put the new drive in its place.
(An intermediate option would be to put the new drive in place, but attach the old drive to a USB adapter, so it can still contribute redundant data.)
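For reference, a rough shell equivalent of the no-spare-port path (the GUI does the same job; pool and device names below are placeholders):

Code:
# Offline the old drive, pull it, and put the new drive in the same slot
zpool offline tank ada2
# Tell ZFS to rebuild onto the new disk now occupying that slot
zpool replace tank ada2
# Monitor the resilver
zpool status tank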

Except when splitting a mirror pool, there's no way to resilver and keep a working copy of the pool as it was.
 

Patrick Ryan

Dabbler
Joined
Dec 18, 2014
Messages
25
A better method might be:
  1. Make recursive snapshot on the old pool
  2. Make an empty pool on the new server
  3. Use zfs send | ssh NEW_SERVER zfs receive to copy the pool from the old server to the new
  4. Delete the recursive snapshots on both old and new pools

That sounds like exactly the right solution for my use case, thanks. Just out of curiosity, what does the snapshot accomplish in that case? Couldn't I just send the pool?

There are multiple ways to increase a ZFS pool's size:
  • Replace existing drives with larger ones
  • Add new vDev to the pool
  • Use RSync to copy over the pool's data to a newer, larger pool
  • Use ZFS Send & Receive to copy over the old pool's data to a larger pool
The simplest is to replace the existing drives with larger ones.

Well, if I'm already using all four of my drive bays with my existing pool, I don't really have any options for adding new vDevs or pools, right? And I get that just replacing the drives one-by-one is probably the easiest solution, but I'm a bit put off by that approach since it relies 100% on the resilvering process succeeding. (In addition to rendering the old drives useless for disaster recovery.) If I copy the pool to the new server, I can accommodate a failure partway through the process and just start again. And once I'm done I can power down the existing server, pop out the drives, and have an archival copy if old file recovery ever becomes necessary.
 

Patrick Ryan

Dabbler
Joined
Dec 18, 2014
Messages
25
The process is the same in any case: Put in the new drive, and replace (Storage>Pool>(gear)>Status>(drive)>(…)>Replace).
If you have a spare SATA port and the old drive is (partially) working, the preferred procedure is to do that with the old drive still in place (no loss of redundancy).
Otherwise, offline the old drive, take it out and put the new drive in its place.
(An intermediate option would be to put the new drive in place, but attach the old drive to a USB adapter, so it can still contribute redundant data.)

Except when splitting a mirror pool, there's no way to resilver and keep a working copy of the pool as it was.

Thanks. As I just indicated in my reply to Arwen, since I've got spare hardware, I think copying the pool over to a new machine is the safest solution, so I don't think I'm going to bother with resilvering. (Unless I have a drive failure, of course!) Cheers.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
...
Couldn't I just send the pool?
...
No, the ONLY way to get a clean, raw copy of a pool (or of individual datasets) is a read-only (R/O) snapshot. For ZFS Send to be fast, it reads the data at a lower level than the file system, but if the file system changes underneath it, chaos results. ZFS Send was designed to require snapshots, which are all R/O.

So, we make a full snapshot of the pool (aka a recursive snapshot of the top-level dataset), then use that read-only view of the pool's data for the ZFS Send.

That sounds like exactly the right solution for my use case, thanks. Just out of curiosity, what does the snapshot accomplish in that case?
...
A PERFECTLY clean copy.


In theory, if nothing is changing on the pool and its datasets, then an RSync can also make a perfectly clean copy.
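If you did go the RSync route, a minimal sketch might look like this (the paths and host name are placeholders; as noted above, Windows-style ACLs may not survive the copy):

Code:
# Copy the data, preserving permissions, times, hard links and xattrs
rsync -aHX --numeric-ids /mnt/old_pool/ NEW_SERVER:/mnt/new_pool/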


Here is the rough syntax for the ZFS Send & Receive. I use something very similar to copy my Linux computer's OS pool to alternate media for recovery purposes.
Code:
# Take a recursive snapshot of the source pool and all its datasets
zfs snapshot -r ${MY_SRC_POOL}@${MY_SNAP}
# Send the full replication stream and receive it, unmounted, on the new server
zfs send -Rpv ${MY_SRC_POOL}@${MY_SNAP} | ssh NEW_SERVER zfs receive -dFu ${MY_DEST_POOL}
# Once the copy has been verified, clean up the snapshots on both sides
zfs destroy -rv ${MY_SRC_POOL}@${MY_SNAP}
ssh NEW_SERVER zfs destroy -rv ${MY_DEST_POOL}@${MY_SNAP}

Replace the various variables with the real names (source pool, snapshot name, new server name, destination pool).
 