Upgrading drives, fail/swap or sync to new array?


cmh

Explorer
Joined
Jan 7, 2013
Messages
75
Sorry if this has been asked before, but when I use search, the links just lead to blank pages. I tried different browsers and even had other folks look at the links I was getting, with the same result. Not sure what's up there.

I've got a simple NAS setup: four 4TB drives in a striped pair of mirrors. I'm getting close to the 80% full zone where everything is predicted to slow down, so I'm getting some 8TB drives for the upgrade.

I'm trying to decide how to do it. One approach is the upgrade-in-place: fail one drive, rebuild onto an 8TB, and work through the array until it's all 8TB drives, at which point the pool should see the increased space. That should be fine, but there's a wrinkle: when I first set up the NAS I had just two drives, and I put data on it. As I ran out of room I added the other two drives, which means some of the data is heavily loaded on the first two spindles. I can see this in my Graphite metrics on certain read-heavy operations, as the reads on those two spindles are much heavier at the start.

So before I do the swap, I'm thinking I have a couple of options; curious what folks think the best approach might be:
  1. Leave it as is and just do the fail/swap to upgrade. The data's unbalanced, but it's been like that for a couple of years now and I don't notice any obvious ill effects.
  2. Do the fail/swap to upgrade, then use zfs send/recv to create a new copy of the datasets; once they're in sync, rename the two so the new copy replaces the original. This way the data will be evenly loaded across all four disks.
  3. Set up the array in a different host (no space to install the four new drives in the current system) and zfs send/recv over to there; once they're in sync, shut the hosts down and move the drives.
  4. ...something I haven't thought of yet.
#2 is tempting, as I get the easy online upgrade and can then hit the more important datasets first. Only I'm not sure if renaming the ZFS datasets after cloning might cause confusion for shares and such.

Not afraid of doing the zfs send/recv manually, as I've done that several times with ZFS on Linux and elsewhere.
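For concreteness, here's the sort of sequence I have in mind for the send/recv side of #2 or #3 (snapshot names made up):

Code:
# Initial full copy while everything stays online.
zfs snapshot -r sto/photos@migrate1
zfs send -R sto/photos@migrate1 | zfs recv sto/newphotos

# Later: quiesce writers, then send only what changed since the first snapshot.
zfs snapshot -r sto/photos@migrate2
zfs send -R -I @migrate1 sto/photos@migrate2 | zfs recv -F sto/newphotos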

Thanks!
 

Green750one

Dabbler
Joined
Mar 16, 2015
Messages
36
Personally, I'd put the new drives in a new box in a single RAIDZ2 array and move the data, or just upgrade the existing box to hold eight drives and do the same thing there.

 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
Fourth option: you don't have to offline a disk in order to replace it. If you have the spare SATA port, simply plug in one of the new disks, go to Storage -> select your pool -> Volume Status, click one of the drives, click the Replace button, pick the new drive. Sit back, wait for resilvering to complete, and repeat. Once you replace both disks in one of your mirrored pairs, you'll see the pool expand; once you replace both disks in the second mirrored pair, it will expand again.
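If you'd rather do it from the shell, the equivalent is roughly this (pool and device names are examples):

Code:
zpool set autoexpand=on tank   # let the pool grow once both disks in a mirror are bigger
zpool replace tank ada1 ada4   # resilver the old 4TB ada1 onto the new 8TB ada4
zpool status tank              # wait for the resilver to finish before doing the next disk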

Though I'd also favor RAIDZ2 for disks of that size, the in-place replacement would be much less disruptive.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
...something I haven't thought of yet.
I have moved my storage to a new system several times over the years, and I have, at one time or another, done all of the options suggested. The one I like best is the option @Green750one presented: a whole new system. I did that when I moved from a RAIDZ1 array of 1TB drives to a RAIDZ2 array of 2TB drives. I have since added a second vdev and, more recently, upgraded one of the vdevs to 4TB drives.

But the point is this: as drive capacity goes up (8TB or otherwise), resilvers take longer, so when you do have a drive failure there is a greater possibility of another drive failing before the replacement completes. With mirrored vdevs (as you have), a second failure in the same mirror could kill your whole pool. The capacity of a 4-drive RAIDZ2 pool is almost the same as two mirrors using the same size drives; the difference is that a RAIDZ2 pool can lose any two drives without loss of data.

I would advocate changing the way you arrange the drives, and possibly even going to 5 or 6 drives in RAIDZ2 instead of mirrors. I am looking into a new storage system at work, and I am desperately trying to avoid mirrors because they offer so little protection against a drive fault.
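To put rough numbers on it (ignoring metadata, padding, and slop space):

Code:
Two mirrors, 4 x 8TB:  2 vdevs x 8TB = ~16TB usable, survives 1 failure per mirror
RAIDZ2,      4 x 8TB:  (4 - 2) x 8TB = ~16TB usable, survives ANY two failures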
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
I love the idea of a whole new system with more drive bay capacity... but I'm guessing nobody's going to send me one as a gift, and since this system isn't having any issues other than space, that's money I don't need to spend right now. Aside from that, yes, I agree, that would be my preferred solution.

I understand the recommendation for Z2, but I do have this system backed up to another, older box (which will be getting the 4TB drives to replace the six 1TB drives in its Z1), and I've got cloud backups in addition to that, so I don't think going from the dual mirror to Z2 is worth the time and effort. I understand that a dual failure in one of the mirrors would take me offline, but that's a bit of an edge case. If and when I replace this box with something newer, I'll likely go with Z2 in a box with more drive slots.

I do like the option of replacing a drive in place vs. failing one out first, but I'll have to see if I've got the spare SATA port for that.

The one thing I'm not hearing is any concern about my oldest data being heavily loaded on the first two drives vs evenly distributed.

Thanks for the feedback!
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
The one thing I'm not hearing is any concern about my oldest data being heavily loaded on the first two drives vs evenly distributed.
Nope, no concern there.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
The one thing I'm not hearing is any concern about my oldest data being heavily loaded on the first two drives vs evenly distributed.
It might (and you commented on it) have an effect on your transfer speed, but the only way to fix it is to establish a new pool, even if all the drives are connected to the same server, and transfer the data from one pool to the other. Then it will be evenly distributed when it is written. Short of doing that, which takes extra resources to connect the additional drives, there is no solution. Over time, as old files are deleted and new files are written, it will balance out to some extent.

A lot of my old movies were encoded as AVI files, and I have been re-encoding them as M4V files because they play better through Plex. When I write the new file to the NAS, it is evenly distributed instead of sitting only on vdev0; then I delete the old file. Copy on write.
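You can force the same thing for files you're keeping: copy the file and rename it over the original, and the new copy gets allocated across all current vdevs. For example (paths made up, and note the old blocks stay allocated if a snapshot still references them):

Code:
cp /mnt/sto/movies/film.m4v /mnt/sto/movies/film.m4v.tmp
mv /mnt/sto/movies/film.m4v.tmp /mnt/sto/movies/film.m4v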
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
It might (and you commented on it) have an effect on your transfer speed, but the only way to fix it is to establish a new pool, even if all the drives are connected to the same server...

Ahhh, but if I were to snapshot and "zfs send sto/photos@snap | zfs recv sto/newphotos" to create a new dataset, and then renamed sto/photos to sto/oldphotos and sto/newphotos to sto/photos, then I've copied the old data to new disk blocks, which should be evenly distributed. Once I confirm that the new sto/photos is the proper thing, I can delete the original - no? Space wouldn't be an issue, as I'm just shy of 80% on the 4x4TB array, so I'll be at approximately 40% on the 4x8TB array.

My only concern is whether any shares - NFS, WebDAV, etc. - might somehow be locked to the original sto/photos and thus be messed up after I've diddled with making copies and renaming things.

Previously I haven't had the free space to even try this, but it was a thought that I've had in the past.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
My only concern is whether any shares - NFS, WebDAV, etc. - might somehow be locked to the original sto/photos and thus be messed up after I've diddled with making copies and renaming things.
I have done something similar in the past for a different reason, and it was really easy (in my case) to reconnect the share because I don't have any complicated permissions configured. Theoretically, it should work for you. I have reconnected a share after doing something like that by simply stopping the SMB service and starting it again; it might be worth trying.

I would suggest stopping the sharing service, doing the renames, then starting the service again. It should come right back with almost no downtime.
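Something like this, assuming the dataset names from your example (stop and start the services from the GUI, or however you normally manage them):

Code:
# sharing services stopped here
zfs rename sto/photos sto/oldphotos
zfs rename sto/newphotos sto/photos
# sharing services started here; the share path /mnt/sto/photos is unchanged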
 

cmh

Explorer
Joined
Jan 7, 2013
Messages
75
Yeah, that's what I was thinking - should be pretty simple, just quiesce the host as much as possible before doing the final renaming. Thanks for the feedback!
 