Upgrading the disks with the resilvering method?

Status
Not open for further replies.

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
At the rate my 7TB RAIDZ2 array is getting filled, I'm looking to upgrade the disks within 6 months. I don't plan to build a separate system but rather replace one disk at a time and resilver everything. Basically, I plan to move from WD 2TB disks to Hitachi Deskstar 4TB disks, which should expand my current array size to 14TB. The X7SPA-HF-D525-O motherboard handles 4TB drives fine, based on what Google says. Obviously, I will not use the disks at their full speed, but I'm happy with the 80MB/sec transfer rate I get now.

Now, can you please advise on the proper way to replace and resilver the array disks? I'm wondering how long it could take to resilver each disk on a 5TB volume.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Hi TECK,

I believe the current documentation has the correct procedure for doing what you ask. Thanks for posting about the 4TB disks, I think I am going to do the same, although I'd like to convert it to a 6 disk z2 array instead of the 5 disk array I currently have.

The documentation says "failed drive", but the procedure is the same when replacing with a larger drive.

http://doc.freenas.org/index.php/Volumes#Replacing_a_Failed_Drive_or_Zil_Device
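For reference, underneath that GUI procedure the ZFS side boils down to something like this; the pool name "tank" and the device names are only placeholders, and on FreeNAS the GUI also takes care of partitioning and labeling the new disk for you:
Code:
# take the old member offline, then physically swap in the bigger disk
zpool offline tank ada1p2
# with the new disk partitioned and sitting in the same slot, rebuild onto it
zpool replace tank ada1p2
# watch the resilver progress
zpool status tank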
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
On our systems, I think the last time I replaced a disk it took 10 hours. Since the array doesn't expand until all the disks are replaced, the resilver shouldn't take any longer when replacing a 2TB with a 4TB. After the array is expanded and those disks get more data added, THEN it could take a while...
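If the pool doesn't grow by itself once the last disk is done, the autoexpand property and an explicit online -e are the knobs to look at; "tank" and the device name are placeholders here:
Code:
# allow the pool to grow once all members are bigger
zpool set autoexpand=on tank
# ask ZFS to use the extra space on an already-online member
zpool online -e tank ada1p2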
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I'm wondering how long it could take to resilver each disk on a 5TB volume.
Depends on a number of things: the amount of data in the zpool, the load on the array and, lest we forget, the base FreeBSD ZFS code; there are changes between 8.2 and 8.3.
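A quick way to see how much data the resilver will actually have to rebuild, assuming a pool named tank:
Code:
# total space allocated in the pool (what a resilver has to walk)
zpool list tank
# the same thing from the filesystem side
zfs list tank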

Now, can you please advise on the proper way to replace and resilver the array disks?
The proper way is what ProtoSD already said.

Of course, if you are feeling adventurous, there is an alternative that is faster and places less strain on all the drives. I ran across a post by PJD@ earlier today; it also works under Solaris.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Of course, if you are feeling adventurous, there is an alternative that is faster and places less strain on all the drives. I ran across a post by PJD@ earlier today; it also works under Solaris.

Hmmm, interesting. I would think just to be safe it would be good to do a scrub after this method, but I guess that defeats the purpose of doing it this way.

I would also highly recommend using ddrescue versus dd. It allows you to resume the copy exactly where it was interrupted, which is just nice if you want to pause for some reason and resume later. Don't forget to include the log option if you do use ddrescue. Also, there are 2 versions of ddrescue; I prefer not to use the GNU version.
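If someone does reach for the GNU version anyway, the invocation is roughly this; the device names are examples only, so triple-check them against the drive serials first:
Code:
# clone the whole old disk onto the new one; the log file lets an interrupted copy resume where it left off
ddrescue -f /dev/ada1 /dev/ada6 /root/ada1-clone.log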
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
I would also highly recommend using ddrescue versus dd.
Wasn't thinking about that, good idea.

Given that you are increasing the size of the drives, at the least you will 'lose' the two redundant ZFS labels at the end of the drive. They will end up somewhere in the middle. I believe they will be recreated when you online the disk. You can always check with:
Code:
zdb -l /dev/adaXpX
Or whatever device node you used to create the zpool with.


Now, if you had all the replacement disks on hand to begin with, this is much safer (a rough command sketch follows below):

  1. Backup your FreeNAS config

  2. Export pool

  3. Match serials between old & new disks

  4. In a separate box, if you have one, carefully ddrescue old disk/partition to new disk/partition.
    You can do this at the same time for as many pairs as you have SATA ports. :D

  5. Install new dd'ed disks into the NAS and power it on.

  6. Import pool
If all goes as planned the pool should autoexpand after import and the ZFS labels should be fixed, I think.

If instead it blows up in your face, wipe the metadata from the new drives, remove the new drives, restore your old config, put back the old drives and do it the proper way.
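Roughly, the whole run looks like this from a shell; the pool name "tank" and the device names are examples only, adjust them to your own layout:
Code:
# on the NAS, after saving the config:
zpool export tank
# in the separate box, once per old/new pair (pairs can run in parallel):
ddrescue -f /dev/old_disk /dev/new_disk /root/old_disk-clone.log
# back on the NAS with only the new disks installed:
zpool import tank
zpool status tank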
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
Thank you for the useful info, guys. Unfortunately, the data I have on my NAS does not allow me to "play" with alternative solutions. If someone has tested the ddrescue procedure, I will gladly use it if it works 100%. Otherwise, I would rather stick with the longer but safer alternative.

I will definitely order all the disks at once. I have a spare box running CentOS 6, but I'm not comfortable fiddling with partitions and changing disk serials. I don't even know where the disk serials would be changed; obviously it has to be in some partition data. Anyway, I cannot take the chance and just "hope" things will work; I need confirmation from someone else who did it successfully and increased their array safely with the method you mentioned.
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
There is something unclear in the FreeNAS disk replacement procedure:
"Once the disk is showing as OFFLINE, click the disk's Replace button. Select the replacement disk from the drop-down menu and click the Replace Disk button. If the disk is being added to a ZFS pool, it will start to resilver."

In theory, if I swap the disks, the new disk should retain the same ID (i.e. ada1) and will be inserted into the array as ada1p2. I hope FreeNAS will not see the new disk with an incremented value like ada7 and add it to the array as ada7p2?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
The disk retaining the same ID (I think) depends on the hardware and how the driver works. Ultimately it really doesn't matter if it's assigned ata152. When you reboot, the numbers will be reset from 0 and it'll all still work. ;)

I did an upgrade on one of my FreeNAS servers by resilvering each disk. I had 2 vdevs so I did 1 disk in each VDEV at the same time. What I did:

1. Shut down the server
2. Pulled out 1 drive from each vdev (both were RAIDZ2)
3. Inserted 2 new disks.
4. Booted up the server.
5. Used the "replace" button and selected a new disk. (Resilvering would start after this step.)
6. Used the "replace" button for the other disk and selected the other new disk. (Resilvering would restart after this step.)
7. Waited for resilvering to complete (you can watch progress as shown below).
8. Go to step 1.

I had zero issues except having to wait for the resilvering.
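If you're impatient, you can watch the resilver from a shell; "tank" here is whatever your pool is called:
Code:
# shows resilver progress and an estimated time to completion
zpool status tank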
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
In theory, if I swap the disks, the new disk should retain the same ID (i.e. ada1) and will be inserted into the array as ada1p2. I hope FreeNAS will not see the new disk with an incremented value like ada7 and add it to the array as ada7p2?
Do you have an empty bay available? From what I've read it's considered better to replace a drive while the original is still around. Given that you are using double-parity it's not as much of a concern.

Assuming you have the new disk plugged into the same SATA port, I would not expect the disks to be enumerated any differently. If you are concerned about it you can always wire down the device names to a particular port.
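Wiring down is done with hints in /boot/loader.conf, something along these lines; the controller and bus numbers are only an example for the first onboard AHCI channel:
Code:
# pin CAM bus 0 to the first AHCI channel and ada0 to that bus
hint.scbus.0.at="ahcich0"
hint.ada.0.at="scbus0"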

I have a spare box running on CentOS 6 but I'm not comfortable fiddling with partitions and changing disk serials.
The match serials step was for keeping track of which disks are which. On a piece of paper, record the new disk serial #NEW***xx and beside it the old disk serial #OLD***xx. Then install the new disk in the correct spot, not that it will matter much to ZFS. You don't, not to mention can't, change the disk serial number with the method above.
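If you'd rather read the serials off the drives than off the stickers, smartctl will show them; same idea on FreeNAS or the CentOS box, just adaX vs sdX device names:
Code:
# print the drive's model and serial number
smartctl -i /dev/ada0 | grep -i serial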

No partition fiddling necessary. You would clone the contents of the partition if you used partitions. If you used whole disks, you don't have partitions and don't have to think about them.

Anyway, I cannot take the chance and just "hope" things will work; I need confirmation from someone else who did it successfully and increased their array safely with the method you mentioned.
The above method is safe, as safe as anything is with dd/ddrescue commands. If you are uncomfortable with them, which is understandable, then you should do it the proper way. As far as confirmation, a scrub will confirm it worked. No hope needed.

When I first posted the above, I don't think I would have tried it with my own data. I still wouldn't if I didn't have all the disks at the start. I would try it myself, but I still need to expand the number of drives I have.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
The "proper way" assuming 10 hours per resliver with 6 disks is 60 hours.

The adventurous way is quite a bit less. You have at least 7 SATA ports available, which means you can do up to 3 pairs of disks at once. Assuming a speed of 100 MB/s, 2TB to clone will take about 5.8 hours per disk. With 3 clones running in parallel, that's two batches, for a grand total of 11.6, let's say an even 12 hours.

That's 2 full days, 48 hours, less than the proper way!

TECK, given that you aren't doing this for another 6 months perhaps someone else will come along and try it first.
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
Do you have an empty bay available?
I don't, all SATA ports are full.

The "proper way" assuming 10 hours per resliver with 6 disks is 60 hours.
That is not an issue for me. I plan to get the 6 disks (I expect to spend about $1,500) and do 2 disks/day: one in the morning before I leave for work and one at night when I arrive home. It should take me 3 days to finish everything, and I will have a lot of spare space once completed.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
It should take me 3 days to finish everything, and I will have a lot of spare space once completed.
So, about 72 hours. Your procedure is sound, more than reasonable and shouldn't have much impact on you as you have noted.


For those interested in the adventurous way, I have successfully tested it in a FreeNAS-8.2.0-RELEASE-p1-x64 VM with a 3 x raidz1 array.

The original disks were 10GB each, 8GB partitions. The new disks are 20GB each, 18GB partitions. Following the procedure I outlined in [post=37607]post #7[/post] all worked as I thought.

The new disks only 'had' the first 2 ZFS labels, as the other two were in the middle of the partition and not where they belong. What's more, they referred to the old disk names. ZFS (zpool import) takes all this in stride for us. :) On import, the ZFS labels were updated and the pool autoexpanded to 36GB. Ran a scrub and it came back clean.
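For anyone repeating the test, the post-import sanity check was basically this; adjust the pool and partition names to your own:
Code:
# all four ZFS labels should now be present and point at the current device names
zdb -l /dev/ada0p2
# a clean scrub is the real confirmation
zpool scrub tank
zpool status -v tank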

Advantages:
  • Orders of magnitude faster depending on the number of drives & pool configuration
  • A pristine, untouched "backup" of your pool, it's the complete original pool, on the old drives
  • Very low stress on all the drives involved
    One long continuous streaming read/write is among the easiest workloads for a hard drive

Disadvantages:
  • Shall we say somewhat technical :p
  • Error prone & unforgiving
    Know with absolute certainty which disks are which and what ddrescue commands you are running
  • Others I haven't thought about?

I should be nuking my mirror next week or even this week in the course of some testing. I think I will simulate this by manually partitioning them to a 1TB partition and later growing them to their full size.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Here's some advice we found out after we were 3/4 of the way through. Our drives were set up as individual disks on a RAID controller. By enabling the disk write cache and disabling all read caching on the controller, we shaved almost 35% off the resilvering time. This seems logical since ZFS has its own read cache that is much smarter than the controller's. The write cache seemed to optimize the write pattern to the disks being resilvered.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
This seems logical since ZFS has its own read cache that is much smarter than the controller's. The write cache seemed to optimize the write pattern to the disks being resilvered.
Yes, I've read as much. In fact, depending on the number of disks behind the controller and the workload, you may get even greater performance by turning off the write cache. A controller flashed to IT mode, if possible, can also give an additional increase in performance.

AFAIK, a resilver should be limited by, among other things, the random write performance of the resilvering disk(s). A heavily fragmented pool will make this workload even worse. Assuming you are trying to avoid using the pool during the resilver I would be surprised if turning off the write cache would be any faster.

If I'm not mistaken, TECK has an HBA anyway and not a RAID card.
 

Daisuke

Contributor
Joined
Jun 23, 2011
Messages
1,041
A heavily fragmented pool will make this workload even worse.
Will a scrub help before I perform the upgrade? Is there a better method to defrag the volumes?

If I'm not mistaken, TECK has an HBA anyway and not a RAID card.
Correct, I use the 6 SATA ports available on the Supermicro motherboard.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,403
Will a scrub help before I perform the upgrade? Is there a better method to defrag the volumes?
There is currently only one way to defrag ZFS: back up the entire pool (zfs send or rsync), destroy the original pool, recreate the pool and restore from the backup.

What makes you think your pool is heavily fragmented? Two of the more significant things that cause fragmentation are creating & deleting lots & lots of snapshots and using up all the free space in your pool.
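If you ever do go that route, the copy step is essentially this; "tank" and the "backup" pool are placeholders, and you need somewhere with enough space to receive the stream:
Code:
# snapshot everything and replicate it to the backup pool
zfs snapshot -r tank@migrate
zfs send -R tank@migrate | zfs receive -F backup/tank
# then destroy and recreate tank, and send the data back the same way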
 