faster way to replace SATA HDDs with SAS HDDs

digity

Contributor
Joined
Apr 24, 2016
Messages
156
My 8-bay TrueNAS server currently has all 6 TB SATA HDDs, but I need those SATA HDDs for other projects. The HBA and backplane are SAS compatible and I have a ton of spare 6 TB SAS HDDs. Do I have to go through the long process of replacing each SATA HDD with a SAS HDD and resilvering, one at a time, or is there a faster way to swap out these drives?
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
You DO need to replace each drive & resilver. Not sure why you think there is some magical method to have your cake and eat it too.

However, please list the type of redundancy you have in your TrueNAS. If you are using RAID-Z2 or -Z3, you can replace more than 1 drive at a time. However, if you replace 2 drives on a RAID-Z2 or 3 drives on a RAID-Z3, you do so without any redundancy. Be sure you have GOOD backups and a recent scrub without errors.
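For example, a quick health check from a shell before you give up redundancy might look like this (the pool name "tank" is just a placeholder for whatever yours is called):

  # check pool layout, last scrub result and any errors
  zpool status -v tank
  # kick off a fresh scrub and let it finish cleanly before replacing drives
  zpool scrub tank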

Further, someone here in the forums recently mentioned that he tried to replace 2 disks, but only 1 was active. It appeared the other was waiting for the first to finish. Not sure why that happened, or if it was a GUI bug.

Last, more recent versions of ZFS (and the TrueNAS releases that use them) have improvements to the disk resilver process. If you are using an older version of TrueNAS, a backup and update would be in order. I don't remember in which TrueNAS version that specific improvement became available.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Do I have to go through the long process of replacing each SATA HDD with a SAS HDD and resilvering, one at a time, or is there a faster way to swap out these drives?
If you have available ports, you can set up a second pool with the new disks and follow this process:
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
One advantage of the copy-pool-to-another-pool method that @sretalla mentioned is that it removes much of the fragmentation.

There is yet another way. If you truly have enough extra disk ports & power to drive them, you can perform an in-place replacement on as many drives at once as you have free ports & power for, so even 6 at a time. The server will be slow, though: each disk being replaced essentially becomes a mirror of its source disk until the process is complete, at which point the source disk is automatically detached. Once all the new disks are resynced, you can remove the old disks and, if needed, continue the replacements using any freed-up disk ports.
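Roughly, from the command line that is just several replace operations issued back to back (the pool and device names below are only placeholders; the TrueNAS GUI does the same thing per disk):

  # each replace attaches the new disk as a temporary mirror of the old one
  zpool replace tank da0 da8
  zpool replace tank da1 da9
  zpool replace tank da2 da10
  # watch the resilver; the old disks detach automatically when their replacement finishes
  zpool status -v tank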
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
I don't have space for more drives, nor room for an external HBA card, as this is a mini-ITX server. I've started the replace-and-resilver process; the first resilver took about 28 hours, but this second resilver has been running for over 63 hours now. It's currently at 24% and the estimated remaining time is over 8 days and 10 hours. Any idea why this second resilver is taking significantly longer?


P.S. - Not sure if this matters, but for both replacements the pool was essentially full when the resilver started; partway through each resilver I got used space down to 80% and 70% respectively.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Please list the manufacturer and exact model of both the existing disks, and the new replacement disks.

No one should be using an SMR (Shingled Magnetic Recording) hard disk with TrueNAS & ZFS in any RAID configuration. However, I am not aware of any SAS hard disks that use SMR.
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
Please list the manufacturer and exact model of both the existing disks, and the new replacement disks.

No one should be using an SMR (Shingled Magnetic Recording) hard disk with TrueNAS & ZFS in any RAID configuration. However, I am not aware of any SAS hard disks that use SMR.

The outgoing drive is a WD WD6002FZWX (6 TB, 7200 RPM, 128 MB cache) and the SAS replacements are all HGST HUS726060ALS640 (6 TB, 7200 RPM, 64 MB cache).
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Both seem to be CMR, so I don't have any other thoughts on why your second disk replacement is going so slowly.
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
This could potentially take a month to replace all 8 SATA drives. But, now that I think about it, I do have a 24-bay ATX server with a 24-bay disk shelf from a recently shut down project (Chia), and all the spare 6 TB SAS drives I want to replace the SATA drives with are already in there. So, while this current pool is still resilvering, is it possible to...:

1) shutdown this TrueNAS server and put the boot SSD and 8 x 6TB pool drives (volume1) in the 24 bay server (I assume it'll resume resilvering upon boot up)
2) create a new 8x 6TB pool with all SAS drives (volume2) mirroring the outgoing volume1 setup (2 vdevs of 4 drives in RAIDZ1... or other suggestion)
3) replicate the volume1 pool/dataset to the volume2 pool/dataset (again, while volume1 likely still resilvering)
4) export the volume1 pool, physically remove volume1's drives, rename the volume2 pool to volume1
5) put the TrueNAS boot SSD and now all SAS based volume1 pool drives back in the mini-ITX chassis

The hope is to drastically speed up the process of replacing all SATA drives. Is the above possible and will it be significantly faster than waiting for the remaining 7 drives to resilver?
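For step 3, I'm assuming the replication from the command line would be roughly something like this (the snapshot name is just a placeholder, and I'd double check the flags before running it):

  # recursive snapshot of everything on the old pool
  zfs snapshot -r volume1@migrate
  # send the whole dataset tree to the new pool; -F allows the new pool's root dataset to be overwritten
  zfs send -R volume1@migrate | zfs receive -F volume2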
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
2) create a new 8x 6TB pool with all SAS drives (volume2) mirroring the outgoing volume1 setup (2 vdevs of 4 drives in RAIDZ1... or other suggestion)
I'd suggest a single 8-drive RAIDZ2 here instead, but other than that your suggested plan of action should not only work but be much quicker after the rebuild completes - I/O during a RAIDZ rebuild is generally categorized as "poor"
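If you were doing it from the CLI instead of the GUI, the layout I'm suggesting would be roughly the following (device names are placeholders, and on TrueNAS the GUI/middleware is the supported way to create pools):

  # one 8-wide RAIDZ2 vdev: 6 drives' worth of data, any 2 drives can fail
  zpool create volume2 raidz2 da8 da9 da10 da11 da12 da13 da14 da15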
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
I'd suggest a single 8-drive RAIDZ2 here instead, but other than that your suggested plan of action should not only work but be much quicker after the rebuild completes - I/O during a RAIDZ rebuild is generally categorized as "poor"

So wait the 8 days for the current resilvering to complete before performing the above migration to SAS drive pool?

Also, why 1x RAIDZ2 instead of 2x RAIDZ1 vdevs? I thought multiple vdevs were the way to go for faster I/O speeds? What if I wanted to easily expand the pool size in the future and didn't want to have to do it 8 drives at a time - is 2 vdevs then the better setup (for adding 4 drives at a time instead)?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
So wait the 8 days for the current resilvering to complete before performing the above migration to SAS drive pool?

Given that you've got two RAIDZ vdevs, I would personally wait for the current resilver to finish. You could start a copy but I'd expect the repair timer to skyrocket once you start the copy, and I would prioritize keeping your existing data safe.

Also, why 1x RAIDZ2 instead of 2x RAIDZ1 vdevs? I thought multiple vdevs were the way to go for faster I/O speeds? What if I wanted to easily expand the pool size in the future and didn't want to have to do it 8 drives at a time - is 2 vdevs then the better setup (for adding 4 drives at a time instead)?

I'd need to know more about the workload, but from some of the other questions you've asked, this pool seems intended for backups and other sequential-heavy workloads.

Multiple vdevs are usually used when you're trying to push for lower latency on individual operations, or to create smaller failure domains once you reach a given number of total drives. A single RAIDZ2 gives you the same usable space (6 data drives) but lets you survive any two disks failing simultaneously, whereas the 2x 4-drive RAIDZ1 layout will fail completely if two drives fail in the same vdev - and at this moment, you are inducing a single-drive failure with each rebuild. Once you're in the larger 24-bay chassis, you could add the new drives and perform an online replacement, but from a data safety perspective I'm of the opinion that Z2 is where you ought to be.

Expansion is a consideration, in that you'd be adding eight drives at a time, but with Z2 the "swap one to a larger disk, resilver" approach is still possible without losing redundancy. Just my thoughts though.
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
The resilver has finally completed; it took 5 days and 3 hours (123 hours) instead of the estimated 8 days and 10 hours (202 hours). I now think it took significantly longer than the first resilver (28 hours) because a full backup job (4+ TB to the pool over SMB) was failing and re-running immediately in a loop, and a SMART long test was running on the replacement drive. Once I stopped those, resilver throughput started climbing from ~2 MB/s to ~95 MB/s.
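In case it helps anyone else hitting the same thing, the resilver rate can be watched with something like:

  # shows scan progress, current rate and estimated time remaining
  zpool status -v volume1
  # per-disk throughput, refreshed every 5 seconds
  zpool iostat -v volume1 5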
 

digity

Contributor
Joined
Apr 24, 2016
Messages
156
This could potentially take a month to replace all 8 SATA drives. But, now that I think about it, I do have a 24-bay ATX server with a 24-bay disk shelf from a recently shut down project (Chia), and all the spare 6 TB SAS drives I want to replace the SATA drives with are already in there. So, while this current pool is still resilvering, is it possible to...:

1) shutdown this TrueNAS server and put the boot SSD and 8 x 6TB pool drives (volume1) in the 24 bay server (I assume it'll resume resilvering upon boot up)
2) create a new 8x 6TB pool with all SAS drives (volume2) mirroring the outgoing volume1 setup (2 vdevs of 4 drives in RAIDZ1... or other suggestion)
3) replicate the volume1 pool/dataset to the volume2 pool/dataset (again, while volume1 likely still resilvering)
4) export the volume1 pool, physically remove volume1's drives, rename the volume2 pool to volume1
5) put the TrueNAS boot SSD and now all SAS based volume1 pool drives back in the mini-ITX chassis

The hope is to drastically speed up the process of replacing all SATA drives. Is the above possible and will it be significantly faster than waiting for the remaining 7 drives to resilver?

Hmmm... removing the 24-bay server from the rack to swap in the current TrueNAS installation's boot drive is looking physically impossible at the moment. Instead, can I...:

1) use the 24 bay server's current Ubuntu installation and install ZFS
2) export/disconnect volume1 pool from TrueNAS installation
3) physically migrate volume1 pool's drives over to 24 bay server, import volume1 pool in Ubuntu installation
4) create the volume2 pool (all SAS drives), and re-create datasets on volume2 mirroring volume1's datasets (there are no zvols, thankfully)
5) copy each dataset's contents using rsync (or is there a way to replicate datasets via the CLI???)
6) physically migrate volume2 pool drives to TrueNAS box, import volume2 pool and finally rename pool to volume1

Is that possible?
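For steps 1 and 3, I'm assuming something along these lines on the Ubuntu box (package name from Ubuntu's repos; I'd export the pool cleanly from TrueNAS first):

  # install ZFS on Ubuntu
  sudo apt install zfsutils-linux
  # import the pool after moving the drives over (add -f only if it wasn't cleanly exported)
  sudo zpool import volume1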
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Some answers:
1) Yes, just make sure the feature set is compatible with the enabled features on your "volume1" pool
2) Shutdown to power off state TrueNAS
3) Yes
4) You don't re-create the datasets, just the pool, with the same feature set as "volume1"
5) Don't use rsync, use ZFS replication: take a recursive snapshot with something like "zfs snapshot -r volume1@snap", then "zfs send -R volume1@snap | zfs receive -F volume2"
This preserves all the datasets, their permissions and such.
6) Yes, you can migrate "volume2"s disks to TrueNAS box, though I would rename it first:
- Export "volume1", the source
- Remove "volume1"s disks
- Export "volume2"
- Re-import "volume2" as "volume1"
- Export the new "volume1"
- Shutdown the Ubuntu server
- Move disks from Ubuntu server to TrueNAS box
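A rough shell sketch of that rename sequence, assuming the pool names above:

  # after the replication to volume2 has finished
  zpool export volume1            # the old SATA pool
  zpool export volume2            # the new SAS pool
  zpool import volume2 volume1    # re-import the new pool under the old name
  zpool export volume1            # export again before moving the disks to the TrueNAS box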

One last note: ZFS on the TrueNAS box will be mildly confused, as it used to know which disks made up the pool. That should not be a problem. Simply re-import the new "volume1".
 