Replacing 12TB drives with 20TB drives - Resilver impossibly long

pchangover

Cadet
Joined
Nov 29, 2022
Messages
5
My TrueNAS box is set up on an old IBM Cloud Object Storage Slicestor 3448 with a Xeon E5-2603 v3, 128GB ECC, dual SAS9300-8i controllers, and a single RAIDZ2 vdev pool with 12 drives. Currently there are 10 WD EasyStore white-label 12TB drives, one 18TB white-label drive that I had to use to replace a bad 12TB drive, and the single 20TB WD Red Pro drive I just installed.

When I put the new drive in on Monday morning I didn't expect it to take so long, and apparently I have two more days to go! My plan to upgrade each drive individually simply won't work with this time constraint. Any ideas what's going on? Part of me feels like I should go ahead and build my replacement NAS now, put the 20TB drives in there, and just move the data over, but this all feels broken to me.

[attached screenshot of the resilver progress]
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
I'm currently replacing all the 4TB drives in my existing pool, and it's taking 12 to 14 hours per device. I only have the one vdev, so I'm proceeding one device at a time. I do notice that the speed seems to vary with the amount of free memory, drive RPM, and interface speed. Two of the devices I replaced were 7200 RPM on a 6Gbps SAS controller, and they went quicker than the 5900 RPM device on a motherboard SATA II port I'm currently working on. But the biggest factor I noticed was idle time. If I keep VM workloads and my ZoneMinder feed going, it seems to distract the pool and slow the process. I idled the VMs and switched ZM to modect only, and saw some improvement.

But consider, my current best rate is ~4TB in just shy of 12 hours. So 20TB... 5 x 12... 60 hours per device would match what I'm currently seeing. Not that I wouldn't like it to go faster... :confused:
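To put numbers on that, a back-of-the-envelope scaling estimate (just a sketch in Python; the 4TB-in-12-hours figure is the rate I'm seeing on my own pool, not a measurement of yours):

```python
# Scale an observed resilver rate up to a bigger drive, assuming the rate holds.
observed_tb = 4        # TB resilvered on my pool
observed_hours = 12    # time that took
new_drive_tb = 20      # size of the drive being replaced

rate = observed_tb / observed_hours            # ~0.33 TB/hour
print(f"~{new_drive_tb / rate:.0f} hours")     # -> ~60 hours
```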
 
Joined
Jan 7, 2015
Messages
1,155
Yep, those disks will take a long time, and you're also correct: when doing a resilver I shut down all services so it finishes as quickly as possible. The more disk activity, the slower it goes. If any of the disks are SMR it will also be very slow.
 

pchangover

Cadet
Joined
Nov 29, 2022
Messages
5
Thanks for the replies. No SMR disks in my pool, and activity is basically non-existent since nobody has used Plex in that time frame. I'm currently at 71% and still have 28 hours remaining. Absolutely wild! When I resilvered an 18TB drive the other week it only took around 28 hours total. If this trend holds (28 hours for the remaining 29% works out to roughly 96 hours) it will take almost 100 hours for a single disk, which is just not sustainable for swapping out all 12 like this. Since I have room, I'm going to undo this and make a second pool with the new disks and move the data over. Makes me sad that this probably won't work for me in the future.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
For each drive you replace, the whole VDEV has to be read and the new content generated and written to the new drive.

It's a slow and reliable process.

Replicating data to a new NAS is also slow. However, it's probably faster than resilvering 10 separate drives.
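As a loose illustration of why every surviving drive gets read during a rebuild (a minimal single-parity XOR sketch; ZFS's RAIDZ2 math uses two independent parity columns and is more involved, so treat this as an analogy only):

```python
import os

STRIPE = 4096  # bytes per drive per stripe (illustrative)

# Three "data drives" plus one XOR parity column.
drives = [os.urandom(STRIPE) for _ in range(3)]
parity = bytes(a ^ b ^ c for a, b, c in zip(*drives))

# "Lose" drive 1, then rebuild it from the survivors plus parity:
survivors = [drives[0], drives[2], parity]
rebuilt = bytes(a ^ b ^ c for a, b, c in zip(*survivors))

assert rebuilt == drives[1]  # every surviving column had to be read to recover the lost one
```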
 

MisterE2002

Patron
Joined
Sep 5, 2015
Messages
211
Do you have the latest SCALE/CORE and upgraded the pool to the latest version? In later versions the ZFS team improved resilvering speed.
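One way to check without changing anything: `zpool upgrade` with no arguments only lists pools that still have un-enabled feature flags (a quick sketch, assuming it runs on the NAS itself):

```python
import subprocess

# "zpool upgrade" with no arguments is read-only: it lists pools that do not
# have every supported feature flag enabled, and changes nothing.
result = subprocess.run(["zpool", "upgrade"], capture_output=True, text=True)
print(result.stdout)
```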
 

MisterE2002

Patron
Joined
Sep 5, 2015
Messages
211
Just to be picky: is it?
I was under the impression that only used space is read and rewritten.
Yes, that is (I guess) why you can instantly create a pool without resilvering all the (non-existent) data. And if a disk drops out of the pool and is re-inserted, I've noticed it can smartly detect the missing parts. In the past I used Linux mdadm and that needed to rebuild the complete empty disk.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
yes, it has to read only the data on the vdev...
 

pchangover

Cadet
Joined
Nov 29, 2022
Messages
5
Do you have the latest SCALE/CORE and upgraded the pool to the latest version? In later versions the ZFS team improved resilvering speed.

I'm on TrueNAS-13.0-U3.1 CORE and I believe my pool is updated. The resilver finished after almost 5 days lmao. Now the question is do I see if it's just as bad for the next drive?
 

pchangover

Cadet
Joined
Nov 29, 2022
Messages
5
Update - offlined one of the old disks, replaced it with a new 20TB, and I'm watching the resilver estimate climb. It's up to an estimated 10 days, 7 hours right now lmao. I think I'm going to cancel this and build a second pool instead. This is impossible.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
Your 12 drive wide vdev is also part of the problem. Users in the past have reported extremely long resilver times with such a wide vdev and that was before we had the huge drives available that we do now.
 

pchangover

Cadet
Joined
Nov 29, 2022
Messages
5
Your 12 drive wide vdev is also part of the problem. Users in the past have reported extremely long resilver times with such a wide vdev and that was before we had the huge drives available that we do now.

Yea, you can blame me being a novice for that one. Would you recommend a 2x6 vdev setup with RAIDZ2?
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
Yea, you can blame me being a novice for that one. Would you recommend a 2x6 vdev setup with RAIDZ2?
It would be better than a single 12 wide vdev. What you do is ultimately up to you.
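Roughly, the tradeoff looks like this (sketch arithmetic, assuming a dozen identical drives):

```python
drives, parity = 12, 2

single_vdev_data = drives - parity           # 12-wide RAIDZ2: 10 drives' worth of data
two_vdev_data = 2 * (drives // 2 - parity)   # 2x 6-wide RAIDZ2: 8 drives' worth of data

print(single_vdev_data, two_vdev_data)  # 10 vs 8
# With two vdevs, a resilver only reads the 6-drive vdev that lost a disk,
# not the whole 12-drive pool, so rebuilds are smaller and faster.
```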
 