Growing vdev via disk replacement issues?

MikeyG

Patron
Joined
Dec 8, 2017
Messages
442
Are there any issues with expanding vdevs via disk replacement that stem from the original size of the disks in the vdev? In other words, let's say I have a 1TB x 8-disk RAIDZ2, and over time I upgrade those disks to 4TB, then to 8TB, and then one day to 20TB. Does the starting size of the vdev, and how it's initially laid out, cause any problems once it's expanded to 10-20x its original size?
 

JaimieV

Guru
Joined
Oct 12, 2012
Messages
742
ZFS structures scale up to 2^128 bytes. 1TB is 2^40 bytes, and 8TB only takes you to 2^43. There's a lot of headroom there.
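If you want to sanity-check those powers of two, here's a quick sketch (treating 1TB as 2^40 bytes for simplicity; vendor "decimal" terabytes come in slightly under that):

```python
# Quick check of the powers of two above (1TB treated as 2^40 bytes).
zfs_limit = 2 ** 128        # theoretical ZFS size limit, in bytes
one_tb    = 2 ** 40         # 1TB
eight_tb  = 8 * one_tb      # 8TB = 2^3 * 2^40 = 2^43 bytes

print(eight_tb == 2 ** 43)               # True
print(zfs_limit // eight_tb == 2 ** 85)  # True: still 2^85 times below the limit
```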

Layout depends on intended function, but starting with RAIDZ2 is a fair choice, since larger disks mean longer resilver times and therefore more risk. You'll likely want to add some RAM as you go along, but that's about it.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Does the starting size of the vdev, and how it's initially laid out, cause any problems once it's expanded to 10-20x its original size?
There's potentially a risk here when you start talking about that many multiples of size.

When ZFS is given a disk, it carves it up into smaller chunks called "metaslabs" which it uses to track and allocate space; the catch is that it carves them in power-of-two sizes (2^N bytes), much like ashift values, and it tries to get as close to (but not over) 200 metaslabs per vdev as possible.

So for a 1TB disk, you end up with roughly 125 metaslabs of 8GB each.

However, the metaslab size doesn't change when you grow the vdev. So if you go to 10TB, you keep the same 8GB slab size, but you now have roughly 1,250 of them; at 20TB that's about 2,500 per disk.
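Here's a minimal sketch of that sizing logic (my own approximation, not the actual OpenZFS code; the exact counts land a little under the round figures above depending on whether you count in TB or TiB, but the pattern is the same):

```python
# Approximation of the metaslab sizing described above: the slab size is picked
# once at vdev creation, as a power of two that keeps the count at or under
# ~200, and growing the vdev afterwards only adds more slabs of that same size.

TARGET_METASLAB_COUNT = 200   # the "slab target" mentioned above

def pick_metaslab_shift(vdev_bytes):
    """Return the smallest power-of-two shift giving <= 200 metaslabs."""
    shift = 0
    while (vdev_bytes >> shift) > TARGET_METASLAB_COUNT:
        shift += 1
    return shift

creation_size = 1 * 10**12                    # vdev created from 1TB disks
shift = pick_metaslab_shift(creation_size)    # fixed for the life of the vdev
slab = 1 << shift                             # ~8 GiB for a 1TB vdev

for size_tb in (1, 10, 20):
    count = (size_tb * 10**12) // slab
    print(f"{size_tb:>2} TB vdev: {count} metaslabs of {slab // 2**30} GiB")
```

Grow the disks and only the loop's vdev size changes; the slab size chosen at creation sticks, so the metaslab count balloons instead.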

The housekeeping and behavior of ZFS under conditions like this aren't well-documented or well-observed. It's entirely possible that the impact is limited to needing more CPU and RAM to juggle the higher slab count, or it could surface as some weird overflow or other nasty condition. The "slab target" of 200 was put there for some reason (although we should probably try to track down the original coders to see if they can explain exactly why that number was chosen), and going that far off-target might cause odd behavior.

By that point, though, we may be in a situation similar to how ashift=9 performs abysmally on a 4Kn disk, which can only be resolved by creating a new pool.
 

MikeyG

Patron
Joined
Dec 8, 2017
Messages
442
Thanks @HoneyBadger, that was pretty much the answer I was looking for. Googling metaslabs, there does seem to be some concern about space-usage problems in some cases, but I couldn't find anything about performance. At least not anything I could decipher, as a lot of what's written is a bit over my head. Sounds like it's not a big problem to worry about for now.
 