Benchmarking ZFS performance on TrueNAS

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
I think if you're really full (only a few bytes left) it's not even possible to add a VDEV, since all pool members need to know about new VDEVs (requiring writes for that to happen).
Yeah, it was just theoretical. I mean, if you're at that point it's obvious you have a problem, and that's fine ;) As long as you don't lose the data, which is what my interpretation was.

I'm not sure that I see how using a fusion pool saves you (although I do agree that you would typically have a much larger metadata VDEV than required which wouldn't necessarily have other data being written to it... I'm not 100% sure that data won't overflow from the data VDEVs to the metadata one when there's not enough space...
As I understood it (from a quick scan of the topic, so not an in-depth read), a Fusion pool is a vdev specifically for metadata. I would assume and hope no actual data EVER ends up there.

metadata will certainly overflow to the data VDEVs if required, but I'm not clear if there's some kind of prevention mechanism for it going the other way or if it's just "any port in a storm" coded in).
The vice versa can and probably should be true (although I can imagine having metadata spread out like that would cause its own set of issues); in any case, your Fusion pool should always be larger than the total amount of metadata that will be stored on it. I would also expect alarm bells to start ringing when a Fusion pool starts filling up. If it's at 95%, Thor should be bombarding you with lightning to motivate you to resolve the issue ASAP!
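Something like this little watchdog is what I have in mind. A minimal sketch, assuming a pool named tank (hypothetical) and parsing the output of `zpool list -v`, which is not a stable interface, so the parsing may need adjusting for your ZFS version:

```python
#!/usr/bin/env python3
"""Rough sketch of that alarm bell: warn when a pool's special
(metadata) vdev fills past a threshold. Pool name is hypothetical and
the `zpool list -v` parsing is fragile by nature; verify before use."""
import subprocess

POOL = "tank"      # hypothetical pool name
THRESHOLD = 95.0   # percent full that triggers the lightning

def special_vdev_pct(pool: str) -> float | None:
    out = subprocess.run(
        ["zpool", "list", "-v", "-H", "-p", pool],
        capture_output=True, text=True, check=True,
    ).stdout
    in_special = False
    for line in out.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "special":   # allocation-class group header
            in_special = True
            continue
        if in_special:
            # first member vdev after the 'special' header;
            # with -p, SIZE and ALLOC are exact byte counts
            size, alloc = float(fields[1]), float(fields[2])
            return 100.0 * alloc / size
    return None

pct = special_vdev_pct(POOL)
if pct is None:
    print(f"pool {POOL!r} has no special vdev")
elif pct >= THRESHOLD:
    print(f"THOR SAYS: special vdev at {pct:.1f}%, fix it ASAP!")
else:
    print(f"special vdev at {pct:.1f}%, all quiet")
```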
 

nemesis1782

Contributor
Joined
Mar 2, 2021
Messages
105
@HoneyBadger No worries, your posts are great! No feeling of condescension, and I think you sell yourself short calling it rambling ;)

The "archival workflow" is fairly well understood and behaves nicely in ZFS. Dumping big files in, and deleting them rarely or never, tends to work great. Even when you're filling the pool up close to the maximum ("up to but not exactly 100%") capacity, the fill pattern still leaves you with a lot of contiguous free space, and deleting files in large chunks results in a large amount of space being freed at once.
Benchmarking this workflow is easy, but the results will probably already be well understood.
That is actually a large part of my data: video I collect but rarely delete. I know I should see a psychiatrist for that, but my current solution is cheaper :P

The benchmarking is mostly because no one is actually able to give me any proper numbers on performance. Also just for fun, of course :P

As soon as you go to smaller granularity, or even worse start doing "update-in-place" of files or block devices, the nice contiguous free space gets covered with a finely chopped mix of those in-place writes, ruining the ability to sequentially read from the underlying vdevs. If you write a 1GB file and then start updating random 1M blocks in the middle of it, you'll end up with things out of order and have to seek around to read. Do that with ten 1GB files and it gets even worse. Do it with 100 50G .vmdk's worth of data on block storage, and you've basically asked your drives to deliver I/O at random across a 5T span of data.
Yes, that sounds about right.
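To actually produce that layout on demand for a benchmark, a pre-fragmentation pass along these lines could work. A rough sketch, with a hypothetical dataset path and the 1GB/1M sizes taken from the quote:

```python
#!/usr/bin/env python3
"""Sketch: recreate the 'write 1GB, then rewrite random 1M blocks in
the middle' pattern from the quote, to pre-fragment a test file before
measuring read speed. The dataset path is hypothetical."""
import os
import random

PATH = "/mnt/tank/bench/testfile"  # hypothetical dataset path
FILE_SIZE = 1 << 30                # 1 GiB laid down sequentially
BLOCK = 1 << 20                    # 1 MiB in-place updates
UPDATES = 256                      # number of random rewrites

# 1. Write the file sequentially: the well-behaved 'archival' case.
with open(PATH, "wb") as f:
    for _ in range(FILE_SIZE // BLOCK):
        f.write(os.urandom(BLOCK))

# 2. Rewrite random 1 MiB blocks in place. ZFS is copy-on-write, so
#    every rewrite lands somewhere new on disk and logical order
#    drifts away from physical order, exactly as described above.
with open(PATH, "r+b") as f:
    for _ in range(UPDATES):
        f.seek(random.randrange(FILE_SIZE // BLOCK) * BLOCK)
        f.write(os.urandom(BLOCK))
        f.flush()
        os.fsync(f.fileno())
```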

Now, this "steady state" can absolutely be benchmarked, but the question is "what is the value of that benchmark?"
Hmm, good question; there are a number of reasons. First, it's something I haven't done before, so it's interesting. Second, I have seen many misleading posts about performance in these cases (not intentionally misleading, of course): they either use a bad testing methodology or test on a clean zPool. Others complain about poor performance, which is often down to misconfiguration, a full zPool, an old zPool with loads of fragmentation, or just plain stupidity.

Now, with a synthetic benchmark I could simulate my use cases, which would allow me to (see the sketch after this list):
- determine the optimum setup
- determine a better metric than "50% or 80% is the max you should fill it" (advice that honestly gives me the shivers)
- plot performance degradation against fill rate and fragmentation
- create predictions of the performance penalty for a given use case, based on fill rate and fragmentation
- possibly provide a tool for interested parties in the community (if there are any; it's always good to give something back)
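As a first cut, the measurement loop could look something like this. A sketch assuming fio is installed and a scratch dataset mounted at /mnt/tank/bench (hypothetical); fio's JSON field names may differ between versions, so verify before trusting the numbers:

```python
#!/usr/bin/env python3
"""Sketch of the measurement loop: fill the pool in steps, run a small
fio job at each step, and record throughput."""
import json
import os
import shutil
import subprocess

BENCH_DIR = "/mnt/tank/bench"          # hypothetical scratch dataset
FILL_STEPS = [0.50, 0.80, 0.90, 0.95]  # target fill fractions
CHUNK = 256 * (1 << 20)                # grow the filler 256 MiB at a time

def fill_fraction(path: str) -> float:
    # statvfs on one dataset only approximates pool-wide fill,
    # but it's close enough for a sketch.
    usage = shutil.disk_usage(path)
    return 1.0 - usage.free / usage.total

def fill_to(target: float) -> None:
    # urandom data is incompressible, so compression can't cheat the fill.
    with open(os.path.join(BENCH_DIR, "filler"), "ab") as f:
        while fill_fraction(BENCH_DIR) < target:
            f.write(os.urandom(CHUNK))
            f.flush()
            os.fsync(f.fileno())

def randread_mbps() -> float:
    # Note: the ARC will cache reads; use a --size well beyond RAM if
    # you want to measure the disks rather than the cache.
    out = subprocess.run(
        ["fio", "--name=bench", "--rw=randread", "--bs=1M",
         "--size=4G", f"--directory={BENCH_DIR}",
         "--runtime=60", "--time_based", "--output-format=json"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)["jobs"][0]["read"]["bw"] / 1024.0  # KiB/s -> MiB/s

for step in FILL_STEPS:
    fill_to(step)
    print(f"{step:.0%} full: {randread_mbps():.1f} MiB/s random read")
```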

It will definitely tell us something we already know, that being "spinning disks suck at random I/O." But the question I would ask is "do you really need to hit that 5T span of data at performance-level-X, or do you realistically only need to hit 500G of it that fast?" Because that's where something like a huge ARC (with compression) and L2ARC devices start to come into play. With more hits to your RAM and SSD, your spindles suddenly have more free time to deliver the I/O requests that miss the cache. Maybe it's a VM datastore or NFS export: you're backing your VMs up nightly or weekly, you will hit all of that 5T span, and you don't care too much that it takes a while as long as it finishes inside your backup window; it just can't tank the rest of your running VM performance.
Agreed. I am going to need to look at what is best for my situation. I think I will have at least 4 zPools at this point:
- Archive-like storage: movies/series/binaries/audio
--> Read is fairly important, but realistically doesn't matter much beyond wanting it to be fast for that once-in-a-blue-moon export to somewhere else. It should at least be able to support 3x 4K compressed streams with decent-quality audio, which should be peanuts in most configs.
--> Write is not really important, except when I want to import a large library from somewhere else. So the only reason is that high write speeds would make me feel good :P I mean, what else will I brag about to my friends ;)
- Document and photo storage; data protection is key here (this is backed up, but I prefer not to have the hassle of needing the backup!)
--> Mostly fairly small files; no real need for high speed other than bragging rights and the luxury of things going fast.
- Download disk
--> Fairly high performance required; however, deleting and recreating the pool every few months is not an issue, and I would probably be able to automate this (a rough sketch of that automation follows this list).
- Block storage for VMs and assorted purposes
--> Not really clear on this yet, but it will probably be the last thing I add.
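For the download disk, the periodic reset could be as dumb as this. A DESTRUCTIVE sketch with a hypothetical pool name and device paths, so treat it as pseudocode until you've checked every value:

```python
#!/usr/bin/env python3
"""Sketch of the periodic reset for a scratch/download pool.
DESTRUCTIVE: this wipes the pool and everything on it. Pool name and
device paths are hypothetical; check them twice before running."""
import subprocess

POOL = "scratch"                                  # hypothetical pool
VDEV_SPEC = ["mirror", "/dev/ada4", "/dev/ada5"]  # hypothetical devices

def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Tear the fragmented pool down and lay it out fresh; a newly created
# pool starts with fully contiguous free space, which is the point.
run(["zpool", "destroy", POOL])
run(["zpool", "create", POOL] + VDEV_SPEC)
```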

For my workstations/desktops I will actually just add another SSD to the PC if I need performance, since that is probably cheaper.

The only true benchmark is you (or someone with the same workflow) actually using the storage. You can definitely make observations and extrapolations from someone else's experience, but it's difficult to try to "boil it down" to just a single number, graph, or report sheet. Bandwidth, latency, IOPS, the size of the working set, all of this will have to be taken into account. But at the same time it's important to have objective metrics, because what's "fast enough" for one person might be "intolerable" for another.
Agreed. So my long-term goal is to build a data lake that this data from many users feeds into. With that you can start mining the data for commonalities and give numbers that aren't in the realm of "fast enough" but concrete, e.g.:
- zPool: 2x2TB mirror vdev
-> average speed: 100 MB/s read/write
-> 80% filled, 10% fragmentation: 90 MB/s read/write average
-> 80% filled, 90% fragmentation: 5 MB/s read/write average

This way users get actual numbers on which to base their choices.
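The datapoints themselves could be as simple as this record shape. A sketch where the field names are my own invention and the values are just the example figures above:

```python
#!/usr/bin/env python3
"""Sketch of the datapoint such a data lake could collect, so numbers
stay comparable across submissions."""
from dataclasses import dataclass, asdict
import json

@dataclass
class BenchRecord:
    layout: str        # e.g. "2x2TB mirror vdev"
    fill_pct: float    # pool capacity used, percent
    frag_pct: float    # zpool list FRAG column, percent
    read_mbps: float   # average read throughput, MB/s
    write_mbps: float  # average write throughput, MB/s

records = [
    BenchRecord("2x2TB mirror vdev",  0.0,  0.0, 100.0, 100.0),
    BenchRecord("2x2TB mirror vdev", 80.0, 10.0,  90.0,  90.0),
    BenchRecord("2x2TB mirror vdev", 80.0, 90.0,   5.0,   5.0),
]
print(json.dumps([asdict(r) for r in records], indent=2))
```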

I'll see if I can manage to get something more coherent into text to help you out with some workflow and benchmark ideas, but I'd suggest checking the ground already trod by others with tools like VDbench, HCIbench, or diskspd for simulating "real world" setups in a scalable and programmatic scenario.
Nice, I was sure I wasn't the first one to think of this. I will look into those!

Thnx!
 

Sawtaytoes

Patron
Joined
Jul 9, 2022
Messages
221
I filled multiple zpools up to 100% and was fine. They were filled with snapshots + data. I had separate stripe-only zpools to match the exact same stripes on Windows drive arrays.

As the Windows ones got close to full, the zpools were at or close to 100% capacity because of snapshots.

All pools worked fine at and after 100% usage. I was storing all kinds of data, large (videos) and small (JS code projects).

TrueNAS complains about over 80% usage, but that doesn't matter. Nothing says your zpool will die.

Was performance degraded? Not sure. Everything worked as it had. I only noticed they filled up because I got an email. My backup tasks ran in the background, so I wasn't able to monitor them for speed changes.
 