Configuration and performance testing

Kosta (Contributor · Joined May 9, 2013 · Messages: 106)
Well, yes, I am aware that the system is less fault tolerant, since each VDEV only survives a single drive failure, unlike the RAIDZ2 layout I tested beforehand. However, two things:
- all (important) data on TrueNAS will be backed up somewhere else, most likely to my Synology (an 8-bay NAS with SHR-2)
- by calculated probability, the chance of a single two-disk VDEV failing is much smaller than the chance of two drives failing in a 10-disk configuration - and even if it happened, it would just be inconvenient: I would have to use the spare, rebuild, copy the data back, etc. In return I get higher IOPS, throughput and flexibility
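That "calculated chance" can be sketched with a quick binomial computation. This is a minimal sketch under the assumption of independent drive failures; the 1% per-drive failure probability is purely illustrative:

```python
from math import comb

def p_mirror_vdev_dies(p: float) -> float:
    """Probability that both disks of one specific 2-way mirror fail."""
    return p * p

def p_two_or_more_of_ten(p: float) -> float:
    """Probability that at least 2 out of 10 drives fail."""
    return sum(comb(10, k) * p**k * (1 - p)**(10 - k) for k in range(2, 11))

p = 0.01  # illustrative per-drive failure probability
print(f"one specific mirror VDEV lost: {p_mirror_vdev_dies(p):.2e}")
print(f"two or more of 10 drives lost: {p_two_or_more_of_ten(p):.2e}")
```

With p = 0.01 the single mirror VDEV comes out around 1e-4 versus roughly 4.3e-3 for two-of-ten; even accounting for the fact that any of the five mirrors could be the one to die (about five times the first number), the comparison in this framing still favours the mirror layout at small p.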

A 3-way mirror would simply be too inefficient, and a middle way could be 3-wide RAIDZ1 VDEVs, though I'm not sure about the performance there. The capacity increase is, well, meh: 8.8 TB to 10.6 TB.

Kosta
But one thing I don't understand: I can't pinpoint why my writes to the pool are sometimes slower and sometimes faster. I am copying the same large video files from an internal NVMe drive to the pool, and sometimes I see 400 MB/s, sometimes only 200 MB/s, with no configuration change - I just delete the files on the pool and copy them again. It happens with every pool configuration: 10-disk RAIDZ2, five 2-way mirrors, 3-wide RAIDZ1...

Any ideas?

Davvo (MVP · Joined Jul 12, 2022 · Messages: 3,222)
I'm not entirely sure about the VDEV layouts you are describing, but I assume we are comparing a pool made of five 2-way mirror VDEVs with one made of a single 10-disk RAIDZ2 VDEV.
The reliability curve is as follows:
[Screenshot_1.png]
Doubling that:
[Screenshot_2.png]
In both graphs, the individual drive failure probability is on the x axis and the pool failure probability is on the y axis.
If we consider actual numbers (the curves' purpose is to show the different behaviour), the difference appears negligible, but I have to check my calculations. I will explore this later, especially considering UREs.
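The shape of those curves can be reproduced with a short script. This is a sketch under the usual independence assumption: the mirror pool dies as soon as both disks of any one mirror die, while the 10-disk RAIDZ2 survives any two failures:

```python
from math import comb

def pool_fail_mirrors(p: float, vdevs: int = 5) -> float:
    """Pool of 2-way mirrors fails if any single mirror loses both disks."""
    return 1 - (1 - p * p) ** vdevs

def pool_fail_raidz2(p: float, disks: int = 10) -> float:
    """A 10-disk RAIDZ2 VDEV fails once 3 or more drives fail."""
    return sum(comb(disks, k) * p**k * (1 - p)**(disks - k)
               for k in range(3, disks + 1))

for p in (0.01, 0.05, 0.10):
    print(f"p={p:.2f}  mirrors={pool_fail_mirrors(p):.2e}  "
          f"raidz2={pool_fail_raidz2(p):.2e}")
```

At small p the RAIDZ2 pool is the more reliable of the two: it takes three failures anywhere to kill it, versus two failures landing in the same mirror for the mirror pool.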

> Any ideas?
Did you exclude caching?

Kosta
OK, so according to your graphs, the probability that the whole pool fails is highest when using more VDEVs as 2-way mirrors?

> Did you exclude caching?
Actually, I did not previously, but I am doing it again. Just to check: disable LZ4 and set primarycache for both the VDEV and the LUN to metadata? Or would none be correct?
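For reference, this is roughly how those settings could look from the shell. A minimal sketch where the dataset name tank/bench is a placeholder for whatever you are testing on; primarycache=metadata keeps metadata in ARC but skips file data, while primarycache=none disables ARC caching for the dataset entirely:

```shell
# Placeholder dataset name - adjust to your own pool/dataset or zvol.
zfs set compression=off tank/bench        # disable LZ4 for the benchmark
zfs set primarycache=metadata tank/bench  # ARC caches metadata only, not file data
# Stricter alternative: no ARC caching at all for this dataset:
# zfs set primarycache=none tank/bench
zfs get compression,primarycache tank/bench   # verify the settings
```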

Kosta
Baah, I don't get it...
I just made a fully striped pool: 10 single-disk VDEVs. Caches are disabled as mentioned above, and RAM is 8 GB. I copy the first file, 25 GB: 600 MB/s, right up to the end. I cancel shortly before the end, copy it again, and bam: 210 MB/s, and it doesn't even think about going back up. With the second file it's a mixed bag:
[1688820567303.png]

Davvo
Try rebooting the machine between each test.
Also, try the following resource.

Kosta
OK, I think I found the reason behind the fast and slow transfers.
For instance, if I copy 100 GB of data, it goes fast. If I then simply copy the same data into another folder, without deleting the first, it also goes fast.
However, if I delete the 1st or 2nd folder (just delete all the copied data) and copy the same data again, the transfer rate drops a lot: from about 350 MB/s to 150 MB/s.
I then copied the same data again (into the 1st and 2nd folders), which went slowly, then copied the same data a 3rd time into a 3rd folder, and the speed picked up again.
So apparently it is something in the ZFS file system, possibly some cleanup task? I think I have to be patient. I'm also trying to understand how iostat works; currently I'm seeing this without any copy task running:
[1688883321670.png]


If I understand correctly, the load is currently 41.4%, meaning ZFS is doing something...?
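To check whether ZFS really is doing something in the background, it can help to watch per-VDEV activity over an interval rather than a single snapshot. A sketch, assuming the pool is named tank:

```shell
zpool iostat -v tank 5   # per-VDEV operations and bandwidth, refreshed every 5 seconds
zpool status tank        # also shows whether a scrub or resilver is running
```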
Btw, I also tried 64 GB RAM, and while I can see the write cache filling up (copying the same file again yields much higher speeds than the disks can actually write), sooner or later the speed drops to 350 MB/s.

In any case, I am now on a 5+5 disk RAIDZ2 layout (two 5-disk RAIDZ2 VDEVs), which seems a good middle ground for everything. I can expand up to 20 disks, which is most likely the end stage, and performance is OK. In my opinion, two-drive resilience per VDEV is better than single-redundancy mirrors.
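For reference, such a 5+5 layout is just a pool striped over two RAIDZ2 VDEVs. A sketch with hypothetical device names (TrueNAS builds this from the GUI, so this only illustrates the resulting structure):

```shell
# Two 5-disk RAIDZ2 VDEVs, striped together (device names are placeholders)
zpool create tank \
  raidz2 da0 da1 da2 da3 da4 \
  raidz2 da5 da6 da7 da8 da9
# Later expansion towards 20 disks: add further 5-disk RAIDZ2 VDEVs, e.g.
# zpool add tank raidz2 da10 da11 da12 da13 da14
```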

And finally, I am switching to SCALE. I like the interface and overviews much better, and the performance seems the same.