What performance to expect with RAIDZ2 on 12 x 10TB IronWolf Pro NAS drives?

rich1

Dabbler
Joined
Aug 20, 2021
Messages
18
The SSD has been filling up nicely, and zpool iostat says it's reading at 200MB/second - but the network output hasn't changed, as if the SSD reads were wasted.
Does that make any sense?
 

Attachments

  • zpooliostat.2021-10-09 17-55-50.png
  • arcstat.2021-10-09 18-11-11.png

rich1

Dabbler
Joined
Aug 20, 2021
Messages
18
The L2ARC hit ratio eventually reached 90% after I limited the working set to strictly less than the L2ARC size.
The SSD is now being read at 480MB/sec, which is close to its read limit.
I guess if I wanted to saturate the 10gig NIC I would need two of them - is that sound reasoning?
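
For what it's worth, a rough sanity check of that reasoning, using assumed round numbers for the link payload rather than measured figures:

Code:
# Assumed figures: ~1.25 GB/s raw for 10GbE, call it ~1.15 GB/s usable
# payload after protocol overhead; 480 MB/s is the observed SSD read rate.
import math

nic_payload_mb_s = 1150   # assumed usable 10GbE throughput, MB/s
ssd_read_mb_s = 480       # observed read rate of the L2ARC SSD, MB/s

ssds_to_saturate = math.ceil(nic_payload_mb_s / ssd_read_mb_s)
print(f"SSDs needed to saturate 10GbE from L2ARC alone: {ssds_to_saturate}")
# -> 3 at these numbers; two SSDs (~960 MB/s) get close, and the remaining
#    gap can be covered by reads the pool itself still serves.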
 

Attachments

  • arc-hit-ratio.2021-10-09 18-49-02.png

andersla

Cadet
Joined
Oct 2, 2022
Messages
1
Hi @rich1,
Thanks for sharing all the details of your build and experience!
I am just about to build a very similar server to the one you describe here for my research group in Sweden. I will have about 20 users analysing images. I am looking at ca. 12 x 18TB Exos drives plus one 4TB NVMe SSD.
Could you maybe share some experiences from your setup?

What are the most important ZFS parameters to tweak for performance?
How would you lay out the disks? Are you running RAIDZ1, RAIDZ2, or RAIDZ3?
How much RAM do you think I should aim for? 64GB, 128, 256?
Is a 4TB SSD enough for L2ARC? Should I get a bigger one, or smaller?
Is the dual 8-core processor a good choice? More or less?
All the very best, Anders
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
One 4TB NVMe SSD.
How much RAM do you think I should aim for? 64GB, 128, 256?
Is a 4TB SSD enough for L2ARC? Should I get a bigger one, or smaller?

Be aware that if you are intending this as L2ARC, you probably need RAM in the range of 512GB. A ratio of 5:1 L2ARC:RAM is advisable, with ratios of up to 10:1 being possible if the stats show that it is working well. L2ARC is not magic, and data needs to be cached in ARC for some time in order to determine which blocks can most effectively be evicted to the L2ARC. Failing to have sufficient RAM leads to cache thrashing.
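
A quick back-of-the-envelope check of that ratio for the 4TB L2ARC proposed above (illustrative arithmetic only):

Code:
# RAM implied by the 5:1 / 10:1 L2ARC:RAM rules of thumb for a 4 TB L2ARC
l2arc_gb = 4 * 1024

ram_at_5_to_1 = l2arc_gb / 5     # conservative ratio
ram_at_10_to_1 = l2arc_gb / 10   # stretched ratio, only if the stats look good

print(f"5:1  -> {ram_at_5_to_1:.0f} GB RAM")   # ~819 GB
print(f"10:1 -> {ram_at_10_to_1:.0f} GB RAM")  # ~410 GB
# 512 GB lands between the two, hence the recommendation above.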
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
I think that's the first time I've seen concrete ARC/L2ARC ratio recommendations! Thanks!!!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
I think that's the first time I've seen concrete ARC/L2ARC ratio recommendations! Thanks!!!

I assure you that it is anything but concrete. Proper sizing of ARC/L2ARC is an art that is dependent on your workload, desired performance improvement, etc.

The game you're playing is that you're trading off fast things against each other. The L2ARC requires a pointer record to be stored in the ARC that indicates what is stored in the L2ARC. This used to be 180 bytes per record years ago but was reduced to 70 in a rewrite. This doesn't really mean that ZFS can handle "twice the L2ARC", though. The important point that many n00bs (including some portraying themselves as ZFS experts on Reddit, etc.) miss is that the quality of L2ARC caching ZFS is capable of is highly dependent on the workload and the ability of the ARC to properly classify buckets of traffic into MFU and MRU. If you read fifty blocks from the pool but your ARC is so small that most of them are evicted before a second read occurs, you don't get a good idea of which of those blocks would have been good L2ARC candidates, and just a random set of them ends up written out to L2ARC.
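
To put that 70-bytes-per-record figure in perspective, here is a rough estimate of the ARC memory consumed just by L2ARC headers for a 4TB L2ARC; the record sizes are assumptions for illustration:

Code:
# ARC memory eaten by L2ARC pointer records alone, at ~70 bytes each.
def l2arc_header_overhead_gib(l2arc_bytes, recordsize_bytes, header_bytes=70):
    """GiB of ARC consumed by headers for a full L2ARC of uniform records."""
    records = l2arc_bytes / recordsize_bytes
    return records * header_bytes / 2**30

four_tb = 4 * 2**40
print(f"128K records: {l2arc_header_overhead_gib(four_tb, 128 * 1024):.1f} GiB")  # ~2.2 GiB
print(f" 16K records: {l2arc_header_overhead_gib(four_tb,  16 * 1024):.1f} GiB")  # ~17.5 GiB
# Small-block workloads cost noticeably more ARC per TB of L2ARC, and as
# noted above, the header cost is only part of the sizing story.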

What you really need is sufficient ARC to hold a nice percentage of your working set of blocks. And then you can look to see what percentage of that would benefit from L2ARC. Once you can tell the blocks that are read five times from the blocks that are only read twice, your ARC works better and makes better classification choices.

So people get all bent out of shape about the ARC:L2ARC ratio here, but from my perspective, the interesting ratio is that of the working set (the blocks you access semi-frequently) to the size of the ARC and L2ARC. This is too complex for most people to wrap their heads around, especially if new to ZFS. That's the value of some rule of thumb guidance.

I've often been criticized for my strong recommendations to avoid L2ARC below 64GB of RAM, and certainly there are going to be situations where the working set is small enough that an L2ARC could be useful down there at 32GB or even 16-24GB. But that's not the usual situation. Usually by the time you get up to 64GB of RAM, unless you have a massive pool with massive working set size, you're likely to be seeing at least some traffic in the ARC as reasonably good candidates for L2ARC eviction. Once you get to this bootstrap point, you then have some helpful numbers to guide you beyond that.

Once you've got that handle on things, it becomes much easier to set goals. For some time, I had an iSCSI target host on which I wanted most reads to be satisfied from L2ARC. It's kinda freaky to sit there in "zpool iostat 1" and see no pool reads for long stretches, and just a steady stream of write traffic. The L2ARC on that system was "probably" too large, but it offered maximum performance for both reads and writes. That was 256GB RAM with 1TB L2ARC on a 14TB (~7TB usable) pool, just to give you an idea.
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,909
On a somewhat related note, the situation is comparable to sizing and performance tuning in general. You need to know your workload, whether it is about storage (see above) or other things like business transaction processing. This is where at least 80% of people/organizations already fail in my experience (which is not primarily about storage but the aforementioned transactional systems). There is typically knowledge only about the number of transactions per 24 hour window. But that is meaningless, unless you know for certain that the load is evenly distributed (which it never is). And what about end-of-year business in retail? What about weekends vs. workdays, etc.?

The other, less obvious, challenge is the ability to reproduce the load. It is not trivial to simulate 200 clients accessing a NAS with random I/O at a certain read/write ratio. SMB is perhaps(?) easier to achieve than iSCSI or NFS, but that is pure speculation on my part. The bottom line is that even with basically unlimited hardware it is not a simple undertaking. I am not sure how it is today, but 15 to 20 years ago the big hardware vendors basically rented out equipment in publicly accessible performance testing labs. So you could rent a bunch of UNIX boxes that cost e.g. 1 million USD and test your application for bottlenecks that would only show up on such gear.
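
As a minimal sketch of what such a load generator could look like (the path, client count, and sizes below are made-up placeholders; a real test would more likely use a purpose-built tool such as fio):

Code:
import os
import random
import threading

TARGET = "/mnt/nas_share/testfile.bin"   # hypothetical mounted NAS share
FILE_SIZE = 1 * 2**30                    # 1 GiB test file
BLOCK_SIZE = 64 * 1024                   # 64 KiB per I/O
CLIENTS = 20                             # concurrent "users"
OPS_PER_CLIENT = 500
READ_RATIO = 0.7                         # 70% reads, 30% writes

# Create the test file once if it is missing or too small.
if not os.path.exists(TARGET) or os.path.getsize(TARGET) < FILE_SIZE:
    with open(TARGET, "wb") as f:
        f.truncate(FILE_SIZE)

def client() -> None:
    """One simulated client issuing random-offset reads and writes."""
    buf = os.urandom(BLOCK_SIZE)
    with open(TARGET, "r+b") as f:
        for _ in range(OPS_PER_CLIENT):
            f.seek(random.randrange(0, FILE_SIZE - BLOCK_SIZE))
            if random.random() < READ_RATIO:
                f.read(BLOCK_SIZE)
            else:
                f.write(buf)

threads = [threading.Thread(target=client) for _ in range(CLIENTS)]
for t in threads:
    t.start()
for t in threads:
    t.join()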

I realize this is getting a bit off-topic. But I wanted to underscore the importance of @jgreco's point that knowing the workload is absolutely key.
 