Performance Tuning Help

NickF · Guru · Joined Jun 12, 2014 · Messages: 763
Hi Everyone,

I've been experimenting with my home TrueNAS rig for the past several weeks, mostly out of curiosity as I begin planning deployments for business use cases in my professional life.

The TL;DR: am I doing something wrong? I am seeing little difference in performance between the old Supermicro server and my new Dell server, and little difference in performance between my two pools.

I know I'm not using mirrors, and this is not a performance-oriented build; it is a storage-focused one. However, I am not seeing any benefit from all of the additional RAM, L2ARC, and special devices?

I have a Dell R720 running TrueNAS. It has:
  • 2x Intel Xeon E5-2620
  • 192GB of ECC Registered DDR3-1333
  • Dell Perc H710 flashed to IT mode
  • LSI 9207-8e
  • LSI 9205-8i
  • Chelsio T520-TO-CR card (LACP LAG to a Brocade VDX 6740)
Disks are set up in the following way:

[screenshot: disk layout]

  • 12x 4TB Western Digital Red CMR drives in an HP D2600 shelf connected to the LSI 9207-8e
    • RAIDZ2, pool "prod"
    • 2x Samsung SM953 480GB set up as a special VDEV (1MB small-block cutoff)
    • 2x Samsung 850 EVO 500GB drives as L2ARC (connected to the LSI 9205)
      [screenshot: "prod" pool layout]
  • 6x 10TB shucked Western Digital Easystore drives
    • RAIDZ1, pool "backup"
    • Connected to the PERC H710 (IT mode)
      [screenshot: "backup" pool layout]
The system runs a replication task for each ZVOL and Dataset from "prod" to "backup". There is one additional Dataset on "backup" because "prod" does not have enough space for that data.
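For reference, this is roughly what each of those replication tasks boils down to under the hood; a minimal sketch only, with made-up dataset and snapshot names (the real tasks are configured through the TrueNAS replication UI):

zfs snapshot -r prod/vmstore@auto-2021-07-08                                       # recursive snapshot of the source
zfs send -R prod/vmstore@auto-2021-07-08 | zfs receive -F backup/vmstore           # initial full copy onto the "backup" pool
zfs send -R -i @auto-2021-07-07 prod/vmstore@auto-2021-07-08 | zfs receive -F backup/vmstore   # later runs send only the incremental changes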

Anyway, I built this system to replace a Supermicro server with an X10SLL-F, an E3-1265L v3, and 32GB of RAM. Prior to the upgrade there was no flash in the system and it had significantly less memory. SMB shares appear to be no more or less performant than before. iSCSI performance is marginally faster (sync=off).

Additionally, the SMB shares on the "prod" pool (top) are no faster than the "backup" pool (bottom). This is coming from my 10-gigabit (Intel X540-T1) NIC:
[screenshot: SMB transfer speeds, "prod" (top) and "backup" (bottom)]


ATTO performance is similar. There really is no discernible difference in performance between the two pools. I get that this is not apples to apples, but really, I was expecting more from the "prod" pool. In some cases it's slower than the "backup" pool, and faster in others.
[screenshots: ATTO results for "prod" and "backup"]


Additionally, L2ARC hit rate is abysmal. I cannot explain the gap at noon.
[screenshots: L2ARC hit rate graphs]
 

sretalla · Powered by Neutrality · Moderator · Joined Jan 1, 2016 · Messages: 9,703
L2ARC hit rate is abysmal
Which may indicate that your ARC is already sufficient; actually, losing the L2ARC might make ARC a little better, so I would do that if I were you.

At 32GB RAM, you were probably way too low on ARC to be thinking about gaining efficiency with such a large L2ARC in the old server.

Depending on your data, you might consider using those SSDs as a metadata VDEV instead (if you have large numbers of small files). Be careful: you already have a special VDEV as a two-way mirror attached to a RAIDZ2 pool, so you would lose the pool if both SSDs were lost, which is not in line with the RAIDZ2 data VDEV that tolerates two lost disks.
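If you ever want the special VDEV to match the pool's two-disk fault tolerance, one option is to attach a third SSD to the existing special mirror so it becomes a 3-way mirror. A rough sketch, with hypothetical device names:

zpool status prod                                                  # identify the two devices under the "special" mirror
zpool attach prod gptid/existing-special-ssd gptid/new-third-ssd   # attach a third SSD to that mirror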

Those performance tools you're using aren't really measuring anything of value other than single client performance.

You shouldn't be expecting anything good in terms of IOPS from a RAIDZ pool, as you're looking at the sustained IOPS of a single HDD only (maximum 200) per VDEV/pool. For example, your 12 disks in a single RAIDZ2 VDEV deliver roughly 200 sustained random IOPS, while the same 12 disks as six 2-way mirrors would deliver roughly 6 x 200 = 1,200.

You can see ATTO is getting "fake" IOPS numbers from writes to RAM for the smaller tests.

From what I can see from the tools (relatively poor tests of real performance that they are), you're not doing anything wrong other than expecting more IOPS and throughput than will come from a pool that's RAIDZ.

To see the performance boosts from what you did, you need to be re-reading the same content over and over or doing a lot of async writes.
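As a rough illustration, a repeated-read workload like the following fio run is the kind of thing where ARC/L2ARC get a chance to show up in the numbers. This is only a sketch: it assumes fio is available, that /mnt/prod/fiotest is a dataset on the "prod" pool, and that the sizes are placeholders (the idea being a working set bigger than RAM but smaller than RAM plus L2ARC):

fio --name=reread --directory=/mnt/prod/fiotest --rw=randread \
    --bs=128k --size=64G --numjobs=4 --iodepth=8 --ioengine=posixaio \
    --time_based --runtime=600 --group_reporting

With --time_based the jobs keep re-reading the same files for the whole runtime, so later passes should come increasingly from ARC and L2ARC rather than from the disks.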

Also, have you tuned your network for 10G (jumbo frames, etc.)?
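For what it's worth, on TrueNAS CORE (FreeBSD) the MTU can be checked and bumped from the shell along these lines (the interface name is just an example, the MTU has to match on the switch ports and clients too, and permanent changes belong in the network configuration UI rather than the shell):

ifconfig lagg0               # check the current MTU on the LAGG
ifconfig lagg0 mtu 9000      # enable jumbo frames (example only, not persistent across reboots)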
 

NickF · Guru · Joined Jun 12, 2014 · Messages: 763
Which may indicate that your ARC is already sufficient; actually, losing the L2ARC might make ARC a little better, so I would do that if I were you.

At 32GB RAM, you were probably way too low on ARC to be thinking about gaining efficiency with such a large L2ARC in the old server.

That does not track with my testing, however. Previously I added a 256GB L2ARC to this R720 with only 64GB of RAM, and the same issue persisted. When I added more RAM, the ARC happily filled and my hit ratios stayed in the 90s, but L2ARC usage remained more or less the same: under 30%, or zero.

Depending on your data, you might consider using those SSDs as a metadata VDEV instead (if you have large numbers of small files). Be careful: you already have a special VDEV as a two-way mirror attached to a RAIDZ2 pool, so you would lose the pool if both SSDs were lost, which is not in line with the RAIDZ2 data VDEV that tolerates two lost disks.
I already AM doing that...
As far as the two-way mirror versus RAIDZ2 goes, that is a risk I feel is fair: with a RAIDZ2 of 12 disks, tolerating more than one disk loss matters more than it does for a mirror of two flash devices.

Those performance tools you're using aren't really measuring anything of value other than single client performance.

You shouldn't be expecting anything good in terms of IOPS from a RAIDZ pool, as you're looking at the sustained IOPS of a single HDD only (maximum 200) per VDEV/pool.

You can see ATTO is getting "fake" IOPS numbers from writes to RAM for the smaller tests.
Can you recommend something I can virtualize across my VMware host to test from multiple clients?

From what I can see from the tools (relatively poor tests of real performance that they are), you're not doing anything wrong other than expecting more IOPS and throughput than will come from a pool that's RAIDZ.
Can you recommend a different metric by which I can produce data to determine which changes will yield a better or worse performance uplift? In other words, I'm trying to find out when adding an L2ARC is actually useful, because I have so far been unable to find a way to produce useful results from it in any of my pools. The same goes for special VDEVs.

I already understand that mirrors are faster than RAIDZ, but what I don't know, and what I don't see anyone here discussing, is how to tune RAIDZ to make it faster. From my perspective, the cost of mirrors is far too high in many use cases, pricing ZFS and TrueNAS out of the game versus other storage vendors. If RAIDZ can be made faster when you need it to be, that makes ZFS even more appealing. Losing half of your capacity per shelf, rather than ~17% per shelf, does not make sense if you are not looking for IOPS but just for fast access.

To see the performance boosts from what you did, you need to be re-reading the same content over and over or doing a lot of async writes.

Also, have you tuned your network for 10G (jumbo frames, etc.)?
I understand how ARC/L2ARC work, and again, any meaningful way of testing these changes would be helpful.
 

QonoS · Explorer · Joined Apr 1, 2021 · Messages: 87
That does not track with my testing, however. Previously I added a 256GB L2ARC to this R720 with only 64GB of RAM, and the same issue persisted. When I added more RAM, the ARC happily filled and my hit ratios stayed in the 90s, but L2ARC usage remained more or less the same: under 30%, or zero.
There are some interesting ZFS parameters that you should have a look into, for example:

vfs.zfs.l2arc_write_max: 134217728 # up to 128MB per feed for caching streaming workloads
vfs.zfs.l2arc_write_boost: 134217728 # up to additional 128MB per feed for caching streaming workloads
vfs.zfs.l2arc_noprefetch: 0 # 0: cache streaming workloads
vfs.zfs.l2arc_norw: 0 # 0: read and write at the same time
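
A minimal sketch of trying those values out from the shell; on TrueNAS CORE they can also be made persistent as sysctl-type tunables in the UI (134217728 bytes = 128MB):

sysctl vfs.zfs.l2arc_write_max=134217728     # allow up to 128MB per L2ARC feed interval
sysctl vfs.zfs.l2arc_write_boost=134217728   # extra write headroom while the L2ARC is still cold
sysctl vfs.zfs.l2arc_noprefetch=0            # also cache prefetched (streaming) reads
sysctl vfs.zfs.l2arc_norw=0                  # allow reading and writing the L2ARC at the same time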


My suggestion is to start reading here, at "L2ARC discussion". There are some default values based on certain assumptions about how L2ARC is used, and they probably do not fit your use case. There is also a lot of further info to be found by googling "freebsd zfs l2arc tuning".
Also keep in mind: the longer the server runs, the better the L2ARC is filled. One reboot and everything starts from the beginning. There is a persistent L2ARC feature coming, but I am not sure if it is implemented already.

For me things improved a lot just with these 4 settings. Your mileage may vary.
 

ChrisRJ · Wizard · Joined Oct 23, 2020 · Messages: 1,919
From my perspective, the cost of mirrors is far too high in many use cases, pricing ZFS and TrueNAS out of the game vs other storage vendors. If RAID Z can be faster if you need it to be, that makes ZFS even more appealing. Losing half of your capacity per shelf, rather than ~17% per shelf does not make sense if you are not looking for IOPS, but just fast access.
Well, IOPS and "fast access" have a pretty strong correlation ;-).

As to pricing: ZFS was/is made for the enterprise market. And if you look at the cost of proprietary storage systems, even losing 50% of your raw capacity is still a real bargain.
 