11.2 to 13 update - Nice Perf Gain (VMware NFS) - w/ stats / charts


I recently completed an upgrade of my SIG main FN system from 11.2-U5 -> 13.0-U3.1. (It is a physical system running TrueNAS only, i.e. TN is not virtualized, and within TN I do NOT run ANY jails or VMs, so TN is only used for storage.)

I wanted to post the performance increases I have seen, because I was looking for exactly this kind of thread when I was deciding whether to do this update several weeks ago, and I could not find any solid, non-synthetic benchmarks of an 11.2 -> 13 TN upgrade.
(I'm aware ZFS / TN is difficult to benchmark in the traditional/synthetic sense.)

My main consumer of IOPS is my vSphere / VMware "cluster" (not a real/true cluster, but rather a set of 3x SM X10-based hosts, all with 10G -> NFS on TrueNAS).

My setup does use a lot of IOPS, with the main write IOPS load coming from:

+ 4x VMs running a Graphite / InfluxDB / Prometheus -> Grafana stack (I can provide more details on this if others are interested; stats come into this setup from many different external sources).
+ Splunk VM (uses my NVMe mirror via NFS -> TN)
+ more sequential-type IOPS are used by:
6x Milestone XProtect NVR VMs (each VM is recording ~8x IP cameras)
** in MOST cases the Milestone XProtect VMs record to VM-host direct-attached disks (not TN), but then once a day they archive the day's video over to TN (also via NFS).
+ Veeam backups (to TN, via NFS and SMB)

IMPORTANT: I have still NOT upgraded my pools' ZFS feature flags, in case issues pop up on 13 (there have been no issues thus far), so I may be leaving some perf on the table in this sense.

The most important metric, I feel, is the NFS disk latency as reported by vSphere, i.e. the latency the hosts are experiencing TO the TN datastores (the images / graphs below):

Some notes on the 60-day graphs below:
* for about 10 days prior to the 11.2 to 13 upgrade, I was doing a one-by-one disk "upgrade" of 2x of my pools (swapping their 8 TB disks for 16 TB disks), so that did add quite a bit of load / latency due to the resilver that had to be done for each disk swapped in.
* my other post, linked below, might also be relevant to the increased performance / decreased latency after the 11.2 -> 13 FN upgrade (TLDR: my SSD pools were seeing CONSTANT delete IOPS, *possibly* due to TRIM being enabled by default on 11.2 but disabled by default on 13)

Below are the 60-day graphs. The blue dashed line is when the 11.2 -> 13 upgrade was completed. These stats are pulled from vSphere (1st img is all VMs):








Some more data points showing an improvement, still from VMware, but these are pulled by vROps (vRealize Operations Manager).
(I still think some of the improvement is from the large decrease in the constant DELETE IOPS I was seeing on all of my flash pools.)
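
If anyone wants to put a number on the before/after change instead of eyeballing graphs, here is a rough Python sketch of how I'd compare average latency on either side of the upgrade date. The function names and the sample values are made up for illustration (they are NOT my real stats); the idea is that you feed it (date, latency in ms) pairs exported from vSphere / vROps however you like. I'm counting the upgrade day itself as "after", which is an arbitrary choice.

```python
from datetime import date
from statistics import mean

# Date of the 11.2 -> 13 upgrade (Jan 29 2023, as stated in this post).
UPGRADE_DATE = date(2023, 1, 29)


def split_mean_latency(samples, cutoff=UPGRADE_DATE):
    """Return (mean_before, mean_after) for an iterable of (date, latency_ms).

    Samples dated on or after `cutoff` count as "after" (an assumption).
    """
    before = [ms for day, ms in samples if day < cutoff]
    after = [ms for day, ms in samples if day >= cutoff]
    if not before or not after:
        raise ValueError("need samples on both sides of the cutoff")
    return mean(before), mean(after)


def percent_change(before, after):
    """Relative change in percent; negative means latency went down."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # PLACEHOLDER values only, purely to show the call pattern.
    demo = [
        (date(2023, 1, 27), 4.0),
        (date(2023, 1, 28), 6.0),
        (date(2023, 1, 29), 2.0),
        (date(2023, 1, 30), 3.0),
    ]
    b, a = split_mean_latency(demo)
    print(f"before={b:.2f} ms  after={a:.2f} ms  change={percent_change(b, a):+.1f}%")
```

Keep in mind that a plain mean over the whole window would also include the resilver period before the upgrade, so you may want to drop those days from the "before" samples to get a fairer comparison.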

The date of the FN 11.2 to 13.0-U3.1 update was Jan 29 2023 (each line in the graphs is a single VM, and I'm only showing stats from VMs that have NFS -> TN disks):