TrueNAS 12 - Ideas for increased NFS performance?

Joined
Dec 29, 2014
Messages
1,135
Here is my bare metal TrueNAS box:
TrueNAS-12.0-U8.1
Cisco C240 M4SX - 24 x 2.5" SAS/SATA drive bays
Dual E5-2637 v4 @ 3.50GHz
256 GB ECC DRAM
2 x Intel i350 Gigabit NIC for management (using 2 ports in LACP channel)
Chelsio T580-CR dual port 40 Gigabit NIC for storage network (using 1 port)
LSI 3108 based HBA
Storage pool = 8 mirrored vdevs of 2 x 1TB 7.2k SATA drives with one spare drive
Intel Optane 900P SLOG

The workload I am trying to speed up is an NFS share to an ESXi 7.0 host. I don't have a lot of VMs running normally (4-5) and I am pretty pleased overall with the performance. See the vMotion stats below.
[screenshot: vMotion throughput stats]

Read is pretty darn good. It stays pretty close to 10Gb/s when reading off the data store. I am only getting around 4Gb/s on writes. Obviously it is doing sync writes, but I have an NVMe SLOG to address that. Any ideas for what I could do to make the writes perform better without going to an all-SSD config?
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
What is the workload? Small I/Os or few outstanding writes will limit performance.
Testing with fio or similar will let you know what the NAS is capable of.
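
For instance, something along these lines (just a sketch - the scratch path, file size, and queue depth are placeholders to adjust) run locally on the NAS would approximate the kind of sync writes NFS/ESXi generates:

Code:
# sync-write test against a scratch directory on the pool (path is an example)
fio --name=syncwrite --directory=/mnt/YourPoolName/fio-scratch \
    --ioengine=posixaio --rw=write --bs=64k --size=4g \
    --numjobs=4 --iodepth=8 --sync=1 \
    --runtime=60 --time_based --group_reporting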

Just doing the back-of-the-envelope calculations:
4Gb/s of writes is roughly 500MB/s across 8 mirrored vdevs, which works out to about 62MB/s per vdev (and per drive, since a mirror writes the same data to both sides). That's not terrible, especially with 2.5" SATA HDDs.
Mirrors are higher performing overall, but they aren't better for high-bandwidth (large I/O) writes.

When reading with Mirrors, you get 2X performance plus any cache benefits.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
I'd like to see the same tests without sync=always, to see how much the SLOG is capping things.
I'd then proceed by removing the SLOG from the pool, creating a new pool on that device, and running some tests against it.
In short, I'd direct attention toward the SLOG and validate that the device can hit higher numbers in your system in a standalone setup.
As tricky as it is to do validation benchmarks on ZFS, there are a bunch of caveats you probably know of, to ensure the tests aren't being 'hijacked' by ZFS magic ;)
 
Joined
Dec 29, 2014
Messages
1,135
What is the workload? Small I/Os or few outstanding writes will limit performance.
It is an NFS shared data store for ESXi. I think the 256GB of RAM is what makes the most difference for reads. Two of the normally running VMs are pretty small, but I bet the bulk of the stored stuff is in RAM. On a previous version of the hardware I couldn't get to 1Gb/s write speed without the SLOG. The SLOG made a HUGE difference, and the stats on my 900P are right up there with anything short of an NVDIMM.

@Dice I posted stats on the Optane 900P early on in the thread by @HoneyBadger on throughput for SLOG devices. I am more interested in my real world write scenarios since those are the things I am staring at. I am 100% happy with where I am from a read standpoint. Sustaining almost 10Gb/s with bursts past 15Gb/s is more than good enough. The writing is what could use some improvement. That said, I can vMotion my production VMs in under 10 minutes, so it is hardly terrible.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
How full is the pool overall? 500MB/s of sustained writes is certainly nothing to sneeze at for an 8-vdev mirror setup on SATA. That's pretty much "empty pool" performance on my 8-drive SAS setup (4x2-way vdevs)

What's the recordsize for the NFS datasets, and what kind of latency are you seeing for writes? I've generally targeted a 32K record/volblock size for my ESXi workloads, as it provides a good mix of sequential write performance, compression efficiency (around 1.35x for a generic Windows load), and moderate read-modify-write overhead on existing records.
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
I am more interested in my real world write scenarios since those are the things I am staring at.

I believe it is necessary to at least validate how much the SLOG is dragging down your pool's write performance - or whether it is at all. That is one tick box to flip in the dataset options. Everything must start there; it is the simplest and most telling test there is.

Not being willing to validate that makes me wonder what other suggestions would be 'real world' enough for you.
 
Joined
Dec 29, 2014
Messages
1,135
How full is the pool overall?
51% used.
500MB/s of sustained writes is certainly nothing to sneeze at for an 8-vdev mirror setup on SATA.
That is good to know. I still feel like I am not all that good at figuring out the storage performance metrics.
What's the recordsize for the NFS datasets
128K. I built a new pool when I moved to the new hardware.
Not being willing to validate that makes me wonder what other suggestions would be 'real world' enough for you.
I think I phrased that poorly. What I meant was that I couldn't even get 1Gb/s write on my old hardware and old pool construction (2 RAIDZ2 vdevs of 8 drives) without the SLOG, so I know that is a HUGE benefit in my use case. I am certainly willing to apply system tunables or change the construction of the SLOG. I am 100% convinced of the benefit of the SLOG.

Edit: I have run fio tests on the pool in the past, but I guess I don't feel like I know what the best tests are to run. I remember playing with this and not understanding the results very well. I even read the man pages on it, but it just didn't click.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
51% used.

Do a zpool list and look at the FRAG% column, which indicates free space fragmentation. The higher that is, the harder it is to find contiguous free space to write into (which affects how quickly your transaction groups can flush to disk)
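
For example:

Code:
# show fill level and free-space fragmentation for the pool
zpool list -o name,size,alloc,free,frag,cap YourPoolName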

That is good to know. I still feel like I am not all that good at figuring out the storage performance metrics.

Dusting off an old favorite here from Adam Leventhal. Open an SSH session, use vi or nano to create a file called dirty.d and paste this text in there. You could also do it on a client system and then copy it to your pool, but you'll then need to get to that same pool directory.

Code:
txg-syncing
{
        this->dp = (dsl_pool_t *)arg0;
}

txg-syncing
/this->dp->dp_spa->spa_name == $$1/
{
        printf("%4dMB of %4dMB used", this->dp->dp_dirty_total / 1024 / 1024,
            `zfs_dirty_data_max / 1024 / 1024);
}


Then from a shell do dtrace -s dirty.d YourPoolName and wait. You'll see a bunch of lines that look like the following:

Code:
dtrace: script 'dirty.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  4  56342                 none:txg-syncing   62MB of 4096MB used
  4  56342                 none:txg-syncing   64MB of 4096MB used
  5  56342                 none:txg-syncing   64MB of 4096MB used


Start up a big copy and watch the numbers in the "X MB of Y MB used" area. If it looks like you're consistently using more than 60% of your dirty data maximum, you're getting write-throttled, which means your vdevs can't keep up with the data ingest rate. If you never throttle, the 900p might be slowing you down (which can happen - it's a good device, but it's not infallible).
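
If you want to see the actual numbers the throttle works from, these sysctls should show them on CORE (a sketch - tunable names as I recall them on the FreeBSD-based builds):

Code:
# total dirty data allowed before writers block outright (bytes)
sysctl vfs.zfs.dirty_data_max
# percentage of dirty_data_max where the write throttle starts delaying writes (default 60)
sysctl vfs.zfs.delay_min_dirty_percent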

128K. I built a new pool when I moved to the new hardware.

This could be partially responsible if you're seeing poor latency and response times on your running VMs. If you svMotion, you're doing it at 64K granularity, which will result in a 64K record being written to the new dataset. But now if you need to update a small chunk (eg: 4K) inside there, you're going to read-modify-write that 64K. Make a new dataset with a 32K recordsize and test a latency-sensitive workload there perhaps?
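
If you want to try it, something along these lines (the dataset name is just an example):

Code:
# new dataset with 32K records for a latency-sensitive test VM
zfs create -o recordsize=32K YourPoolName/vmware-32k
# confirm it took
zfs get recordsize YourPoolName/vmware-32k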

I think I phrased that poorly. What I meant was that I couldn't even get 1Gb/s write on my old hardware and old pool construction (2 RAIDZ2 vdevs of 8 drives) without the SLOG, so I know that is a HUGE benefit in my use case. I am certainly willing to apply system tunables or change the construction of the SLOG. I am 100% convinced of the benefit of the SLOG.

Edit: I have run fio tests on the pool in the past, but I guess I don't feel like I know what the best tests are to run. I remember playing with this and not understanding the results very well. I even read the man pages on it, but it just didn't click.

I think what's being asked is to try a dataset with sync=disabled just to see if the 900p is a bottleneck anywhere. The dtrace script will help suss this out as well, but making a test dataset and flipping sync=disabled will tell you pretty fast if the svMotion speeds go above the 4Gbps marker. Obviously don't run live VMs with that disabled if you care about data integrity, but it'll tell you that you need a faster SLOG if that matters - of course, you still need vdevs that can keep up with the ingest rate.
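
A quick sketch of that test (dataset name is an example - keep only throwaway test data on it):

Code:
# test dataset with the ZIL bypassed entirely - do NOT keep real VMs here
zfs create -o sync=disabled YourPoolName/synctest
# ...run the svMotion / copy test against an NFS export of this dataset...
# then put it back to default behaviour (or just destroy the dataset)
zfs set sync=standard YourPoolName/synctest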

But I'll put money on your disks just being unable to sustain more than 500MB/s aggregate at their current fill/frag rate. Fill the other bays if you want to go a little faster, otherwise it's time for a trip to NAND-town.
 
Joined
Dec 29, 2014
Messages
1,135
Do a zpool list and look at the FRAG% column, which indicates free space fragmentation.
13%
I think what's being asked is to try a dataset with sync=disabled just to see if the 900p is a bottleneck anywhere.
OK. I am hesitant to do that on the NFS pool, since it seems like it could jeopardize my VMs if I have a perfect-storm disaster during the testing.
Then from a shell do dtrace -s dirty.d YourPoolName and wait. You'll see a bunch of lines that look like the following:
OK, I'll check that out next time I do a vMotion.
But I'll put money on your disks just being unable to sustain more than 500MB/s aggregate at their current fill/frag rate.
That sounds reasonable. My secondary TN has an external bay with some faster 3.5" SAS drives, and it gets better write speeds. All the other hardware configuration is the same. I think my CPU, RAM, etc. are all good. If I want any kind of boost from here, it sounds like I need to pay a visit to the SSD fairy. :smile:
 

Jessep

Patron
Joined
Aug 19, 2018
Messages
379
There is a reported large performance jump for NFS on TN 13
 
Joined
Dec 29, 2014
Messages
1,135
There is a reported large performance jump for NFS on TN 13
Interesting. I was giving that time to bake before upgrading. Old IT guy suspicious nature in play there. :smile:
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
I am 100% convinced of the benefit of the SLOG.
Yes - me too!
I have run fio tests on the pool in the past, but I guess I don't feel like I know what the best tests are to run. I remember playing with this and not understanding the results very well. I even read the man pages on it, but it just didn't click.
I did too, a year or so ago. I wrapped up a little script to run a series of tests and dump the results into a logfile.
Unfortunately I lost it during a faint moment of realizing a dataset was not backed up as I expected.

Since you've brought this performance 'issue' to light again, my interest has awakened.
Maybe it is time to resurrect such a script to easily run a couple of reproducible tests.
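
Something along these lines would probably do the trick (a rough sketch from memory, not the lost script - the test directory and profiles are placeholders):

Code:
#!/bin/sh
# run a small battery of fio tests and append the results to a logfile
TESTDIR=/mnt/YourPoolName/fio-scratch
LOG=/root/fio-results-$(date +%Y%m%d-%H%M).log

for RW in write randwrite read randread; do
    for BS in 4k 64k 1m; do
        echo "=== ${RW} bs=${BS} ===" >> "$LOG"
        fio --name="${RW}-${BS}" --directory="$TESTDIR" \
            --ioengine=posixaio --rw="$RW" --bs="$BS" \
            --size=2g --numjobs=2 --iodepth=8 \
            --runtime=30 --time_based --group_reporting >> "$LOG" 2>&1
    done
done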

For reference, I have a 3x mirror SSD pool that I use for VMs over 10GbE. Shortly I'll have a P4801X SLOG to play with too.
I'm looking forward to seeing what sort of performance jump the SSDs give compared to yours.

I think what's being asked is to try a dataset with sync=disabled just to see if the 900p is a bottleneck anywhere.
That is true, but not really the point I was trying to make, albeit intertwined.
Since I too am quite convinced the SLOG device is entirely adequate for this scenario, I'm more doubtful of the underlying pool's performance capability. My point was to take the SLOG out of the equation by, as you indicate, running tests on a sync=disabled dataset.
If for whatever reason there were a huge discrepancy between SLOG + sync=always and no SLOG + sync=disabled, the SLOG would need more attention.

Also thanks to @HoneyBadger for some great commands to have a look at. I'll do some comparisons on my system too.
 
Joined
Dec 29, 2014
Messages
1,135
Since you've brought this performance 'issue' to light again, my interest has awakened.
Maybe it is time to resurrect such a script to easily run a couple of reproducible tests.
I'd be very interested to hear about that. Dating myself here, but I can remember using sar to chase performance issues on Unix boxes way too old to mention.
Since I too am quite convinced the SLOG device is entirely adequate for this scenario, I'm more doubtful of the underlying pool's performance capability.
When @HoneyBadger first started keeping a forum log of SLOG devices, the 900P was right up near the top. I have had it for several years and have been quite happy with it. I am starting to suspect the disk drives may just be at the limit of their performance. My secondary unit has an external array with 8 vdevs of 3.5" SAS drives and it gets better performance on this workload. Realizing I probably need better drives is a bit of a bummer, but I think I suspected as much. That said, it does do everything I need it to do. I had long resisted redoing the pool with mirrors because of the usable space that is lost, but that definitely helped too. The extra RAM these servers have is a big plus as well. I really love these numbers in my ARC hit rate!
[screenshot: ARC hit ratio graph]
 
Joined
Dec 29, 2014
Messages
1,135
Here is my workload test again. These are the sizes of the 4 VMs I am moving around.
Code:
root@freenas2:/mnt/MIRROR-I/VMWare # du -m PCNS_vapp_en ASA-V alfred-2015 vcenter65-vCSA-7.0.3-19234570
1741    PCNS_vapp_en
1604    ASA-V
107831  alfred-2015
72134   vcenter65-vCSA-7.0.3-19234570

It took under 10 minutes to move them off, and under 10 minutes to move them back. I was moving them to a local data store in the ESXi 7 host.
[screenshot attachment]

ARC Info still looks good to me.
[screenshots: ARC stats]

Here is the disk IO on the SLOG device.
[screenshot: SLOG device disk I/O]

Here is the dtrace output during the move back to the NFS data store.
Code:
dtrace: script '/mnt/MIRROR-I/CIFS-I/elliot/dirty.d' matched 2 probes
CPU     ID                    FUNCTION:NAME
  4  75698                 none:txg-syncing    4MB of 4096MB used
  4  75698                 none:txg-syncing  819MB of 4096MB used
  5  75698                 none:txg-syncing  835MB of 4096MB used
  6  75698                 none:txg-syncing  819MB of 4096MB used
  5  75698                 none:txg-syncing  819MB of 4096MB used
  6  75698                 none:txg-syncing  819MB of 4096MB used
  1  75698                 none:txg-syncing  819MB of 4096MB used
  1  75698                 none:txg-syncing  927MB of 4096MB used
  4  75698                 none:txg-syncing  879MB of 4096MB used
  1  75698                 none:txg-syncing 1346MB of 4096MB used
  7  75698                 none:txg-syncing  348MB of 4096MB used
  3  75698                 none:txg-syncing 1057MB of 4096MB used
  7  75698                 none:txg-syncing 1240MB of 4096MB used
  2  75698                 none:txg-syncing 1244MB of 4096MB used
  6  75698                 none:txg-syncing 1278MB of 4096MB used
  6  75698                 none:txg-syncing 1640MB of 4096MB used
  5  75698                 none:txg-syncing 2040MB of 4096MB used
  5  75698                 none:txg-syncing 2740MB of 4096MB used
  7  75698                 none:txg-syncing 2773MB of 4096MB used
  4  75698                 none:txg-syncing 2732MB of 4096MB used
  3  75698                 none:txg-syncing 2825MB of 4096MB used
  2  75698                 none:txg-syncing 2831MB of 4096MB used
  6  75698                 none:txg-syncing 2802MB of 4096MB used
  4  75698                 none:txg-syncing 2174MB of 4096MB used
  5  75698                 none:txg-syncing 2743MB of 4096MB used
  5  75698                 none:txg-syncing 2444MB of 4096MB used
  6  75698                 none:txg-syncing 2515MB of 4096MB used
  5  75698                 none:txg-syncing 2367MB of 4096MB used
  4  75698                 none:txg-syncing 2272MB of 4096MB used
  5  75698                 none:txg-syncing 2194MB of 4096MB used
  5  75698                 none:txg-syncing 1989MB of 4096MB used
  5  75698                 none:txg-syncing 1838MB of 4096MB used
  7  75698                 none:txg-syncing 2180MB of 4096MB used
  6  75698                 none:txg-syncing 2730MB of 4096MB used
  0  75698                 none:txg-syncing 2632MB of 4096MB used
  6  75698                 none:txg-syncing 2820MB of 4096MB used
  0  75698                 none:txg-syncing 2818MB of 4096MB used
  5  75698                 none:txg-syncing 2811MB of 4096MB used
  3  75698                 none:txg-syncing 2162MB of 4096MB used
  7  75698                 none:txg-syncing 1872MB of 4096MB used
  7  75698                 none:txg-syncing 1789MB of 4096MB used
  4  75698                 none:txg-syncing 1840MB of 4096MB used
  4  75698                 none:txg-syncing 1840MB of 4096MB used
  5  75698                 none:txg-syncing 1828MB of 4096MB used
  5  75698                 none:txg-syncing 1851MB of 4096MB used
  7  75698                 none:txg-syncing 1776MB of 4096MB used
  6  75698                 none:txg-syncing 1832MB of 4096MB used
  7  75698                 none:txg-syncing 1819MB of 4096MB used
  4  75698                 none:txg-syncing 1862MB of 4096MB used
  5  75698                 none:txg-syncing 1944MB of 4096MB used
  6  75698                 none:txg-syncing 1846MB of 4096MB used
  0  75698                 none:txg-syncing 2727MB of 4096MB used
  4  75698                 none:txg-syncing 2828MB of 4096MB used
  7  75698                 none:txg-syncing 2815MB of 4096MB used
  7  75698                 none:txg-syncing 2863MB of 4096MB used
  6  75698                 none:txg-syncing 2759MB of 4096MB used
  7  75698                 none:txg-syncing 2816MB of 4096MB used
  5  75698                 none:txg-syncing 2461MB of 4096MB used
  7  75698                 none:txg-syncing 2304MB of 4096MB used
  5  75698                 none:txg-syncing 2049MB of 4096MB used
  4  75698                 none:txg-syncing 1924MB of 4096MB used
  7  75698                 none:txg-syncing 1841MB of 4096MB used
  1  75698                 none:txg-syncing 1732MB of 4096MB used
  6  75698                 none:txg-syncing 1787MB of 4096MB used
  1  75698                 none:txg-syncing 1826MB of 4096MB used
  6  75698                 none:txg-syncing 1608MB of 4096MB used
  4  75698                 none:txg-syncing 1424MB of 4096MB used
  6  75698                 none:txg-syncing 1378MB of 4096MB used
  4  75698                 none:txg-syncing 1297MB of 4096MB used
  5  75698                 none:txg-syncing 1170MB of 4096MB used
  6  75698                 none:txg-syncing 1646MB of 4096MB used
  7  75698                 none:txg-syncing 1828MB of 4096MB used
  4  75698                 none:txg-syncing 1832MB of 4096MB used
  3  75698                 none:txg-syncing 1849MB of 4096MB used
  1  75698                 none:txg-syncing 1852MB of 4096MB used
  2  75698                 none:txg-syncing 1868MB of 4096MB used
  5  75698                 none:txg-syncing 1821MB of 4096MB used
  4  75698                 none:txg-syncing 1848MB of 4096MB used
  0  75698                 none:txg-syncing 1880MB of 4096MB used
  5  75698                 none:txg-syncing 2726MB of 4096MB used
  4  75698                 none:txg-syncing 2820MB of 4096MB used
  5  75698                 none:txg-syncing 2739MB of 4096MB used
  4  75698                 none:txg-syncing 2790MB of 4096MB used
  3  75698                 none:txg-syncing 2837MB of 4096MB used
  6  75698                 none:txg-syncing 2688MB of 4096MB used
  7  75698                 none:txg-syncing 2734MB of 4096MB used
  0  75698                 none:txg-syncing 2054MB of 4096MB used
  5  75698                 none:txg-syncing 1937MB of 4096MB used
  5  75698                 none:txg-syncing 1690MB of 4096MB used
  4  75698                 none:txg-syncing 1690MB of 4096MB used
  2  75698                 none:txg-syncing 2053MB of 4096MB used
  4  75698                 none:txg-syncing 1871MB of 4096MB used
  5  75698                 none:txg-syncing 2394MB of 4096MB used
  6  75698                 none:txg-syncing 2758MB of 4096MB used
  5  75698                 none:txg-syncing 2900MB of 4096MB used
  5  75698                 none:txg-syncing 1725MB of 4096MB used
  4  75698                 none:txg-syncing 1632MB of 4096MB used
  6  75698                 none:txg-syncing 1244MB of 4096MB used
  6  75698                 none:txg-syncing 1263MB of 4096MB used
  4  75698                 none:txg-syncing 1369MB of 4096MB used
  3  75698                 none:txg-syncing 1893MB of 4096MB used
  5  75698                 none:txg-syncing 2089MB of 4096MB used
  5  75698                 none:txg-syncing 1934MB of 4096MB used
  5  75698                 none:txg-syncing 1674MB of 4096MB used
  4  75698                 none:txg-syncing 2188MB of 4096MB used
  4  75698                 none:txg-syncing 2616MB of 4096MB used
  4  75698                 none:txg-syncing 2739MB of 4096MB used
  1  75698                 none:txg-syncing 1835MB of 4096MB used
  6  75698                 none:txg-syncing 1481MB of 4096MB used
  6  75698                 none:txg-syncing 1296MB of 4096MB used
  6  75698                 none:txg-syncing 1512MB of 4096MB used
  4  75698                 none:txg-syncing 1299MB of 4096MB used
  1  75698                 none:txg-syncing 1194MB of 4096MB used
  4  75698                 none:txg-syncing 1339MB of 4096MB used
  3  75698                 none:txg-syncing 1436MB of 4096MB used
  5  75698                 none:txg-syncing 1264MB of 4096MB used
  4  75698                 none:txg-syncing 1105MB of 4096MB used
  3  75698                 none:txg-syncing 1169MB of 4096MB used
  4  75698                 none:txg-syncing 1288MB of 4096MB used
  0  75698                 none:txg-syncing 1136MB of 4096MB used
  7  75698                 none:txg-syncing  979MB of 4096MB used
  5  75698                 none:txg-syncing 1227MB of 4096MB used
  6  75698                 none:txg-syncing 1711MB of 4096MB used
  4  75698                 none:txg-syncing 1511MB of 4096MB used
  0  75698                 none:txg-syncing 1423MB of 4096MB used
  6  75698                 none:txg-syncing 1095MB of 4096MB used
  0  75698                 none:txg-syncing  593MB of 4096MB used
  5  75698                 none:txg-syncing   16MB of 4096MB used
  2  75698                 none:txg-syncing   25MB of 4096MB used
^C

And here are the CPU stats.
[screenshot: CPU stats]

I think it performs pretty well, but does this shed any light on areas where I could improve it?
 

Jessep

Patron
Joined
Aug 19, 2018
Messages
379
Interesting. I was giving that time to bake before upgrading. Old IT guy suspicious nature in play there. :smile:
Example:

13.0 U1 due within a month I think?
 
Joined
Dec 29, 2014
Messages
1,135
Example:

13.0 U1 due within a month I think?
Interesting. One of the tweaks I made some time back was increasing the number of NFS servers to 8. I could certainly go higher, but I picked that number because that is how many cores I have in my system. Hyperthreading is off in the BIOS. If I didn't say this before, my system is storage only. I do the VM hosting on other servers running ESXi 7.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Here is the dtrace output during the move back to the NFS data store.
[dtrace output quoted from the previous post - snipped]

The magic number for you here is 2458MB - that's 60% of your dirty_data_max value, where the ZFS write throttle will kick in and start to very gently pull the handbrake on incoming traffic. You're bouncing above it periodically here which is the indicator of "vdevs not fast enough to handle the incoming data"

[attachment: chart.jpg]


You've pretty much hit the limits of the disks, as you suspected. When you hit contiguous free space you're able to drain faster, but if the disks have to spend too much time seeking or service competing read I/Os (e.g. if you try to svMotion during a backup job that's missing ARC all the time), the laws of physics kick in and the writes end up throttled for a period until the disks can catch up.

One thing that could be done, since you have a large amount of RAM and a big SLOG, is to increase the dirty data max value - basically, let the Optane handle larger bursts of data. It's a bit of a balancing act, as a larger "burst" of writes could keep your disks busier for longer while ZFS streams it out to them.
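
On CORE that would be something along the lines of the following (64GiB is just an example value; to persist it across reboots you'd add it as a sysctl-type tunable under System > Tunables):

Code:
# raise the dirty data ceiling to 64GiB on the running system (example value)
sysctl vfs.zfs.dirty_data_max=68719476736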
 
Joined
Dec 29, 2014
Messages
1,135
The magic number for you here is 2458MB - that's 60% of your dirty_data_max value, where the ZFS write throttle will kick in and start to very gently pull the handbrake on incoming traffic. You're bouncing above it periodically here which is the indicator of "vdevs not fast enough to handle the incoming data"
You da man @HoneyBadger ! Check this out.
Code:
  7  75698                 none:txg-syncing    0MB of 65536MB used
  2  75698                 none:txg-syncing  738MB of 65536MB used
  0  75698                 none:txg-syncing 2268MB of 65536MB used
  7  75698                 none:txg-syncing 2457MB of 65536MB used
  7  75698                 none:txg-syncing 2847MB of 65536MB used
  3  75698                 none:txg-syncing 4013MB of 65536MB used
  4  75698                 none:txg-syncing 6014MB of 65536MB used
  5  75698                 none:txg-syncing 8723MB of 65536MB used
  6  75698                 none:txg-syncing 7020MB of 65536MB used
  6  75698                 none:txg-syncing 5108MB of 65536MB used
  4  75698                 none:txg-syncing 3738MB of 65536MB used
  2  75698                 none:txg-syncing 2894MB of 65536MB used
  4  75698                 none:txg-syncing 3714MB of 65536MB used
  6  75698                 none:txg-syncing 5772MB of 65536MB used
  7  75698                 none:txg-syncing 8921MB of 65536MB used
  4  75698                 none:txg-syncing 10630MB of 65536MB used
  6  75698                 none:txg-syncing 10580MB of 65536MB used
  4  75698                 none:txg-syncing 13785MB of 65536MB used
  6  75698                 none:txg-syncing 14916MB of 65536MB used
  6  75698                 none:txg-syncing 20667MB of 65536MB used
  4  75698                 none:txg-syncing 23412MB of 65536MB used
  4  75698                 none:txg-syncing 28685MB of 65536MB used
  5  75698                 none:txg-syncing 28656MB of 65536MB used
  6  75698                 none:txg-syncing 19645MB of 65536MB used
  4  75698                 none:txg-syncing 1423MB of 65536MB used
  7  75698                 none:txg-syncing   15MB of 65536MB used

Code:
root@freenas2:/mnt/MIRROR-I/VMWare # zpool iostat -v MIRROR-I
                                                  capacity     operations     bandwidth
pool                                            alloc   free   read  write   read  write
----------------------------------------------  -----  -----  -----  -----  -----  -----
MIRROR-I                                        3.62T  3.63T      6    199   864K  14.7M
  mirror                                         442G   486G      0     15   102K  1.11M
    gptid/e39027e8-fe1e-11eb-88a4-5c838f806d36      -      -      0      7  51.6K   570K
    gptid/78fc8b5e-fe1f-11eb-88a4-5c838f806d36      -      -      0      7  50.1K   570K
  mirror                                         446G   482G      0     14   111K  1.26M
    gptid/37008e8c-fe1f-11eb-88a4-5c838f806d36      -      -      0      7  59.5K   646K
    gptid/842c0150-fe1f-11eb-88a4-5c838f806d36      -      -      0      7  51.0K   646K
  mirror                                         443G   485G      0     16   109K  1.29M
    gptid/2a574619-fe1f-11eb-88a4-5c838f806d36      -      -      0      8  56.4K   659K
    gptid/8b834d22-fe1f-11eb-88a4-5c838f806d36      -      -      0      8  52.6K   659K
  mirror                                         441G   487G      0     14   100K  1.18M
    gptid/ae46df27-fe1f-11eb-88a4-5c838f806d36      -      -      0      7  47.6K   605K
    gptid/baea87d8-fe1f-11eb-88a4-5c838f806d36      -      -      0      7  52.8K   605K
  mirror                                         435G   493G      0     15  98.6K  1.21M
    gptid/30495198-fe1f-11eb-88a4-5c838f806d36      -      -      0      7  51.1K   621K
    gptid/598a5fea-fe1f-11eb-88a4-5c838f806d36      -      -      0      7  47.6K   621K
  mirror                                         448G   480G      0     12   108K  1.12M
    gptid/608c2dfd-fe1f-11eb-88a4-5c838f806d36      -      -      0      6  50.8K   574K
    gptid/9e266479-fe1f-11eb-88a4-5c838f806d36      -      -      0      6  57.3K   574K
  mirror                                         523G   405G      0     11   120K  1.25M
    gptid/f4a90804-fe1e-11eb-88a4-5c838f806d36      -      -      0      5  59.2K   640K
    gptid/10949fd4-fe1f-11eb-88a4-5c838f806d36      -      -      0      5  60.5K   640K
  mirror                                         526G   402G      0     13   116K  1.29M
    gptid/e3d10d05-fe1e-11eb-88a4-5c838f806d36      -      -      0      6  57.5K   662K
    gptid/00d7c50e-fe1f-11eb-88a4-5c838f806d36      -      -      0      6  58.7K   662K
logs                                                -      -      -      -      -      -
  gptid/b9443779-fe1f-11eb-88a4-5c838f806d36    12.4M   260G      0     85      3  4.99M
----------------------------------------------  -----  -----  -----  -----  -----  -----

[screenshot attachment]

I would say I have this pushed to the limits of what my spinning rust can do. Thanks all for the assistance. Now I just need to convince my wife that I need a whole pile of SSDs. :smile:
 
Joined
Dec 29, 2014
Messages
1,135
One last thing. It sure looks to me like my 900P is performing pretty well. Check out the IO stats on that.
[screenshot: 900P I/O stats]
 
Joined
Dec 29, 2014
Messages
1,135
Really the last one! I tested with my secondary unit, which has an external array of faster SAS drives, and the numbers are better. Here are the NIC stats on the secondary.
[screenshot: secondary NIC stats]

That was a vMotion from primary to secondary, then secondary to the VM host, back to secondary, and then secondary back to primary.
Here are the NIC stats on the primary.
[screenshot: primary NIC stats]

That also tells me that the drives and/or RAID controller in the ESXi host are a bit of a limiting factor, since the numbers are better between the two TrueNAS units. Anyway, I am pretty happy now. Thanks again all!
 