NVMe-Mirror very slow - Any ideas to optimize?

saveZFS

Explorer
Joined
Jan 6, 2022
Messages
87
Hello,

I added two GIGABYTE NVMe SSD M.2 2280 512GB drives to my TrueNAS VM as a ZFS mirror.
  • Sequential read speed: up to 1700 MB/s
  • Sequential write speed: up to 1550 MB/s
The SSDs are passed through to the TrueNAS VM.
My problem is that I only get about 100 MB/s write performance, even on big files (>10 GB).
Could this be right, or is there a problem somewhere?
I have never used SSDs with ZFS before!

Edit:
Maybe it is important that I used an NFS share (with an ESXi VM on the share) for the test.
 
Joined
Oct 22, 2019
Messages
3,641
Sounds like the expected speeds over a gigabit network.

 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Alternatively, it also sounds like a pretty good sync-write speed for non-enterprise SSDs, and ESXi writes to NFS volumes in sync mode.

Try disabling sync on the dataset. As a diagnostic, not a recommendation.
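For example, something like this, as a minimal sketch (the dataset name tank/nfs is just a placeholder for your NFS dataset):

Code:
# Diagnostic only: disable sync writes on the NFS dataset (dataset name is an example).
zfs set sync=disabled tank/nfs

# ...repeat the write test from ESXi...

# Restore the default behaviour afterwards.
zfs set sync=standard tank/nfs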
 

saveZFS

Explorer
Joined
Jan 6, 2022
Messages
87
Sounds like the expected speeds over a gigabit network.
It's a virtual ESXi switch, so this should not be the problem.

Try disabling sync on the dataset. As a diagnostic, not a recommendation.
I made a copy of the dataset with sync disabled and ran some more tests.

The speed is much better. I took some screenshots.

Mirror of 2 x GIGABYTE NVMe SSD M.2 2280 512GB with sync = on
[screenshots: MB/s and IOPS results]

Mirror of 2 x GIGABYTE NVMe SSD M.2 2280 512GB with sync = off
[screenshots: MB/s and IOPS results]


Alternatively, it also sounds like a pretty good sync-write speed for non-enterprise SSDs, and ESXi writes to NFS volumes in sync mode.
Would the speed be much better with enterprise SSDs, or do you simply have to accept big compromises for writes with ZFS?
And do you think the values are generally not that bad? :)
 


saveZFS

Explorer
Joined
Jan 6, 2022
Messages
87
ESXi writes to NFS volumes in sync mode
Could the problem be that I set a 'Record Size' of 128 KiB?
Is this a bad size for an ESXi NFS datastore, or are the SSDs the cause of the "bad" performance?
 

saveZFS

Explorer
Joined
Jan 6, 2022
Messages
87
This weekend I made some more tests, but the results are very disappointing. It seems my SSDs can't handle sync writes, and ESXi uses sync writes! :(
I tested a lot of different record sizes, but the write performance is still very bad. :( I put my results at the end of this post.

Is there any chance to optimize the write performance of my mirror without disabling sync writes (that is not an option for me)?
Or are my SSDs (GIGABYTE NVMe SSD 512GB) the problem, and are M.2 PCIe (2280) SSDs with better sync-write performance available?


  • 8 KiB: [screenshot of MB/s results]
  • 16 KiB: [screenshot of MB/s results]
  • 32 KiB: [screenshot of MB/s results]
  • 512 KiB: [screenshot of MB/s results]
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
If you add a high-performance SLOG device, then sync writes will essentially happen at the speed of the SLOG (they get logged to the SLOG and then written asynchronously, like normal writes).

I am using a 100GB Intel P4801X in my primary server to good effect. I see sync-write speeds over 1 GB/s, and as a result I have the entire pool set to sync=always.

P4801x 100GB Benchmark Results

Code:
Synchronous random writes:
       4 kbytes:     13.1 usec/IO =    298.6 Mbytes/s
       8 kbytes:     17.1 usec/IO =    456.2 Mbytes/s
      16 kbytes:     27.6 usec/IO =    565.7 Mbytes/s
      32 kbytes:     42.5 usec/IO =    735.9 Mbytes/s
      64 kbytes:     72.9 usec/IO =    856.9 Mbytes/s
     128 kbytes:    135.1 usec/IO =    925.3 Mbytes/s
     256 kbytes:    255.5 usec/IO =    978.5 Mbytes/s
     512 kbytes:    509.0 usec/IO =    982.4 Mbytes/s
    1024 kbytes:    972.2 usec/IO =   1028.6 Mbytes/s
    2048 kbytes:   1879.6 usec/IO =   1064.1 Mbytes/s
    4096 kbytes:   3693.3 usec/IO =   1083.0 Mbytes/s
    8192 kbytes:   7320.3 usec/IO =   1092.8 Mbytes/s
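For reference, attaching the SLOG and forcing sync is only a couple of commands. This is a minimal sketch, assuming a pool named tank and the Optane appearing as nvd2 (both names are placeholders; on TrueNAS you would normally do this through the GUI):

Code:
# Attach the Optane as a dedicated log (SLOG) vdev; pool and device names are examples.
zpool add tank log nvd2

# Verify the log vdev shows up.
zpool status tank

# Force sync writes for the whole pool (or for a single dataset instead of "tank").
zfs set sync=always tank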
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
This weekend I made some more tests, but the results are very disappointing. It seems my SSDs can't handle sync writes, and ESXi uses sync writes! :(
I tested a lot of different record sizes, but the write performance is still very bad. :( I put my results at the end of this post.

Is there any chance to optimize the write performance of my mirror without disabling sync writes (that is not an option for me)?
Or are my SSDs (GIGABYTE NVMe SSD 512GB) the problem, and are M.2 PCIe (2280) SSDs with better sync-write performance available?



Sync is needed for data reliability... in the case of power failures or system crashes.

Sync forces the system to write data to the ZIL on the SSDs before acknowledging to the client. This process adds latency and reduces throughput.
The same data is written twice to the SSDs: once to the ZIL and then later to the pool. Adding a SLOG moves the ZIL to a dedicated device, so the pool SSDs no longer take the double write.

There are several ways to get more bandwidth:

1) Turn off sync (not recommended if you want reliability) or add a SLOG.
2) Add more clients... if you have many machines, the aggregate bandwidth will be higher.
3) Increase the number of outstanding I/Os, often called queue depth. QD = 32 is as high as needed.
4) Increase the number of threads... you only show 1. Normally 8 or 16 are used; it's like having more VMs.
5) Use larger I/O sizes.

CrystalDiskMark is not really that good for virtualization... it's more of an SSD or HDD test.

I don't think these Gigabyte SSDs are that fast. Smaller M.2 devices tend to have no DRAM cache, so the latency for a write can be quite high. This reduces performance for sync writes.


In reality, the ESXi workload is a mix of reads and writes.... it's better to focus on that bandwidth number.
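If you want to experiment with points 3 to 5, a rough sketch of a benchmark run from a Linux VM on the datastore could look like the following; the directory path and job parameters are only illustrative, not a tuned benchmark:

Code:
# Sync-write test with 16 threads, queue depth 32 and 128k blocks.
# /mnt/datastore is a placeholder for a directory on the NFS-backed virtual disk.
fio --name=syncwrite --directory=/mnt/datastore \
    --rw=write --bs=128k --size=2g \
    --numjobs=16 --iodepth=32 --ioengine=libaio \
    --direct=1 --sync=1 --group_reporting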
 

saveZFS

Explorer
Joined
Jan 6, 2022
Messages
87
There are several ways to get more bandwidth:

1) Turn off sync (not recommended if you want reliability) or add a SLOG.
2) Add more clients... if you have many machines, the aggregate bandwidth will be higher.
3) Increase the number of outstanding I/Os, often called queue depth. QD = 32 is as high as needed.
4) Increase the number of threads... you only show 1. Normally 8 or 16 are used; it's like having more VMs.
5) Use larger I/O sizes.
Thank you for your detailed explanation.
Edit: Sorry, I was in the wrong thread! I thought I was in this thread: https://www.truenas.com/community/threads/which-m2-pcie-2280-ssd-for-fast-sync-writes.100457/page-3
 

saveZFS

Explorer
Joined
Jan 6, 2022
Messages
87
Increase the number of threads... you only show 1. Normally 8 or 16 are used; it's like having more VMs.
I tested my new Intel P4510 SSD first with one thread and now with 16 threads. The results are much better with 16 threads.
But I don't understand why more threads increase performance when the benchmark is running on the same single VM.
Could anyone explain that to me, please?
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
I tested my new Intel P4510 SSD first with one thread and now with 16 threads. The results are much better with 16 threads.
But I don't understand why more threads increase performance when the benchmark is running on the same single VM.
Could anyone explain that to me, please?

Each thread is relatively independent. Think of it like a checkout queue in a supermarket.

The performance issue you are having is mostly due to latency, i.e. how quickly the NAS responds to each write request. Until it gets an acknowledgement, the thread can't issue another write request. High latency slows down the thread and restricts bandwidth.

If you have more threads, you measure the sum of the bandwidth across all of them. Latency becomes less of an issue, and the actual bandwidth of the SSDs becomes the limiting factor.
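As a back-of-the-envelope illustration (the numbers are invented): if every 128 KiB sync write takes about 1 ms to be acknowledged, a single thread can complete at most around 1,000 writes per second, i.e. roughly 128 MB/s, no matter how fast the SSD's raw sequential rate is. Sixteen threads issuing requests in parallel can, in the best case, approach 16x that before the device itself becomes the bottleneck.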

In a virtualization context, each VM is like a different thread. If you have only one VM, you will use less bandwidth in total, but that VM will perform well.

Similarly, to get more bandwidth, the supermarket needs more checkout staff. It's hard to make one checkout queue go faster.
 