VMware NFS performance bad - alternatives?

Kosta

Contributor
Joined
May 9, 2013
Messages
106
Hello,

OK, a little history first (note: home-setup):
A long time ago I had only a single NAS. Then I got my 4U server, put a lot of disks in it, installed a virtual TrueNAS Core on ESXi, and connected all the disks to TrueNAS via an HBA.
Now I am trying to find the best way to access the data on these disks.

At first I started with simple SMB and NFS shares. While that worked, I had permission issues when some tools copied data from my external NAS (Synology) to TrueNAS. Besides, I like LUNs better.
So the idea was born to use iSCSI and prepare a target for a single Windows server. I knew I could connect either ESXi or Windows directly to it.
First I tried ESXi: I put a VMDK on the datastore and copied a couple of TB worth of data onto it. Fair performance, and it worked well.
But there is one issue: if I reboot the server manually and then let the VMs auto-start, I have to rescan the storage adapters before my Windows file server can even start; otherwise the ESXi datastore that is backed by the TrueNAS iSCSI target remains disconnected. Only after a manual rescan, once TrueNAS has booted, do I get my datastore back and the file server starts.
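For reference, the manual rescan step can be done from the ESXi shell roughly like this (a sketch of what I do; exact command availability depends on the ESXi version):

Code:
# Rescan all storage adapters so the iSCSI-backed LUN is rediscovered
esxcli storage core adapter rescan --all

# Rescan for VMFS volumes so the datastore mounts again
esxcli storage filesystem rescan

# Check that the datastore is back before powering on the file server VM
esxcli storage filesystem list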

Then I had the idea to create a second datastore on TrueNAS, this time over NFS, and migrated the disk from iSCSI to NFS.
Now I have terrible performance: writes at about 20-30MB/s, reads only around 90MB/s.

Now I strongly believe that the only viable solution is to connect the Windows server directly to the iSCSI target, bypassing ESXi entirely.
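If I go that route, the direct connection from Windows would look roughly like this in PowerShell (a sketch; the portal address is a placeholder and it assumes the built-in software initiator):

Code:
# Register the TrueNAS box as an iSCSI target portal (IP is a placeholder)
New-IscsiTargetPortal -TargetPortalAddress "192.168.1.50"

# List the targets TrueNAS exposes and connect to them persistently
Get-IscsiTarget | Connect-IscsiTarget -IsPersistent $true

# The LUN then shows up as a raw disk to bring online and format in Disk Management
Get-Disk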

Or do you have any other ideas?

Thanks
Kosta
 
Joined
Dec 29, 2014
Messages
1,135
I use NFS to share a datastore with ESXi, and I can get 4-5Gb/s writes and 14-16Gb/s reads. The key with NFS is RAM, an SLOG, and pool construction. I have 256G of RAM, an Intel Optane NVMe card for an SLOG, and 8 vdevs of 2-drive mirrors. You can see all the specs in my signature. Please include your hardware configuration, software version, and how your pool(s) are constructed (output of zpool status -v).
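To see whether sync writes (the usual suspect with NFS datastores on ESXi) are the bottleneck on your side, something along these lines should help; the pool/dataset name here is just an example, and zilstat may or may not be present on your build:

Code:
# Does the dataset backing the NFS export force or honor sync writes?
zfs get sync tank/esxi_nfs

# Is there a dedicated log (SLOG) vdev on the pool?
zpool status tank

# Watch ZIL traffic while writing over NFS; heavy ZIL activity with no SLOG
# on slow disks lines up with 20-30MB/s writes
zilstat 5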
 

Kosta

Contributor
Joined
May 9, 2013
Messages
106
I don't have any doubt that with better hardware comes better performance.
I am merely wondering why NFS performance is about 1/10th of iSCSI, at least here.
I don't have your kind of hardware: my TrueNAS has 16GB of RAM and is a virtual machine on a single Supermicro server with an Intel 4110 and a total of 128GB of RAM. My pool consists of 10 2TB 5400rpm disks in 2 vdevs of 5 disks each, all hanging off two HBAs.
I did some tests with newly created LUNs and NFS shares, and each time saw the same picture. The LUN did, however, perform a little better, with more stable throughput, when I selected a larger block size (64k or 128k).

Code:
pool: Int_Data
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:03 with 0 errors on Sun Oct 15 00:00:03 2023
config:

        NAME                                          STATE     READ WRITE CKSUM
        Int_Data                                      ONLINE       0     0     0
          gptid/bee0c8a1-f598-11ec-b7ec-000c29fc4700  ONLINE       0     0     0

errors: No known data errors

  pool: SVDEV01
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 06:35:16 with 0 errors on Mon Oct  9 08:35:23 2023
config:

        NAME                                          STATE     READ WRITE CKSUM
        SVDEV01                                       DEGRADED     0     0     0
          gptid/67662eea-cb9a-49dd-890a-d6dc8e6ed5dc  DEGRADED     0     0     0  too many errors

errors: No known data errors

  pool: VDEV01
 state: ONLINE
  scan: scrub repaired 0B in 02:06:41 with 0 errors on Mon Oct  2 04:06:41 2023
config:

        NAME                                            STATE     READ WRITE CKSUM
        VDEV01                                          ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/4f1c74f0-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0
            gptid/4f0789f1-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0
            gptid/4f11ba87-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0
            gptid/4eb0d0f1-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0
            gptid/4f4adde3-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/4f420163-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0
            gptid/4f54fc98-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0
            gptid/4ef21e6c-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0
            gptid/4f37797e-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0
            gptid/4f5c3053-2a1f-11ee-a4ab-005056893d5c  ONLINE       0     0     0

errors: No known data errors

  pool: boot-pool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:00:19 with 0 errors on Tue Oct 17 03:45:19 2023
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          da0p2     ONLINE       0     0     0

errors: No known data errors
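For completeness, this is the kind of quick local test I can run on TrueNAS to separate pool speed from protocol overhead (a sketch; the scratch dataset name is made up, compression is turned off because zeros compress, and the read-back can be partly served from ARC):

Code:
# Scratch dataset with compression off so zero-filled writes aren't skewed
zfs create -o compression=off VDEV01/bench

# Sequential write, then read back (8GiB)
dd if=/dev/zero of=/mnt/VDEV01/bench/testfile bs=1M count=8192
dd if=/mnt/VDEV01/bench/testfile of=/dev/null bs=1M

# Clean up
zfs destroy -r VDEV01/bench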
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
I am merely wondering why NFS performance is about 1/10th of iSCSI, at least here.

You have lots of learning to do... start with


My pool consists of 10 2TB 5400rpm disks in 2 vdevs of 5 disks each.

And then move on to


I did some tests with newly created LUNs and NFS shares, and each time saw the same picture.

before you move on to


which is my way of saying you're doing it all wrong and it isn't shocking that this isn't working well for you.
 

Kosta

Contributor
Joined
May 9, 2013
Messages
106
Thank you. After many months I tend to forget things; I guess the years are catching up with me. I set up the current system some time ago. I believe I had given TrueNAS more RAM back then, but took it away again since I didn't upgrade the server to 256GB.

So, if I understand it correctly, the best course of action in my case (lots of large media files: videos, photos and music, plus some documents and not much else, basically a big archive) is to keep the current 5+5 layout (as I don't want to sacrifice too much space on parity and still want the option to expand to 15 or 20 disks), use direct NFS shares (no iSCSI), and a large record size (currently running with 128k). With this I am getting reasonable write performance of around 300MB/s. The VMs (VMDKs) run only from the built-in NVMe drives in the server.
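For the record, checking and setting the record size on the share dataset is roughly this (the dataset name is a placeholder; the new value only applies to files written after the change):

Code:
# Current record size of the dataset behind the NFS share
zfs get recordsize VDEV01/media

# Keep it large for big sequential media files (existing files are unaffected)
zfs set recordsize=128K VDEV01/media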
 