Need some help with an NFS performance issue

Jeremy Guo

Dabbler
Joined
Jul 28, 2023
Messages
37
Dear all

I need some help.

My NAS environment is 2x Intel Xeon 4116, 256GB ECC RAM, 6x 960GB SSDs, and 60x 16TB HDDs connected through 2x 3008 HBAs.

2x 960GB are used as read cache, a 2x 960GB mirror as the log device, and a 2x 960GB mirror as the metadata vdev, as shown in the screenshots. The TrueNAS version is TrueNAS-SCALE-22.12.3.1.

I am using the NFS service only; nothing else is running on the server.

My problem is that after sustained reading or writing of more than 10TB of data via NFS, NFS performance degrades to less than 10Mb/s or even lower, and nfsd shows high CPU usage.

After restarting the NFS service, performance returns to normal.
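For reference, this is roughly what I do when it degrades (the service name is my assumption for SCALE, so adjust if yours differs):

    # Check how much CPU the nfsd kernel threads are using
    ps -eo pid,comm,%cpu | grep '[n]fsd'

    # Restart the NFS server; after this, throughput goes back to normal
    systemctl restart nfs-server.service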

So, any ideas or suggestions on this?

Thanks in advance.
 

Attachments

  • 222.PNG (46.2 KB)
  • 1111.PNG (40.3 KB)
  • 3333.PNG (51.3 KB)

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
and 60x 16TB HDDs connected through 2x 3008 HBAs
How are these arranged in your pool?
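It would also help to post the layout as text rather than screenshots; something like this should show the full picture (replace tank with your actual pool name):

    # Shows how the disks are grouped into data, special, log, cache and spare vdevs
    zpool status -v tank

    # Per-vdev capacity and usage summary
    zpool list -v tank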

I see only a single RAIDZ2 VDEV in the screenshot.

If that's the case, it's worth mentioning that 10-12 wide is considered the limit for a RAIDZ VDEV (not 60).

Also, depending on the nature of your use of NFS (e.g. from VMware), you may be doing sync writes, which will easily overrun the IOPS capacity of your pool (which would equate to a single disk's IOPS capacity as you appear to have it set up).
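If you want to confirm whether sync writes are in play and how hard that single vdev is being hit, something along these lines would do it (pool and dataset names here are just examples):

    # Sync setting of the dataset behind the NFS share (standard, always or disabled)
    zfs get sync tank/nfs-share

    # Per-vdev read/write IOPS and bandwidth, refreshed every 5 seconds, while the load runs
    zpool iostat -v tank 5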
 

Jeremy Guo

Dabbler
Joined
Jul 28, 2023
Messages
37
How are these arranged in your pool?

I see only a single RAIDZ2 VDEV in the screenshot.

If that's the case, it's worth mentioning that 10-12 wide is considered the limit for a RAIDZ VDEV (not 60).

Also, depending on the nature of your use of NFS (e.g. from VMware), you may be doing sync writes, which will easily overrun the IOPS capacity of your pool (which would equate to a single disk's IOPS capacity as you appear to have it set up).

Hi, sretalla

thanks for your reply.

Yes, that's one VDEV with 59 HDDs and 1 spare. Unfortunately I now have data on it and can't change it. Before I set up the VDEV, I searched the internet; 10-12 was recommended, but I didn't see anything saying 60 was bad, so I put everything in one VDEV.

I am using NFS to back up data with Veeam. The weird thing is that NFS performance recovers immediately when I restart the NFS service, so it seems the problem is not related to ZFS itself. I'm not sure why nfsd is running at high CPU usage.
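When it's in the degraded state, this is roughly what I look at (standard Linux nfs-utils tools, so I'm assuming they behave the same on SCALE):

    # Server-side RPC and per-operation counters
    nfsstat -s

    # Number of nfsd threads currently running
    cat /proc/fs/nfsd/threads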

Thanks
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,700
10-12 was recommended
10-12 is the MAXIMUM recommended by these forums. The OpenZFS project mentions that no more than 16 should be used, as resilver times will become excessive when the pool is (closer to) full.

Anyway, that doesn't necessarily deal with your problem in any case as I suspect you simply lack enough IOPS to cope.

That causes NFS to have a bunch of requests outstanding that ZFS hasn't finished digesting, so NFS is just eating the problem and grinding to a halt... restarting it just starts the problem again from 0.
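You can actually watch that backlog build up if you want; something like this shows how many operations are sitting in each vdev's queues (pool name is a placeholder):

    # -q adds pending/active queue columns per vdev, refreshed every 5 seconds
    zpool iostat -q tank 5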

Without a pool redesign, you're not going to fix that.
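Just to illustrate the shape of a redesign (device and pool names are placeholders, you'd build it through the UI in practice, and it means destroying and recreating the pool): six 10-wide RAIDZ2 vdevs would give you roughly six times the IOPS of your current single vdev.

    # Illustration only: 60 disks as six 10-wide RAIDZ2 data vdevs instead of one 59-wide vdev
    zpool create tank \
      raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj \
      raidz2 sdk sdl sdm sdn sdo sdp sdq sdr sds sdt \
      raidz2 sdu sdv sdw sdx sdy sdz sdaa sdab sdac sdad \
      raidz2 sdae sdaf sdag sdah sdai sdaj sdak sdal sdam sdan \
      raidz2 sdao sdap sdaq sdar sdas sdat sdau sdav sdaw sdax \
      raidz2 sday sdaz sdba sdbb sdbc sdbd sdbe sdbf sdbg sdbh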
 

Jeremy Guo

Dabbler
Joined
Jul 28, 2023
Messages
37
Thank you very much, sretalla.
10-12 is the MAXIMUM recommended by these forums. The OpenZFS project mentions that no more than 16 should be used, as resilver times will become excessive when the pool is (closer to) full.

Anyway, that doesn't necessarily deal with your problem in any case as I suspect you simply lack enough IOPS to cope.

That causes NFS to have a bunch of requests outstanding that ZFS hasn't finished digesting, so NFS is just eating the problem and grinding to a halt... restarting it just starts the problem again from 0.

Without a pool redesign, you're not going to fix that.
 