Will Dormann
Hi folks,
I've run an experiment where I had 100 VMs all concurrently using an NFS share exported by FreeNAS, and it resulted in timeouts galore. I then realized that there's a configurable "Number of servers" setting in the NFS share options, which defaults to 4. The FreeNAS documentation indicates that this setting should not exceed the number of CPU cores on the system.
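For what it's worth, this is how I've been comparing the configured value against the core count (hw.ncpu is the core count FreeBSD reports, and vfs.nfsd.threads is the sysctl the GUI setting appears to end up in, as described further down):
# sysctl hw.ncpu
# sysctl vfs.nfsd.threads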
I'm having a hard time finding out what an optimal value for this setting is, though. A random Cisco guide says: "If you plan to have more than four people connecting to shares on FreeNAS using this protocol, you can increase the number of servers. However, it's recommended that you only have up to six users/servers at one time." This doesn't make too much sense to me, and the use of "people" as the metric makes me trust the guidance even less.
Another guide says: "A rule of thumb is to have at least 4 servers for each client plus a few extras. You set this value as a runtime flag to nfsd (you use the -n flag), use this formula:
(Number-of-clients + 1) * 4." That would give me 404 NFS servers in my case, and the FreeNAS web GUI won't allow a setting larger than 256.
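(Just to show the arithmetic behind 404, in a POSIX sh:)
# echo $(( (100 + 1) * 4 ))
404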
One more guide indicates that on Linux the default is 8, but that you could try something like 20 if you have a lot of clients. It also says that you can confirm the number of threads you currently have by running ps and counting the nfsd instances. In my testing, this doesn't seem to apply to FreeNAS/FreeBSD, though. Regardless of the "Number of servers" setting, ps shows something like:
# ps aux | grep nfs
root 57807 0.0 0.0 14064 1788 ?? Is 10:10AM 0:00.00 nfsuserd: master (nfsuserd)
root 57808 0.0 0.0 14064 1792 ?? S 10:10AM 0:00.00 nfsuserd: slave (nfsuserd)
root 57809 0.0 0.0 14064 1792 ?? S 10:10AM 0:00.00 nfsuserd: slave (nfsuserd)
root 57810 0.0 0.0 14064 1792 ?? S 10:10AM 0:00.00 nfsuserd: slave (nfsuserd)
root 57811 0.0 0.0 14064 1792 ?? S 10:10AM 0:00.00 nfsuserd: slave (nfsuserd)
root 57880 0.0 0.0 108268 31244 ?? Is 10:10AM 0:00.10 nfsd: master (nfsd)
root 57881 0.0 0.0 9916 5616 ?? I 10:10AM 0:00.10 nfsd: server (nfsd)
root 16470 0.0 0.0 16284 1884 12 S+ 11:29AM 0:00.00 grep nfs
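As far as I can tell, on FreeBSD the nfsd workers run as kernel-visible threads inside that single "nfsd: server" process rather than as separate processes, which would explain why counting lines in ps aux doesn't work the way the Linux guide describes. Listing the threads of the nfsd processes does seem to track the configured value on my box (procstat is just how I checked; there may be a better way):
# procstat -t $(pgrep nfsd)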
At first, I thought that perhaps the FreeNAS web GUI wasn't actually setting the value properly. However, running sysctl vfs.nfsd.threads reflects the value specified in the "Number of servers" setting in the web GUI. vfs.nfsd.minthreads and vfs.nfsd.maxthreads are also set to the same value.
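For completeness, this is the check I ran after changing the GUI setting (sysctl accepts multiple names at once):
# sysctl vfs.nfsd.threads vfs.nfsd.minthreads vfs.nfsd.maxthreads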
The questions I have at this point are:
- What is the optimal value for "Number of servers"?
- If the answer is something like "equal to the number of CPU cores," could FreeNAS perhaps be smarter about its default?
- What is the harm in setting "Number of servers" too high? The FreeNAS documentation says not to exceed the number of CPU cores, but it's not clear what the consequences are. Lower performance? Wasted RAM? Something more catastrophic?
- Is having vfs.nfsd.minthreads == vfs.nfsd.maxthreads == vfs.nfsd.threads perhaps non-optimal in some way?