Build/config sanity check, and question about limiting ARC.

MGYVR (Dabbler) · Joined Mar 14, 2021 · Messages: 16

Background/build info:

Dell R730xd (12x 3.5" front bays, 2x 2.5" rear bays)
-Dual Xeon 12-core CPUs (w/ HT), 48 logical cores
-64GB ECC RAM
Boot pool
-2x 120GB SATA SSDs mirrored, connected to HBA330 via rear flex-bays
-System dataset stored on this pool
Fast pool
-2x 1TB NVMe SSDs mirrored, connected to a 4x M.2 NVMe PCIe x16 expansion card, bifurcated x4/x4/x4/x4
-Syncthing plugin/jail with access to the bulk pool.
-Unifi Controller plugin/jail.
-2-3x Windows VMs, one always on, and 1-2 more that don't need to be running all the time (probably 4GB RAM for each VM)
Bulk pool
-4x Seagate Exos 16TB SAS drives in a single RAIDZ2 vdev, connected to the HBA330 via the front bays
-1x 256GB NVMe SSD as cache drive (from the PCIe expansion card detailed above)
-Basically just a giant SMB share

Essentially this box will host a large SMB share (on the bulk pool) and 2-3 Windows VMs (on the fast pool). The SMB share will not get a ton of activity: only a handful of users working with it, maybe copying in/out less than 10GB/day total. One of the VMs will access the SMB share and run PDF search indexing on the relevant files. I will also be running Syncthing and a Unifi Controller as plugins/jails on the fast pool; Syncthing will have access to the large SMB share for syncing off-site. The focus of this build is maintaining responsiveness for the Windows machines accessing/reading/writing the SMB share. The server will have 4x 1Gbps connections LAG'd into a Unifi switch with plenty of switching capacity to make use of it, and I expect it will rarely have more than 4 users reading/writing data to the SMB share concurrently. We built this server with the expectation that we will eventually upgrade the network to give it a faster connection, but that upgrade isn't happening immediately.
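(Side note for anyone replicating the LAG setup: once the link aggregation is configured in the TrueNAS UI, you can sanity-check from the shell that LACP actually negotiated on all four member ports; lagg0 below is just the default interface name, yours may differ. Look for "laggproto lacp" and ACTIVE/COLLECTING/DISTRIBUTING flags on each laggport line.)

Code:
ifconfig lagg0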

I have basically all of this set up and working already, but after extensive testing and reading a ton of documentation I still have a few questions:

1. Should I enable write caching on the 16TB drives in the system BIOS? It defaulted to off when I first installed them, and I have read conflicting reports about the use of write caching with TrueNAS.

2a. I am a little confused about RAM use reporting in TrueNAS. The dashboard tile says I have ~10GB free, but when I go to the VMs screen, the text at the top of that view says I only have 0.10 bytes free, and it won't let me boot a VM that is already configured. What's up with that?
2b. One or two of the VMs won't need to be launched at boot, but I need to be able to launch them on demand periodically. I know the ZFS paradigm that free RAM is wasted RAM, but I'm willing to "waste" ~12GB of RAM for this. I am aware of the tunable to limit the ARC size; would that be the best way to solve this dilemma, and any recommendations on size given the system specs (looking at 4GB per VM)? Does this system need more RAM in general?

3. I have one additional 256GB NVMe SSD that I'm not presently using for anything. Any suggestions? The two 256GB SSDs were actually acquired due to a purchasing mistake, but the expense was small enough that they don't need to be returned if a use for them can be found.

Thanks for any responses to my questions, and any input toward a general sanity check on this build!
 

HoneyBadger (actually does care) · Administrator · Moderator · iXsystems · Joined Feb 6, 2014 · Messages: 5,112

1. Yes, write cache on the drives is safe to enable, and ZFS may have enabled it by itself when the drives were imported into a pool. ZFS controls write barriers and explicitly sends flush commands to the drives, so it can handle a drive-level write cache.
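If you want to verify what the drives themselves report, the FreeBSD shell on CORE can read the SCSI Caching mode page directly. A quick sketch; da0 below is just a placeholder for one of the Exos drives (camcontrol devlist will show the real names):

Code:
# show the Caching mode page; WCE: 1 means the drive's volatile write cache is enabled
camcontrol modepage da0 -m 8
# interactively edit the page (e.g. flip WCE) if you want to change it
camcontrol modepage da0 -m 8 -e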

2a. I'll have to defer to someone else here, as I've never used bhyve virtualization under TrueNAS; I've always preferred external hypervisors.

2b. Limiting ARC max is the most convenient way to do it. Yes, you could try to remember to tune for the reduced RAM whenever you power the VMs on, but if you're willing to sacrifice the 12GB all the time to guarantee that RAM is free for on-demand VM use, then setting it up that way is good. Re: extra RAM, not strictly needed, but it won't hurt anything.
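For the mechanics: on CORE the cap is the vfs.zfs.arc_max sysctl, specified in bytes. The exact figure is a judgment call; 44GiB below is only an example that leaves roughly 12GB for the VMs plus some headroom for the jails and the OS. You can test it at runtime from the shell first, then make it persistent via System -> Tunables (type sysctl):

Code:
# cap ARC at 44 GiB (44 * 1024^3 bytes); adjust the number to taste
sysctl vfs.zfs.arc_max=47244640256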

3. You could use it as a second cache drive for your SMB bulk pool. L2ARC headers are cheap to store now, so 512GB of L2ARC is entirely reasonable for a 64GB RAM machine.
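Adding it is normally done from the UI (Storage -> Pools -> Add Vdevs -> Cache), which is the safer route since the middleware handles partitioning and labels the way it expects. The shell equivalent is a one-liner; "bulk" and nvd2 below are placeholders for your actual pool and NVMe device names (nvmecontrol devlist will show the latter):

Code:
# attach the spare 256GB NVMe as an additional L2ARC device
zpool add bulk cache nvd2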

Question for you: what's driving the choice of Z2 on a 4-drive pool? You do get better redundancy, but it makes expansion more challenging (and costs you a little bit of performance, though you said that's probably not relevant here). Is it possible to go 6-wide Z2 out of the gate for better space efficiency? That way, when you end up with all 12 bays full (two 6-wide Z2 vdevs), you get 8 drives' worth of usable space instead of 6 (which is also all you'd get with mirrors).
 

MGYVR (Dabbler) · Joined Mar 14, 2021 · Messages: 16

Thanks for all the input!

Honestly, the 4x drives in the bulk pool and the two extra 256GB SSDs were both client purchasing errors (confusion between two parts-list revisions and a mixup between this and another unrelated project). The bulk pool will be more than large enough to meet the need for the time being, and because the old server is filling up scary fast, the new one just needs to be deployed ASAP. The client is already planning some significant network upgrades and will likely build another server by the end of the year, so shuffling the data/drives around to revise this pool later shouldn't be much of an issue.
 

MGYVR (Dabbler) · Joined Mar 14, 2021 · Messages: 16

Turns out that ARC using up all available RAM seems to be the result of the initial Syncthing ingest (20Mbps sustained from off-site). I thought I had paused the Syncthing ingest for several hours at one point to test and still had no available RAM for VMs, but I could be wrong, or I'm still not clear on how long ARC will hold onto that data in RAM. I would think that if Syncthing itself were holding onto RAM, it would show up in the services category on the RAM dashboard tile, but I'm not 100% sure on that either. It's still TBD whether RAM availability to boot VMs long after system start will be an issue under normal use, but I'm cautiously optimistic that it won't be. Presently at idle (with everything running, but no actual disk I/O) there is 5.5GB of RAM 'free' on the dashboard tile, but the VMs screen says there's 58GB 'available'.
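For anyone following along, one easy way to watch this from the shell is the stock ZFS counters on CORE; these are just the raw numbers behind whatever the dashboard is summarizing:

Code:
# current ARC size and its target, in bytes; watch them shrink/grow around the Syncthing ingest
sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c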

Is there a resource/guide that details specifically what the RAM use numbers on the dashboard tile, vs. the numbers in the RAM reporting view, vs. the 'Available RAM' at the top of the VMs screen each mean? I've read a ton of community posts that mention the individual metrics independently with a basic explanation, and I've read through the main user guide, but I think I need a more comparative explanation of what the values from those three sources specifically mean to really wrap my head around it...
 

MGYVR (Dabbler) · Joined Mar 14, 2021 · Messages: 16

I keep seeing and hearing about people not using the built-in TrueNAS VMs. I have some previous experience with FreeNAS, but this is my first TrueNAS CORE build, by far my largest and most complicated True/FreeNAS build, and honestly the first time I've personally tested/used Windows VMs in True/FreeNAS. I did a bunch of testing with various versions of the Fedora People/Red Hat VirtIO drivers for Windows, and as soon as I worked my way back to the virtio-win-0.1.173 drivers (storage and NIC), my test VMs were damn near perfect: snappy, fast under load, never unresponsive even under 100% CPU load and nearly maxed-out RAM consumption, no BSODs or lockups. The only little quibble I've noticed is a decent amount of UI lag when opening File Explorer, but once you're in, it's hella quick folder to folder. Same results with Win10 Pro and Windows Server 2012 R2. I was surprised and impressed given how reluctant people seem to be about the TrueNAS built-in VMs, but based on a week of varied testing it seems perfectly production-ready to me.
 

HoneyBadger (actually does care) · Administrator · Moderator · iXsystems · Joined Feb 6, 2014 · Messages: 5,112

ARC will fill itself with whatever the most recently or most frequently used data is - if ARC is empty, then it will "fill from writes", so to speak. It will hold onto that data until either more frequently/recently used data pushes it out of ARC, or the ZFS module is asked to free memory from ARC (due to underlying system memory pressure).
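If you want to see that recent-vs-frequent split in raw numbers, the ARC kstats break it out; a quick check from the shell (values in bytes):

Code:
sysctl kstat.zfs.misc.arcstats.mru_size kstat.zfs.misc.arcstats.mfu_size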

For the VM RAM reporting, I believe it is counting the 58GB "available" as including the ARC that can be shrunk under pressure (although in practice it's often better not to have ARC and bhyve competing), but again, you'll have to find someone who uses that feature a little more. bhyve is a good single-host hypervisor and perfectly stable for home use, but right now it's more beneficial for me to run a VMware cluster.
 