Correct drive options for ESXi + TrueNAS

Joined: Dec 27, 2013 · Messages: 7
Hey there!

I'm building a new server running TrueNAS on ESXi (amongst other VMs).
(The VMs are mainly for home use - e.g. a home SQL Server for Kodi, some VMs for a personal cloud, surveillance/home equipment, and experimenting.)

After hours of research and reading in this forum, I just wanted to make sure I'm on the right track with the actual hardware/drive configuration.

So I'm going to run a Xeon Silver on a Supermicro X11SPA-T board.

  1. The ESXi host itself will boot from a USB flash drive plugged directly into the mainboard's USB port
  2. TrueNAS will be booted first by ESXi and will be stored on TWO mirrored, dedicated 32GB SATADOM drives (so I avoid the chicken-and-egg problem)
  3. TrueNAS then mounts 2 Pools:
    1. a HDD-Pool for Media Storage and Backup
    2. a SSD-NVME-Pool on which other VMs can be stored
The HDD pool (on a SAS expander backplane) will hang off a dedicated HBA card, passed through to TrueNAS.

The SSD pool (NVMe SSDs sitting directly in the mainboard's M.2 slots) will also be presented to TrueNAS via passthrough.

On the SSD pool, I'll create a zvol which is then presented back to ESXi via iSCSI.
All other VMs will be stored on that iSCSI drive.

Any errors so far?

If so, please tell me - if not, I have some minor questions:

1. Do I have to use the whole SSD pool as a zvol for iSCSI, or can I use only a (bigger) part of it and reserve a smaller part as a "normal" ZFS dataset (e.g. for some jails)?

2. Do I need (or is it any good) to create a dedicated zvol for each VM (e.g. for backup/snapshot purposes), or should I just use the whole space and let ESXi split it between the VMs?

3. Should I go with 2x2 SSDs (mirrored) in the 4 NVMe slots, or would it be better to go for 3 SSDs in RAIDZ1 plus another smaller (Optane) SSD in the 4th NVMe slot as a ZIL/SLOG (as I understand, iSCSI benefits from a SLOG)?
I've also read that a SLOG should be mirrored - so should I just go for 1 mirrored SSD for the VMs and 1 mirrored Optane for the SLOG? And if I ever need more space for the VMs, I just get an M.2 PCIe extender card?

Thanks for your help!
 

HoneyBadger ("actually does care") - Administrator, Moderator, iXsystems · Joined: Feb 6, 2014 · Messages: 5,112
Welcome.

TrueNAS will be booted first by ESXi and will be stored on TWO mirrored, dedicated 32GB SATADOM drives (so I avoid the chicken-and-egg problem)

You still have a small matter of chicken-and-egg here: in order to define a VM and all of its passthrough devices, you'll need a place to save that configuration file (.VMX), and that can't/shouldn't be the USB stick you're using for boot. You could create two datastores out of those SATADOMs and provision a pair of VMDKs on them - but this could just as easily be a pair of smaller, regular SSDs.

1. Do I have to use the whole SSD pool as a zvol for iSCSI, or can I use only a (bigger) part of it and reserve a smaller part as a "normal" ZFS dataset (e.g. for some jails)?
You can (and I'd argue "should") provision a smaller piece of the pool as an iSCSI zvol, preferably a sparse one, and then see what space remains. Note that random small-block I/O such as virtual machine traffic runs best when given lots of free space in the pool, but this is mitigated quite a bit when using SSDs.
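A minimal sketch of that approach - the pool name, zvol name, and sizes here are placeholders, not your actual layout:

```shell
# Sketch only - "ssdpool" and the sizes are hypothetical.
# -s creates a sparse (thin-provisioned) zvol; -V sets its logical size.
zfs create -s -V 1T ssdpool/vm-iscsi

# The remaining pool space stays usable as normal ZFS datasets, e.g. for jails:
zfs create ssdpool/jails

# Compare the provisioned size against what is actually consumed:
zfs get volsize,used ssdpool/vm-iscsi
```

Because the zvol is sparse, it only consumes pool space as blocks are actually written, so the "free space at the pool level" point below still applies.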

2. Do I need (or is it any good) to create a dedicated zvol for each VM (e.g. for backup/snapshot purposes), or should I just use the whole space and let ESXi split it between the VMs?
A single zvol per VM is KVM-style behavior. For ESXi it's more common to make one datastore and put multiple VMs on it. Don't make it the full size of your pool, though. Start with a reasonable size; you can expand the zvol from there afterwards if desired.
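Growing the zvol later is a one-liner (names hypothetical); ESXi can then be told to use the new capacity:

```shell
# Hypothetical names - increases the logical size of the iSCSI zvol.
zfs set volsize=1500G ssdpool/vm-iscsi
# Afterwards, rescan the storage adapter in ESXi and grow the VMFS
# datastore from the vSphere UI so it spans the enlarged LUN.
```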

3. Should I go with 2x2 SSDs (mirrored) in the 4 NVMe slots, or would it be better to go for 3 SSDs in RAIDZ1 plus another smaller (Optane) SSD in the 4th NVMe slot as a ZIL/SLOG (as I understand, iSCSI benefits from a SLOG)?
I've also read that a SLOG should be mirrored - so should I just go for 1 mirrored SSD for the VMs and 1 mirrored Optane for the SLOG? And if I ever need more space for the VMs, I just get an M.2 PCIe extender card?
The SLOG question is really about "can I afford to lose unwritten data?" Check the resource here from @jgreco on why SLOG/sync writes and ESXi have a fairly tight relationship:

(Apologies if I was brief or omitted anything, I'm phone-bound at the moment.)
 
Joined: Dec 27, 2013 · Messages: 7
Thanks a lot for your prompt reply!

You still have a small matter of chicken-and-egg here: in order to define a VM and all of its passthrough devices, you'll need a place to save that configuration file (.VMX), and that can't/shouldn't be the USB stick you're using for boot. You could create two datastores out of those SATADOMs and provision a pair of VMDKs on them - but this could just as easily be a pair of smaller, regular SSDs.

Thanks - I was not aware of that additional "layer".
As I want the setup as "clean" as possible: is there any disadvantage in creating 2 datastores on the SATADOMs? I don't want to get another single-purpose disk for that if possible. How much space should I allocate?
And since you said "a pair of": is there any way to mirror that "VMX store" in ESXi, or would I need a hardware RAID for that?

You can (and I'd argue "should") provision a smaller piece of the pool as an iSCSI zvol, preferably a sparse one, and then see what space remains. Note that random small-block I/O such as virtual machine traffic runs best when given lots of free space in the pool, but this is mitigated quite a bit when using SSDs.

I don't fully get that point: if iSCSI profits from more free space, shouldn't I allocate as much space as possible to it, so there is a good amount of free space? Or does unallocated space in the same pool count as "free space"?

The SLOG question is really about "can I afford to lose unwritten data?" Check the resource here from @jgreco on why SLOG/sync writes and ESXi have a fairly tight relationship:

So I understood from the linked posting that, especially for VMs and iSCSI with sync=always, I greatly benefit from a SLOG device even on an SSD pool. But then, I suppose, that means I should also mirror that device for fail-safety? So I should go for a mirrored SSD pair in 2x M.2 NVMe and another 2 smaller Optanes (like 32 GB) in the other 2 M.2 slots?
I'll lose 50% of capacity (which I could live with - it's going to be 2TB SSDs anyway) compared to 3x SSD in RAIDZ1, but gain a lot of fail-safety and speed - right?
 

HoneyBadger
Thanks - I was not aware of that additional "layer".
As I want the setup as "clean" as possible: is there any disadvantage in creating 2 datastores on the SATADOMs? I don't want to get another single-purpose disk for that if possible. How much space should I allocate?
And since you said "a pair of": is there any way to mirror that "VMX store" in ESXi, or would I need a hardware RAID for that?
I would actually suggest that you get a pair of SATA SSDs instead of the SATADOMs - a used pair of 80GB Intel DC S3500s is likely cheaper than a pair of new 32GB SATADOMs.

For the mirroring, you'd create a datastore on each, then create two 16GB VMDKs (one on each datastore) and then during the TrueNAS installation process, simply select both virtual disks. ZFS will handle the mirroring of the boot devices there.
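Once installed, the mirror can be sanity-checked from the TrueNAS shell (device names vary, especially inside a VM):

```shell
# "boot-pool" is the default boot pool name on recent TrueNAS versions;
# the underlying device names for your two VMDKs may differ.
zpool status boot-pool
# A healthy install across both virtual disks shows a "mirror-0" vdev
# containing a partition from each disk, both ONLINE.
```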

And yes, you'd need a hardware RAID (on a supported controller - softRAID doesn't work well with ESXi) to do the mirroring at the VMFS level.

I don't fully get that point: if iSCSI profits from more free space, shouldn't I allocate as much space as possible to it, so there is a good amount of free space? Or does unallocated space in the same pool count as "free space"?
The "free space" for performance needs to exist at the pool level. If you provision the entire pool (or a large chunk of it) for the iSCSI zvol, then you won't have room for other data (e.g. jails) in the same pool. Generally, cutting the space into multiple LUNs also allows for better per-device queuing under heavy workloads. Start with a LUN that's perhaps 50% of your total pool space, and see what the actual used space looks like once compression does its magic.
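To watch how that plays out, these two read-only queries (pool and zvol names hypothetical) show pool-level free space and the effect of compression on the LUN:

```shell
# Pool-level view: free space and fragmentation are what matter for
# sustained small-block performance.
zpool list -o name,size,allocated,free,fragmentation ssdpool

# Per-zvol view: logical size vs. space actually consumed after compression.
zfs get volsize,used,compressratio ssdpool/vm-iscsi
```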

So I understood from the linked posting that, especially for VMs and iSCSI with sync=always, I greatly benefit from a SLOG device even on an SSD pool. But then, I suppose, that means I should also mirror that device for fail-safety? So I should go for a mirrored SSD pair in 2x M.2 NVMe and another 2 smaller Optanes (like 32 GB) in the other 2 M.2 slots?
I'll lose 50% of capacity (which I could live with - it's going to be 2TB SSDs anyway) compared to 3x SSD in RAIDZ1, but gain a lot of fail-safety and speed - right?
Mirrored SLOG is required to close the gap of "what happens if your SLOG dies at the moment of a power failure?" For most home users, a single SLOG is sufficient protection to cover most cases, and they're willing to run the risk of that edge case forcing a roll back to a snapshot/backup copy. Businesses will want to err on the side of caution. In your shoes, I'd look at a single SLOG plus periodic snapshots of the most important data on the SSD pool, replicated to the HDD pool.
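A minimal sketch of that snapshot-and-replicate idea (pool, dataset, and snapshot names are invented; the Replication Tasks in the TrueNAS UI wrap the same mechanism with scheduling and retention):

```shell
# Take a named snapshot of the VM zvol (all names are placeholders).
zfs snapshot ssdpool/vm-iscsi@daily-2024-01-01

# Send it to the HDD pool incrementally, relative to the previous snapshot,
# so only the changed blocks travel:
zfs send -i ssdpool/vm-iscsi@daily-2023-12-31 ssdpool/vm-iscsi@daily-2024-01-01 \
  | zfs recv hddpool/backup/vm-iscsi
```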

For the pool itself, though, mirrors are strongly preferred for iSCSI, even on SSDs, due to the less-than-optimal space efficiency of small records on RAIDZ. If you're keeping a narrow vdev width (2+1), you'll probably be able to get away with it, though. I've been meaning to test this out at scale to show the impact of stripe width on space efficiency, but life keeps getting in the way.
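Back-of-the-envelope numbers for the space-efficiency point, under stated assumptions (4 KiB sectors, i.e. ashift=12; a 3-wide RAIDZ1 with 2 data + 1 parity sector per stripe row; RAIDZ padding allocations to multiples of parity + 1 = 2 sectors):

```shell
# Assumptions as above; a 16 KiB block on 3-wide RAIDZ1 vs. a 2-way mirror.
block_kib=16
data=$(( block_kib / 4 ))                 # data sectors per block: 4
rows=$(( (data + 1) / 2 ))                # stripe rows (2 data sectors/row): 2
parity=$rows                              # 1 parity sector per row: 2
alloc=$(( (data + parity + 1) / 2 * 2 ))  # pad up to a multiple of 2: 6
mirror=$(( data * 2 ))                    # same block mirrored: 8
echo "RAIDZ1 uses $alloc sectors, mirror uses $mirror"
# → RAIDZ1 uses 6 sectors, mirror uses 8
```

At this narrow width the overhead is tolerable; the padding penalty grows with wider RAIDZ vdevs and smaller volblocksize, which is why mirrors are the usual recommendation for iSCSI block storage.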
 

Patrick M. Hausen (Hall of Famer) · Joined: Nov 25, 2013 · Messages: 7,776
For the mirroring, you'd create a datastore on each, then create two 16GB VMDKs (one on each datastore) and then during the TrueNAS installation process, simply select both virtual disks. ZFS will handle the mirroring of the boot devices there.
I hadn't thought of that! Thanks - clever idea.
 