ESXi Storage


justgosh

Dabbler
Joined
May 23, 2013
Messages
11
I started by building a system to test various applications. My current configuration has evolved over the past few years; a few applications have stuck around for daily use, but for the most part it's a test environment.
As I've been building, I've created some issues I didn't understand at the time. I'd like to work those out and avoid anything that would be obvious to someone with more experience. I'm hosting several VMs but have been running into throughput issues despite their low utilization. I had iSCSI performance issues, so I'm currently using NFS, plus CIFS for occasional file management.

I'm currently running 5 WD Reds in Z1 which I will add a drive to and migrate to Z2. I've added SSD log and cache drives; however, I didn't use a drive with power-loss protection (PLP), and the SSDs I selected don't have the write endurance required to sustain this configuration. Finally, I'm adding a 10GbE network to free up some switch ports.

ESXi VMs
AD/DHCP/DNS
Ubuntu 16.04 Gitlab
Plex/Sonarr/DVR1
Plex/DVR2
Ubuntu 16.04 Docker app server
MacOS test 1
Windows 10 test 1
vCenter appliance
Fonality appliance

Supermicro X8SIL - Intel Xeon X3450 - KVR1333D3Q8R9S/8G 32GB ECC RAM - M1015 - 6 x 3TB WD Red - X540-T2
I also have a few Samsung 850 Pro 250GB and Sandisk Ultra 240GB that I'm not sure how to use in this setup. I have room for 2 more SATA drives on the controller.

I'm looking at adding a PLP drive and want to find the best way to proceed. I'm considering a DC P3700 or similar, but thought I would ask for suggestions before I spend any money.

edit: typo on HBA model
 
Last edited:

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
5 WD Reds in Z1 which I will add a drive to and migrate to Z2.
Skip the RAIDZ2 and just do striped mirrors. VMs end up using a fair bit of random IO and you need more vdevs to better support that IO. The drive used as a SLOG is fine. The odds of needing a PLP drive when using a UPS configured to safely shut down your VMs are slim to none. As for wear on the SSD, keep in mind it's never using more than 1GB at a time and can therefore wear-level across the entire drive. Don't quote me, but you may want to give iSCSI a closer look down the road, as VAAI support may be limited in upcoming versions of FreeNAS.
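For anyone following along, this is roughly what the two layouts look like from the command line. A minimal sketch only: the pool name and the da0-da5 device names are placeholders, and on FreeNAS you would normally build the pool through the GUI volume manager rather than calling zpool directly. The point is that random IO spreads across three mirror vdevs instead of queuing behind a single RAIDZ2 vdev.

```python
# Sketch (not FreeNAS-specific): creating a 6-disk pool as striped mirrors
# versus a single RAIDZ2 vdev. Pool name and device names are placeholders.
import subprocess

POOL = "tank"                                        # placeholder pool name
DISKS = ["da0", "da1", "da2", "da3", "da4", "da5"]   # placeholder device names

def create_striped_mirrors(pool: str, disks: list[str]) -> None:
    """Three 2-way mirror vdevs -> roughly three vdevs' worth of random IOPS."""
    cmd = ["zpool", "create", pool]
    for i in range(0, len(disks), 2):
        cmd += ["mirror", disks[i], disks[i + 1]]
    subprocess.run(cmd, check=True)

def create_raidz2(pool: str, disks: list[str]) -> None:
    """Single RAIDZ2 vdev -> roughly one vdev's worth of random IOPS."""
    subprocess.run(["zpool", "create", pool, "raidz2", *disks], check=True)

if __name__ == "__main__":
    create_striped_mirrors(POOL, DISKS)
```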
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
Also, with only 32GB of RAM you're likely better off without the L2ARC. You can use the SSD as vFlash read cache in your host and configure your VMs to use it.
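If the L2ARC does come out, the cache device can be detached from a live pool without touching the data vdevs. A hedged sketch below; the pool and device names are placeholders copied from whatever `zpool status` shows.

```python
# Sketch: detaching an L2ARC (cache) device from a live pool.
# Pool and device names are placeholders; check "zpool status" first to get
# the exact name ZFS reports for the cache device.
import subprocess

POOL = "tank"             # placeholder pool name
CACHE_DEV = "gpt/l2arc0"  # placeholder; copy the name from `zpool status`

# Cache (and log) devices can be removed without affecting pool data.
subprocess.run(["zpool", "remove", POOL, CACHE_DEV], check=True)
subprocess.run(["zpool", "status", POOL], check=True)
```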
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The drive used as a SLOG is fine. The odds of needing a PLP drive when using a UPS configured to safely shut down your VMs are slim to none.

If you're not going to do it right, then just skip the SLOG entirely. It just slows everything down without giving the guarantee that the SLOG exists to provide. Turn on async writes and call it a day, and if the NAS panics or loses power, just deal with the potential corruption. Your writes will be a lot faster just turning on async writes.
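For reference, "turn on async writes" here maps to setting sync=disabled on the dataset behind the NFS export. A minimal sketch, assuming a placeholder dataset name; the same property can be set from the FreeNAS dataset options.

```python
# Sketch: disabling synchronous write semantics on the dataset that backs the
# NFS export. "tank/vmstore" is a placeholder dataset name.
import subprocess

DATASET = "tank/vmstore"  # placeholder

# sync=disabled: ZFS acknowledges writes once they are in RAM; data reaches
# disk with the next transaction group (typically within a few seconds).
subprocess.run(["zfs", "set", "sync=disabled", DATASET], check=True)

# To go back to honoring client sync requests later:
# subprocess.run(["zfs", "set", "sync=standard", DATASET], check=True)
```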
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Also the M5014 is not usable as an HBA. Substitute in an actual HBA in IT mode.
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
If you're not going to do it right, then just skip the SLOG entirely. It just slows everything down without giving the guarantee that the SLOG exists to provide. Turn on async writes and call it a day, and if the NAS panics or loses power, just deal with the potential corruption. Your writes will be a lot faster just turning on async writes.
Frequent snapshots can be a lifesaver here.
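If it helps, here's a rough sketch of what a frequent-snapshot job could look like outside of FreeNAS's periodic snapshot tasks (which do the same thing from the GUI). The dataset name, snapshot prefix, and retention count are all placeholders.

```python
# Sketch: take a recursive snapshot of the VM dataset and prune old ones.
# Dataset, prefix, and retention are placeholders; run from cron as often as
# you like.
import subprocess
from datetime import datetime

DATASET = "tank/vmstore"   # placeholder dataset
PREFIX = "auto"            # snapshot name prefix
KEEP = 48                  # e.g. 48 snapshots at 15-minute intervals = 12 hours

def take_snapshot() -> None:
    stamp = datetime.now().strftime("%Y%m%d-%H%M")
    subprocess.run(["zfs", "snapshot", "-r", f"{DATASET}@{PREFIX}-{stamp}"],
                   check=True)

def prune_snapshots() -> None:
    # List this dataset's snapshots, oldest first.
    out = subprocess.run(
        ["zfs", "list", "-H", "-t", "snapshot", "-o", "name", "-s", "creation",
         "-d", "1", DATASET],
        check=True, capture_output=True, text=True,
    ).stdout.splitlines()
    ours = [s for s in out if f"@{PREFIX}-" in s]
    for snap in ours[:-KEEP]:          # destroy everything but the newest KEEP
        subprocess.run(["zfs", "destroy", "-r", snap], check=True)

if __name__ == "__main__":
    take_snapshot()
    prune_snapshots()
```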
 

justgosh

Dabbler
Joined
May 23, 2013
Messages
11
You can use the SSD as vFlash read cache in your host and configure your VMs to use it.
Thanks, checking it out on Yellowbricks. What happens in a power failure?

VMs end up using a fair bit of random IO and you need more vdevs to better support that IO.
Because these VMs are mostly used for testing, the I/O is pretty low at idle (https://imgur.com/a/DIocbBn). When stuff reboots or updates, it tends to take more time than I would prefer. I think it's coming down to bandwidth, and I think the vFlash suggestion will reduce IO even further, if I'm reading that correctly.
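One cheap way to confirm whether those slow reboots/updates are bandwidth-bound or IOPS-bound is to watch the pool while it happens. A sketch, with a placeholder pool name:

```python
# Sketch: watch per-vdev throughput and operations while a VM reboots or
# updates, to see whether the pool is bandwidth-bound or IOPS-bound.
# "tank" is a placeholder pool name; Ctrl-C to stop.
import subprocess

POOL = "tank"

# -v shows per-vdev numbers; the trailing "5" prints a new sample every 5 s.
subprocess.run(["zpool", "iostat", "-v", POOL, "5"])
```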

Also the M5014 is not usable as an HBA. Substitute in an actual HBA in IT mode.
Nice catch on the typo. It's an M1015. I recall cross flashing to IT mode for use as an HBA.

If you're not going to do it right, then just skip the SLOG entirely. It just slows everything down without giving the guarantee that the SLOG exists to provide. Turn on async writes and call it a day, and if the NAS panics or loses power, just deal with the potential corruption. Your writes will be a lot faster just turning on async writes.
This sounds like more of a headache over the next 5 years than spending $300 for a SLOG device unless I'm missing something.
 
Last edited:

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
What happens in a power failure?
The read cache is lost and is refilled with normal use. The VM never writes data there.
Because these VMs are mostly used for testing, the I/O is pretty low at idle
If only we could design for idle spec and still have it perform.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Nice catch on the typo. It's an M1015. I recall cross flashing to IT mode for use as an HBA.

Fine then.

This sounds like more of a headache over the next 5 years than spending $300 for a SLOG device unless I'm missing something.

https://forums.freenas.org/index.php?threads/some-insights-into-slog-zil-with-zfs-on-freenas.13633/

The thing that the ZIL is trying to fix - the ONLY thing that the ZIL is trying to fix - is the situation where your NAS panics or loses power. If - and only if - the hypervisor continues to run, and the VM goes into a holding state waiting on some incomplete disk I/O, here's the problem:

1. VM writes data to its virtual disk.
2. NAS receives that request but doesn't actually get it committed to disk.
3. NAS crashes, panics, has some other data-tastrophe, and comes back online having lost those "in-RAM" changes.
4. VM now thinks some data has been written to disk, but if it were to read those disk blocks it would find "old" data there.

This could be a problem if the write was updating file data, filesystem metadata such as a directory, etc.

Without the ZIL, FreeNAS and ZFS will handle it just fine if you gracefully shut down the NAS, even with VMs running. iSCSI or NFS will flush the data to ZFS, which will flush the data to disk. That's all as it should be. The guest VM might freak out and might crash, but that's not a solvable problem within these subsystems.

The purpose of the ZIL is to make things also work right if there's some sort of non-graceful event (crash, power loss, etc). But if there's a power loss, you probably also lose your hypervisor, so not an issue. So the ZIL is really mostly protecting from the case where the NAS crashes and reboots, losing data in-RAM as it does so.

When the NAS crashes but the hypervisor keeps running and the VM survives the event, you have a problem. You could potentially lose some of the writes to the VM disk. In some cases, that's a big fat shrug - a reboot and fsck can take care of it. But if you're writing things like database files, it could really suck.

So how often does your NAS crash? And how important are the things your VMs are writing to disk? These questions are useful in guiding whether or not you really need to worry about it. If your VMs are not too important and you've never had your NAS crash, is it worth $300 and a significant drop in write performance to add the SLOG?

Obviously I cannot answer that question for you.
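To make the contract at stake concrete from the guest's point of view, here's a small, generic sketch (the path is a placeholder): a synchronous write, or a write followed by fsync, is only supposed to return once the data is on stable storage. That's the promise a proper SLOG preserves with sync=standard, and the promise that silently goes away with sync=disabled.

```python
# Sketch of the guarantee in question, from the writer's point of view.
# The path is a placeholder for a file on the NFS-backed virtual disk.
import os

PATH = "/mnt/vm-data/journal.log"  # placeholder path

fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(fd, b"committed record\n")   # may still be sitting in RAM somewhere
os.fsync(fd)                          # "on stable storage" - honored with a
                                      # SLOG and sync=standard; silently
                                      # weakened if the NAS runs sync=disabled
                                      # and then crashes
os.close(fd)
```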
 