SOLVED High-End virtualized TrueNAS on VMware

blanchet

Guru
Joined
Apr 17, 2018
Messages
516
Hi,

I would like to build a high-end, single-node hyperconverged appliance with VMware and TrueNAS Core.
This appliance will be installed in a secondary datacenter so that I can host about 100 VMs if the main datacenter is destroyed.
It will be used only in case of disaster, so I can live without high availability to cut the cost.

This build will cost a lot of money, so I would really appreciate any comments on the hardware to avoid buying the wrong devices.

Planned Software:
  • VMware Essentials 7 (cheapest paid version, needed to enable TrueNAS VMware snapshots)
  • TrueNAS Core 12.0-U3
  • NFSv3 datastore (slower but easier to manage than iSCSI).
  • Replication with ZFS or Veeam Backup & Replication (not sure yet whether I will replicate the VMs or the Veeam backup files; see the replication sketch below).
Planned Hardware:
  • SuperMicro Chassis 2U with 24 + 2 disk bays.
  • SuperMicro X11DPi-NT + 2x Intel Xeon Gold 6226R (2x16c@2.9GHz) + 1024 GB DDR4 ECC
  • For VMware
    • Intel SSD NVMe 7600p, 512 GB
    • Network: onboard Intel X722 (Dual port 10G-BaseT)
  • For TrueNAS
    • HBA: Broadcom HBA 9300-8i for TrueNAS (PCIe Passthrough)
    • Data disks: 26 x 4TB SATA SSDs (Intel D3-S4610)
The total cost is slightly under 50 k€.
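If I go the ZFS route, the mechanism underneath the TrueNAS replication tasks is just recursive snapshots plus zfs send/receive. A minimal sketch, assuming a hypothetical tank/vmware dataset and a DR host named dr-truenas (all names are illustrative only):

    # On the primary site: take a recursive snapshot of the datastore dataset
    zfs snapshot -r tank/vmware@rep-20210601
    # Initial full replication to the disaster recovery node
    zfs send -R tank/vmware@rep-20210601 | ssh dr-truenas zfs receive -F tank/vmware
    # Subsequent runs only transfer the delta between two snapshots
    zfs snapshot -r tank/vmware@rep-20210602
    zfs send -R -I tank/vmware@rep-20210601 tank/vmware@rep-20210602 | ssh dr-truenas zfs receive -F tank/vmware

In practice a replication schedule (or Veeam) would drive this, but the commands above are what actually moves the data.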

If the virtualized TrueNAS is not stable enough, I will repurpose the HCI node as a compute-only node and put the SSDs in another SuperMicro chassis to build a classic bare-metal TrueNAS server + VMware (so the solution will take 4U instead of 2U).

Regards,
---
 

kspare

Guru
Joined
Feb 19, 2015
Messages
508
1. Use FreeNAS and not TrueNAS. We do the exact same thing you are looking to do, and I cannot get the performance out of TN that I can out of FN.
2. RAM is awesome. I have been running 256 GB in all my boxes but am upgrading to 1 TB, so I can't comment on the improvements yet.
3. Look at Chelsio NICs instead of Intel.
4. I can't comment on all-SSD with NFS, as I have not gone down this path. My new build is running 24x 4 TB IronWolf drives, 12x 800 GB S3610 for L2ARC, and 2 PCIe SLOG drives.

You are on the right path! Don't think NFS is slower than iSCSI; it's not, and it has a lot of advantages over iSCSI. But that's another topic.
 

blanchet

Guru
Joined
Apr 17, 2018
Messages
516
1. Use FreeNAS and not TrueNAS. We do the exact same thing you are looking to do, and I cannot get the performance out of TN that I can out of FN.
2. RAM is awesome. I have been running 256 GB in all my boxes but am upgrading to 1 TB, so I can't comment on the improvements yet.
3. Look at Chelsio NICs instead of Intel.
4. I can't comment on all-SSD with NFS, as I have not gone down this path. My new build is running 24x 4 TB IronWolf drives, 12x 800 GB S3610 for L2ARC, and 2 PCIe SLOG drives.

You are on the right path! Don't think NFS is slower than iSCSI; it's not, and it has a lot of advantages over iSCSI. But that's another topic.

Thank you for the comments. I can give more details:
  1. I already use TrueNAS Core on several servers (with spinning disks) and I had not realized that there might be a performance difference with FreeNAS. Nevertheless, FreeNAS is no longer developed, so basically I can only use TrueNAS Core or TrueNAS SCALE (which is still in beta).
  2. On the main VMware cluster, I already use 600 GB of RAM, so I have picked 1 TB of RAM (16x 64 GB) to have room for expansion.
  3. You are right, the Chelsio NICs worked better than the Intel NICs on FreeNAS 11, especially for VNET, but since TrueNAS 12 I have encountered no issues with the Intel X710 NIC, so I would say that they both work very well now. In any case, my SuperMicro reseller only offers Intel NICs.
  4. I do not know whether I need an Optane SLOG for an all-flash array.

@kspare
Are you using a virtualized FreeNAS or a bare-metal FreeNAS on your new build?
 

kspare

Guru
Joined
Feb 19, 2015
Messages
508
I'm running bare-metal FreeNAS.

I would suggest doing some of your own testing of FN vs TN if you are expecting high performance.
 

blanchet

Guru
Joined
Apr 17, 2018
Messages
516
In case it helps someone:

By default, VMware configures a FreeBSD VM to boot with legacy BIOS, but KB 2142307 says that UEFI is preferable.
To use more than 3.75 GB of total BAR allocation within a virtual machine, add a line to the virtual machine's .vmx file to set the virtual machine firmware to UEFI.
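For reference, the relevant .vmx entries look roughly like this (the firmware line switches the VM to UEFI; the two MMIO lines only matter for passthrough devices with large BARs, and the size value here is just an example):

    firmware = "efi"
    pciPassthru.use64bitMMIO = "TRUE"
    pciPassthru.64bitMMIOSizeGB = "64"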

This second article about GPU passthrough says the same thing.
 

olddog9

Dabbler
Joined
Apr 23, 2021
Messages
28
blanchet,

Thanks for the VMware tip about KB 2142307! It is super handy to know about UEFI, MMIO, and the limits of each ESXi version.

olddog9
 

blanchet

Guru
Joined
Apr 17, 2018
Messages
516
For information:

I have finally received the hardware.
When booting the virtualized TrueNAS Core in BIOS mode, the system is unstable (the VM hangs after a few days),
but when booting the virtualized TrueNAS Core in UEFI mode, as KB 2142307 suggests, the system is very stable (I already have 1 month of uptime without issues).

For the ZFS layout, I have tested a stripe of mirrors vs a stripe of 6-wide RAIDZ2 vdevs with fio.
I am puzzled because the two layouts perform very similarly; I had hoped that the stripe of mirrors would be noticeably faster.
Because IOPS are not the bottleneck, I have chosen the stripe of 6-wide RAIDZ2 vdevs.
(The two SSDs in the rear bays are connected to the HBA, so I can use them as hot spares.)
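For anyone who wants to run a similar comparison, here is a rough sketch of the two layouts and a fio run (device names and fio parameters are illustrative, not the exact ones I used):

    # Stripe of 12 two-way mirrors over the 24 data SSDs
    zpool create tank \
      mirror da0 da1   mirror da2 da3   mirror da4 da5   mirror da6 da7 \
      mirror da8 da9   mirror da10 da11 mirror da12 da13 mirror da14 da15 \
      mirror da16 da17 mirror da18 da19 mirror da20 da21 mirror da22 da23

    # Stripe of four 6-wide RAIDZ2 vdevs over the same 24 SSDs
    zpool create tank \
      raidz2 da0 da1 da2 da3 da4 da5 \
      raidz2 da6 da7 da8 da9 da10 da11 \
      raidz2 da12 da13 da14 da15 da16 da17 \
      raidz2 da18 da19 da20 da21 da22 da23

    # Random 16k writes, roughly approximating VM traffic on an NFS datastore
    fio --name=vmtest --directory=/mnt/tank/fiotest --rw=randwrite --bs=16k \
        --ioengine=posixaio --iodepth=16 --numjobs=8 --size=8G \
        --runtime=300 --time_based --group_reporting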

In the end, I have a very compact system that hosts 100 VMs and could probably host twice as many.
 

blanchet

Guru
Joined
Apr 17, 2018
Messages
516
RAIDZ2 hosting a hundred VMs?

I hope you've checked out

https://www.truenas.com/community/threads/the-path-to-success-for-block-storage.81165/

and made sure you understand the downsides of RAIDZ for block storage. It's not that it could never work, but it's not expected to work well.

Yes, I have carefully read the guide, but I think that, in my specific case, it is still worth running with RAIDZ2 vdevs, because the risk is low.
The server is just a disaster recovery node that can tolerate downtime. If performance becomes poor, I will switch back to a stripe of mirrors to get more IOPS.
 