TrueNAS Core on ESXi

JazzIT (Cadet, joined Sep 9, 2021, 2 messages)
Hi,

Newbie question, and if this has been answered already, please just tell me where to go :)

I am testing TrueNAS as a potential small office NAS, primarily for file sharing. We run several virtual machines on VMware ESXi. All the ESXi machines have multiple HDs with a minimum of RAID1 at the hardware level. Datastores have been created in ESXi on the hardware-RAIDed drives, so when allocating disk space to VMs we only see datastores.

I have installed TrueNAS on a 50 GB virtual disk from ESXi datastore 1 and have allocated the TrueNAS VM a second 1 TB virtual disk from a different ESXi datastore.

When creating a new pool in TrueNAS, the 1 TB drive is visible as da1, but as this is the only available drive I get the warning:

"A stripe data vdev is highly discouraged as will result in data loss if it fails"

TrueNAS is rightly insistent that this is not a good idea.

"Warning"

" The current pool layout is not recommended. ....."

I understand that in a bare-metal install this is not a good idea, but as I am running under ESXi with hardware-RAIDed drives already, is this going to be a problem? If a drive fails, we replace it at the ESXi level.

I can allocate the TrueNAS VM a second (or third) 1 TB virtual disk, but I am not sure that RAIDing at the TrueNAS level on top of datastores that are already RAIDed at the ESXi level is such a good idea, as what do we replace if something fails?

Or am I missing something?

Thanks in advance.

P.S. I have looked at the passthrough option, but we have multiple VMs on ESXi, and a datastore may contain multiple VMs and/or a VM may be spread over multiple datastores.
 

jgreco (Resident Grinch, joined May 29, 2011, 18,680 messages)
This was all explained long ago. Do head on over to

https://www.truenas.com/community/t...ative-for-those-seeking-virtualization.26095/

Your main issue with not providing redundancy for ZFS is that you are giving ZFS no way to recover if there is a problem. Your RAID array does not guarantee to correctly store and retrieve data, so it is entirely possible for a block to silently corrupt on disk; the RAID5-protected RAID controller sends it up to ESXi, which sends it up to the VM, the ZFS on the VM spots the checksum error, goes "well hot damn now whaddaido", and returns you a nice zero-filled block. That is merely annoying if it is file data contents, but it can be disastrous if it is metadata. ZFS has no way to "fsck" or "chkdsk" a pool; it depends on its own error recovery capabilities. In a bad case, this can result in loss of the pool.
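If you want to catch this class of problem early, a scrub makes ZFS re-read every block and verify it against its checksums, and any failures show up in the CKSUM column of the status output. A minimal sketch, with "tank" standing in for your pool name:

  # force a full read-and-verify pass over the pool
  zpool scrub tank
  # check progress, the CKSUM counters, and the "errors:" summary
  zpool status -v tank

Of course, on a single-disk pool all a scrub can do is tell you that you have a problem; with a mirror it can actually repair it.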

We've had a low-level but relatively constant stream of people who think that the storage controller HBA selection and the admonition to avoid hardware RAID are some sort of suggestion, that their super duper RAID6 awesome setup "should" work fine the way they think, and that they should be able to do a straight single-disk setup. Invariably some corruption or controller issue comes along and bites them in the butt, and the only good answer we have is to move all the content off the pool, rip out the RAID controller, put in the recommended HBA, and build a proper ZFS pool.

This led me to write up this article about using HBAs instead of RAID controllers.

Now here's the caveat, and I want you to clearly understand what I am saying. From FreeNAS's point of view, ESXi virtual disks are on an mpt-based HBA. The admonition about not using a RAID controller is largely a driver compatibility thing, and from that point of view, we expect that ESXi virtual disks are approximately as reliable as a true physical disk. It does not matter if the underlying ESXi datastore is provided by NVMe, RAID5, SAN, NFS, iSCSI, or any of that. You may freely use ESXi mpt virtual disks with FreeNAS.
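You can verify this from the FreeNAS shell if you like; the ESXi virtual disks are enumerated by the mpt driver like any other disk. A sketch, assuming the default virtual SCSI controller:

  # list devices as the CAM layer sees them; ESXi's virtual disks
  # typically identify themselves as "VMware Virtual disk"
  camcontrol devlist

The exact identity string may vary with the virtual controller you picked, but it will be obvious which devices are the ESXi-provided disks.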

However, they are not guaranteed to be reliable. Your NAS is informing you that there is no redundancy. You really do need a mirror of two ESXi virtual disks if you want to avoid the risk of corruption/pool damage. And, yes, this means that you may be RAID1'ing together two different virtual disks that are "already" protected via RAID5. That's a consequence of having a poorly designed storage architecture where you are layering error detection ON TOP of RAID5, rather than having ZFS's RAIDZ, which integrates error detection along with the RAID drive management.
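As a concrete sketch: if you have already created the single-disk pool, you do not need to destroy it; attaching a second virtual disk to the existing vdev converts it to a mirror and resilvers automatically. This assumes your pool is named "tank", the existing disk is da1, and the new virtual disk shows up as da2:

  # turn the existing da1 stripe vdev into a two-way mirror
  zpool attach tank da1 da2

The GUI should be able to do the equivalent from the pool status screen.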
 

JazzIT (Cadet, joined Sep 9, 2021, 2 messages)
Wow, thanks for the prompt reply and the link. Very, very useful.

Just a point on our current setup: we don't have RAID5; we run RAID1 (mirror) with a hot spare.

So, if I understand, and given we do not have bare-metal machines, only ESXi space to work with, we *should* be mirroring two ESXi virtual disks which are already on a RAID1 (mirror) of physical disks. So in effect we need 4 TB of physical disk to give 1 TB of pool storage in the NAS. ...
 

jgreco (Resident Grinch, joined May 29, 2011, 18,680 messages)
Wow, thanks for the prompt reply and the link. Very, very useful.

Just a point on our current setup: we don't have RAID5; we run RAID1 (mirror) with a hot spare.

RAID1 is fine as well. The only "dangerous" ESXi datastore in my experience is RAID0 or a standard SATA/SAS HDD; you can get some bad behaviours out of ESXi when a datastore vanishes (think: the disk fails).

So, if I understand, and given we do not have bare-metal machines, only ESXi space to work with, we *should* be mirroring two ESXi virtual disks which are already on a RAID1 (mirror) of physical disks. So in effect we need 4 TB of physical disk to give 1 TB of pool storage in the NAS. ...

That's correct. That's why this isn't practical at scale, which is what my virtualization-oriented post was discussing. While you can do more heavy-handed things like turning off checksumming, at that point ZFS isn't doing much of anything for you anymore, and there are other NASwares that run fine as VMs with less onerous CPU and memory requirements.
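To make the space arithmetic explicit:

  1 TB usable in the ZFS pool
  x 2  (ZFS mirror: two virtual disks)
  x 2  (hardware RAID1 under each datastore)
  = 4 TB of raw disk per usable TB, i.e. 25% space efficiency
  (and that's before counting your hot spares)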
 