File storage performance in VMDK vs iSCSI

MikeyG

Patron
Joined
Dec 8, 2017
Messages
442
I have an application that will be storing up to 4TB of files. The files should be less than 10MB each. The software I'm using will be running on a VM and can't use SMB for file storage (it needs a local drive). Most access will be random.

Is there a difference, as it pertains specifically to ZFS, between storing files in a VMDK file vs an iSCSI volume attached to the client? How does it affect performance or storage efficiency?

I remember previously being told that ZFS can't see iSCSI files the same way it can see files on a normal dataset, and that therefore the effectiveness of ARC is greatly reduced. Is that true? I assume having a VMDK storing files (which itself would be accessed via iSCSI) and storing directly on iSCSI would be the same.

Data will not be super important, so sync writes will be off.
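Since sync writes will be off, a minimal sketch of how that's typically set per dataset or zvol with the `zfs` CLI (the pool and zvol names here are hypothetical; substitute your own):

```shell
# Hypothetical names -- adjust "tank/iscsi/appdata" to your layout.
# Disable sync writes on the zvol backing the iSCSI extent:
zfs set sync=disabled tank/iscsi/appdata

# Verify the setting took effect:
zfs get sync tank/iscsi/appdata
```

Note that `sync=disabled` means acknowledged writes can be lost on a crash or power failure, which matches the "data not super important" assumption above.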

Pool will be RAIDZ2 on 8TB drives. VMs themselves running Ubuntu on SSDs. Further specs should be in my signature.

Thanks for your thoughts!
 

MikeyG

Patron
Joined
Dec 8, 2017
Messages
442
Thanks, @jgreco! I think that first one was the post I remember reading. I guess what I was asking was specifically the difference between iSCSI on the client and a drive on a VMDK -> iSCSI as it pertains to ZFS. It seems like maybe there is no appreciable difference, since the protocol that FreeNAS sees is the same? I realize that compared to SMB/NAS it's very different.

In regards to the mirrors - I'm not hosting the actual OS data on a RAIDZ2 pool. That is on an ssd mirror pool. If the RAIDZ2 isn't fast enough for this particular file storage project, I will probably abandon it rather than change that pool to mirrors.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
The VMDK and iSCSI will share a lot more performance similarity than ${PICK-ONE-OF-THOSE} vs file access directly on the FreeNAS, glad that's clear.

There are many factors that can impact the specifics of performance between your VMDK-on-iSCSI and client-direct-iSCSI models, so it's not likely to be exactly the same, quite possibly not even close. VMDK itself is an interesting performance puzzle, because there are so many variables, including big ones such as "thin provision" and (at least for earlier VMFS versions) block sizes, plus all the iSCSI tuning things you can do. ESXi will tend to be tuned for VM performance that favors a bunch of VMs simultaneously accessing a datastore, rather than optimizing for a single VM that has exclusive access to a datastore. Hope that makes sense.
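To make the "thin provision" and block-size variables concrete on the ZFS side: when you back an iSCSI extent with a zvol, sparseness and `volblocksize` are set at creation time. A sketch, with hypothetical names and an illustrative (not recommended) block size:

```shell
# Hypothetical names -- "tank/iscsi/appdata" is an example path.
# Create a sparse ("thin") 4TB zvol with an explicit volblocksize
# for use as an iSCSI extent. volblocksize is fixed at creation
# time, so choose it with the workload (and RAIDZ2 space-efficiency
# implications) in mind.
zfs create -s -V 4T -o volblocksize=64K tank/iscsi/appdata

# Inspect what was created:
zfs get volsize,volblocksize,refreservation tank/iscsi/appdata
```

Dropping `-s` gives a thick-provisioned zvol (a `refreservation` is set for the full volume size), which trades space for predictability when the pool fills up.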
 

MikeyG

Patron
Joined
Dec 8, 2017
Messages
442
Yes, that makes sense. Thank you!
 