Trying to clarify the best config for using a ZFS box for VMs


Tancients · Dabbler · Joined May 25, 2015 · Messages: 23
Upcoming config (waiting on the last of the parts in the mail this week):
  • Some Intel gigabit NIC add-in card. I plan on upgrading to 10GbE (with MPIO) at some point in the near future, but I think I'll have other bottlenecks to tackle first.
  • 32GB RAM (not certain whether it's registered or unbuffered until it arrives; RAM upgrades will be my first purchase)
  • 12 x 4TB WD Red drives

I expect to monkey around with things for a couple of weeks before I start actively using it, but some clarification and additional details up front would be keen.

  • I was originally going to do RAIDZ2 with 6 drives and then double it, and I plan on adding another 6 within about two months. Is this comparable to a triple-mirror configuration in terms of performance once the third vdev has been added? Or would I be better off just starting with triple mirrors, dropping down to 8TB usable initially, and adding triple mirrors as I go? (See the rough sketch after this list.)
  • It looks like a SLOG is pretty much necessary when hosting virtual disks on ZFS, so I'll be grabbing two S3700s and popping them together in a mirror for that. Should I also consider an L2ARC sooner rather than later?
  • I couldn't find clarification as to whether ZFS RAM requirements are based on used space or on total space available. If total space, RAM definitely needs to be top priority, but otherwise I'll only be using approximately 2TB of easily disposable data until I'm happy with performance and output.
  • I see a lot of references to iSCSI requiring 50% of the total zpool to be left free for optimal performance, which makes me want to consider NFS. Are there other ways around the limitation?
  • Is there any appreciable performance benefit worth considering in setting up NFS shares on the ZFS filesystem for the VMs to mount, instead of simply making a second virtual disk?
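
For concreteness, here's a rough sketch of the two layouts I'm weighing, with placeholder device names (da0-da5) and a placeholder pool name (tank). Not a finished plan, just making sure I have the shape of it right:

  # Option A: one 6-drive RAIDZ2 vdev (add a second identical vdev later)
  zpool create tank raidz2 da0 da1 da2 da3 da4 da5

  # Option B: three-way mirrors, two vdevs per 6 drives (add more mirror vdevs as I go)
  zpool create tank mirror da0 da1 da2 mirror da3 da4 da5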

And then just a couple of quick questions I could easily find out or test myself, asked more out of ignorance:

  • Are zvols expandable? Any good reading or links so I can learn the best applications of zvols? (My guess at the relevant commands is sketched after this list.)
  • How granular can snapshots get? Can I snapshot some vdevs but not others, or is it better to do it on the zvol?
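
My guess at the relevant commands from skimming the man pages; names are placeholders, so please correct me if I'm off base:

  zfs create -V 500G tank/vm-disk0         # create a 500GB zvol
  zfs set volsize=750G tank/vm-disk0       # grow it later, if zvols are expandable this way
  zfs snapshot tank/vm-disk0@before-patch  # snapshots look to be per dataset/zvol, not per vdev
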
Thanks a bunch in advance for my noobish questions. I'm attempting to piggyback on the noob guide to fill in what I need for my particular config/use case.
 

Tancients · Dabbler · Joined May 25, 2015 · Messages: 23
So triple mirrors all the way, then? Would any expansion be another triple mirror added to the pool?
 

zambanini · Patron · Joined Sep 11, 2013 · Messages: 479
Tell us more about the expected workload; otherwise I can only guess.
 

Tancients · Dabbler · Joined May 25, 2015 · Messages: 23
Homelab use. Maybe around 20 VMs, up and down as things get built and destroyed. This includes a Windows server or two, but otherwise it's exclusively Linux distros. Probably ESXi as the hypervisor. There's a good chance of doing Plex media streaming, but that would also be running on a different box. The storage box would purely be storage, so I wouldn't be running any jails or the like alongside ZFS.

Overall a lot of random I/O (due to the OSes), but in terms of server load itself the VMs would only be serving a maximum of 5 devices at any one time, with only 2 regular users.
 

kspare · Guru · Joined Feb 19, 2015 · Messages: 508
64GB of RAM at a minimum. Don't bother with the S3700 SSDs; get the new Intel 750. It's much cheaper and faster.

Don't bother with RAIDZ; if you want it to work properly, use mirrored vdevs.

I have lots of UPS power and a generator, so I haven't bothered with a SLOG, but the Intel 750 is making our data fly as an L2ARC.

Be careful with your CPUs; I've run into some bottlenecks there with mine.

Don't use dedup; use the recommended compression.

Use Intel NICs (which I think you are).

What HBA are you using?

What hypervisor are you using on your servers?
 

Tancients · Dabbler · Joined May 25, 2015 · Messages: 23
64GB of RAM at a minimum. Don't bother with the S3700 SSDs; get the new Intel 750. It's much cheaper and faster.

I do intend on bumping up the RAM first, and 64GB is exactly where it would land once I identify whether I need to gut the existing RAM or can just add to it. I can go all the way up to 128GB, but I was curious whether ZFS's memory consumption is based on used space or on total available space on the drives. That'd help determine how high I should shoot.
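
(I figure once it's up I can at least watch how much ARC is actually in use instead of guessing; on FreeBSD/FreeNAS I believe that's exposed via sysctl, e.g.:

  sysctl kstat.zfs.misc.arcstats.size    # current ARC size in bytes
  sysctl kstat.zfs.misc.arcstats.hits    # cumulative ARC hits
  sysctl kstat.zfs.misc.arcstats.misses  # cumulative ARC misses

though that only reflects data that's been read, not total pool capacity.)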

I was looking at the S3700 for its "Enhanced Power Loss Protection". Do the 750s have an equivalent lifespan? In my case I do have redundant PSUs, but I don't have a strong enough UPS available yet, and definitely no generator, as the house is a rental.

Don't bother with RAIDZ; if you want it to work properly, use mirrored vdevs.
So triple mirrors are the way to go, no contest? As I expand, should I be looking at keeping it all in one zpool, or would it be wiser to split?

Be careful with your CPUs; I've run into some bottlenecks there with mine.
I've got dual E5620s, so I assume that covers the CPU side. They happened to come with the Supermicro case and I don't really need to pillage one for anywhere else.

Don't use dedup; use the recommended compression.
Is compression better because of performance, or because of my use case?

Use Intel NICs (which I think you are).

What HBA are you using?
If there's a best recommended Intel NIC for FreeNAS/BSD, I'm game to pick two up (one for each end). I haven't looked into which NIC model is on the motherboard, but the ones currently available are just single gigabit. The HBA is an M1015 in IT mode; I believe that's currently the most recommended from what I saw.

What hypervisor are you using on your servers?
Most likely ESXi, so I'll need to make sure I set sync=always, plus another recommendation or two I recall reading here in these forums. Does that affect the iSCSI vs. NFS consideration? I have plans for a KVM hypervisor coming in at the end of the year, but that will be using local storage only. I'm trying to avoid Hyper-V because it's a lab environment and I already know Hyper-V quite well, so I want to make sure I get good experience with others.
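
If I understand it right, that's just a property on whatever dataset or zvol backs the datastore, something along the lines of (placeholder names):

  zfs set sync=always tank/esxi-datastore
  zfs get sync tank/esxi-datastore    # confirm it took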

Thanks again for your input.
 

KTrain · Dabbler · Joined Dec 29, 2013 · Messages: 36

There's a fair amount of research to be done on the iSCSI vs. NFS bit, so I'd suggest looking at some of the other posts on the forum to learn about it.
 

Tancients · Dabbler · Joined May 25, 2015 · Messages: 23
There's a fair amount of research to be done on the iSCSI vs. NFS bit, so I'd suggest looking at some of the other posts on the forum to learn about it.
That's actually where a bunch of my questions come from. Many older posts reference more headaches getting iSCSI to work than NFS, but I've also been seeing more recent suggestions to go with iSCSI in the VM scenario, so I was hoping to get clarification on whether I should still be sticking with NFS across the board.

The most useful recent thread I've found is here: https://forums.freenas.org/index.php?threads/esxi-nfs-datastore-what-are-best-practices.26947/

A good argument for iSCSI (even if it means adding a SLOG) is quoted below:
Also, NFS is a *terrible* choice for a datastore. If performance is your concern NFS shouldn't even be considered. NFS alone *significantly* increases your hardware requirements.

There's also Bug 1531, lovingly full of info thanks to jgreco, which shows that NFS and iSCSI both have performance issues and that simply throwing maximum RAM at the box isn't necessarily the right way to do it. So part of my concern regarding RAM usage (in order to gauge scaling) is covered well here: http://www.zfsbuild.com/2012/03/05/when-is-enough-memory-too-much-part-2/ and I'm aware that the filesystem can affect the outcome. Plus, iSCSI eliminates the benefit of using ZFS for granular snapshots and restores, unless I use NFS for file shares, perhaps?

I also see comments like "It is interesting to note that if the need for a datastore with faster reads is greater then NFS is the way to go but if faster writes are needed then iSCSI is the way to go", which implies there's still some disagreement about the structure and format, and even posts here go back and forth.

Perhaps it's something I'll just need to test and tinker with, but I'd rather not have to migrate data off and recreate all the links if I can avoid it.
 

cyberjock · Inactive Account · Joined Mar 25, 2012 · Messages: 19,525
I'm sorry, but Bug 1531 is out of date and would have to be retested. There are many aspects of ZFS that have changed since that ticket was created, so unless someone is going to go back and re-perform those tests, I'm not sure how much value is added by reading that bug report. Three years is a long time (about 1/3 of the total time ZFS has existed).

The statement you quoted in bold is basically not possible. Things are not that simple with ZFS, and aren't going to be that simple. There are too many things going on with ZFS to do a true breakdown to that kind of simplicity while having the statement actually be accurate for even a majority of situations.
 

Tancients · Dabbler · Joined May 25, 2015 · Messages: 23
I had a feeling that was the case, so I was trying to narrow down the details and see what was still relevant and what may not be. Is the 50% free space still a requirement for iSCSI over ZFS? Is there any performance benefit to having file shares (pictures, movies, etc.) mapped via NFS rather than just making them part of the ESXi datastore, or would the split share types be more hassle than they're worth, in which case I should just leave everything on iSCSI?

From what I'm finding, a dedicated SLOG may not be useful at all if everything runs across iSCSI, as the ZIL only gets used for writes below a certain size (potentially outdated info), and an L2ARC may not show much real-world performance gain unless you start dealing with large transactional databases?
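
(I suppose once it's built I could just watch the log and cache devices directly to see whether they're actually being hit, rather than arguing from possibly outdated posts; placeholder pool name:

  zpool iostat -v tank 5    # per-vdev activity every 5 seconds, including any log/cache devices
)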

Maybe it would be simpler to just say "Here is my hardware, here is the intended use case, do I need to make any changes to get decent performance out of it?"
 

KTrain · Dabbler · Joined Dec 29, 2013 · Messages: 36

I've never heard anything about a 50% free space requirement, but I haven't been extremely active in the last 12 months, so perhaps that's something I missed.

For static storage I'd consider using CIFS instead of NFS/iSCSI. By using the latter you're just adding more overhead and pushing more traffic across the storage fabric (though this would depend on how you're leveraging the different network interfaces). I'm not an expert, but I feel like the WD Red drives aren't as conducive to a VMware storage platform due to their platter speed and bus (SATA vs. SAS), so efficiency will be important.

I'm in the process of tuning a VMware storage deployment myself, so I can't offer tried-and-true advice for a bulletproof configuration. That said, I would give yourself plenty of time to performance-test the storage and configuration. Based on what I've seen and read, it's common for folks to buy the "right hardware" and still jack up the configuration to the point where it runs poorly. Trial and error!
 

Tancients · Dabbler · Joined May 25, 2015 · Messages: 23
I didn't see a mention of the 50% free requirement in any recent posts, so it could be fixed now as well. I'm not familiar with CIFS, but if it helps I'm game to learn it. This is mostly going to be for home lab use, so I need "good enough for 3-4 users and a bunch of VMs" and not "I've got 50 people hitting my server for authentication, DNS, and two transactional databases; why are my servers so slow on this single gigabit connection between my FreeNAS box and the compute server?!"

I do plan on doing a fair bit of trial and error, but I figure it helps to be on the right track and make sure I'm not missing any hardware. :)
 

HoneyBadger · actually does care · Administrator · Moderator · iXsystems · Joined Feb 6, 2014 · Messages: 5,112
Maybe it would be simpler to just say "Here is my hardware, here is the intended use case, do I need to make any changes to get decent performance out of it?"

Probably, let's have it. ;)

50% free space discussion

This is mostly a "for your own safety" thing to avoid performance degradation due to ZFS fragmentation. The more free space you have, the more likely you'll have a contiguous block for ZFS to do its copy-on-write into.

As far as CIFS/NFS/iSCSI goes, you can create a single large pool (mirror vdevs are OK; triple isn't necessary) and cut multiple zvols for your iSCSI LUNs to use with VMware. Make a dataset and share it as CIFS or NFS, but not both at once.
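
Roughly like this, with placeholder names; adjust sizes to taste:

  zfs create -V 2T tank/vmware-lun0    # zvol to export as an iSCSI LUN via the iSCSI sharing setup in the GUI
  zfs create tank/media                # dataset to share out as CIFS or NFS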
 

Tancients · Dabbler · Joined May 25, 2015 · Messages: 23
Here is the hardware I have currently:

Case: Supermicro 4U CSE-847
Motherboard: X8DTN+
Processors: 2x E5620
Memory: 32GB DDR3 PC3-10600R 1333MHz (ECC, of course)
HBA: A single M1015 (not yet flashed to IT mode)
Hard drives: 12x 4TB WD Red

Use case: serving as a SAN for VMs all running on a separate server. The hardware above would be direct-attached using both gigabit NICs on the motherboard in an MPIO configuration. Only the IPMI management interface would be accessible on the LAN.

I intend on upgrading the RAM, though I don't know if I should only take it up to 72GB, or replace the 4GB sticks with 8GB/16GB ones and shoot for the moon. I also don't have any storage purchased yet for a SLOG, but I can certainly pick one up. It might be something to play with still.

My goal is to saturate the Ethernet, which should give a theoretical 200MB/s (or 100MB/s each way), and have some room to grow when I throw a 10GbE card (maybe the Intel X540-T1?) into both ends.
 

HoneyBadger · actually does care · Administrator · Moderator · iXsystems · Joined Feb 6, 2014 · Messages: 5,112
Good hardware. Make sure you flash that M1015 though.

In my opinion, 72GB of RAM is fine if the cost of 8GB/16GB sticks is prohibitive (which it probably is). At 72GB you've got enough overhead for an L2ARC device as well, to help catch whatever slips through the primary ARC.

The current hotness for SLOG devices is the Intel P3700 or Intel 750 PCIe NVMe SSDs. I haven't used one myself, but @kspare may have some comments to add here. You'll need one if you want sync=always for maximum safety.
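
For what it's worth, either device can be attached to an existing pool later without rebuilding anything; pool and device names below are placeholders (NVMe drives typically show up as nvd0 on FreeBSD):

  zpool add tank log nvd0      # dedicated SLOG device
  zpool add tank cache ada6    # L2ARC device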

1Gbps is full-duplex, so in theory you could saturate both directions for 200MB/s reads and 200MB/s writes simultaneously; however, you'll likely only ever see that if you're writing sequentially and reading entirely from ARC. Random writes will be lower-performing.
 

Tancients · Dabbler · Joined May 25, 2015 · Messages: 23

Yeah, the HBA flashing will be done tonight or tomorrow, then I'll mount and attach all the drives and begin the grand experiment.

For the RAM, I can certainly throw more money at it if needed, but I assume I'd hit limits with the network before I need to bump the RAM up to 128GB+. I'll do some playing with it first, since the RAM won't be purchased until the middle of next month. It's easy enough to upgrade without having to uproot the ZFS configuration (which is the main reason for all the questions).

I was originally looking at the P3700, but the 750 seems to be in a better price range, so it'll get picked up in the next two months.

As for the 1Gbps, is that 200MB/s read/write for a single port, or for both using MPIO? I was seeing numbers citing a limitation of 200MB/s using dual-gigabit multipath, but I haven't had the hardware set up for a nice test run yet.
 

jgreco · Resident Grinch · Joined May 29, 2011 · Messages: 18,680
Maintaining free space on the pool is a method of reducing fragmentation and increasing performance, especially when allocating blocks for writes. Typically, performance is severely impacted if you fill a pool beyond the 60% point, but the pain point may come before that. I have posted dozens or hundreds of times about this topic.
 

jgreco · Resident Grinch · Joined May 29, 2011 · Messages: 18,680
I didn't see a mention of the 50% free requirement in any recent posts, so it could be fixed now as well.

By the way, you say this as though you think it's a bug. It isn't. It's an inherent issue with CoW filesystems in general: new blocks are allocated from the free pool, and what might originally have been contiguous regions of disk become severely fragmented over time as writes occur. Fragmentation is reduced (never eliminated) by large amounts of free space, and ARC helps reduce the impact as well.
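
You can keep an eye on both numbers yourself as the pool fills (pool name is whatever you called yours, assuming your pool is new enough to report fragmentation):

  zpool list -o name,size,allocated,free,fragmentation,capacity tank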
 

Tancients · Dabbler · Joined May 25, 2015 · Messages: 23
I meant less that it's a bug and more whether there's a functional solution for it. I've got a few ZoL configs at work, but they haven't been used for active storage, so I've never really been concerned about practical performance. Though if I'm going with RAID 10-style striped mirrors (instead of three-way mirrors) I'll have enough space for a while, so I shouldn't have to worry about it, and it'd be easy to expand the zpool up to 24 drives. Out of curiosity, is there a recommended way of counteracting or recovering from it? Or is the only way basically to migrate the data off, rebuild the array, and then move it back?
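
I assume that migration would boil down to a recursive send/receive to a second pool and back, something like the following (placeholder pool names, and obviously needing enough temporary space elsewhere):

  zfs snapshot -r tank@migrate
  zfs send -R tank@migrate | zfs receive -F backup/tank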
 