Hyper-V VM Storage Suggestions

Steven Sedory · Aug 14, 2015

Yall, your input is invaluable. Thank you so much for taking the time to respond to my posts.

For the record, I did recreate the pool, so dedup blocks wont be an issue.

I think we're going to go with the Intel 750 from here. Maybe I can talk to money spenders into the P3700. We'll see.

As for setting the "vfs.zfs.vdev.trim_on_init=0" tunable, it sounds like I should apply it right before I shut down the SAN to install the SLOG. Is this correct? Or at least before I add it to the pool. Thanks for the heads up on that btw.

kspare · Aug 14, 2015

Yes, otherwise it will take forever to boot up.

Are you using another one for L2Arc as well? I can give you my settings for that if you want.

Steven Sedory · Aug 15, 2015

We are not, but it's a fantastic idea. I know that rule is often "just add more RAM" for cache. That being said, is our current setup a candidate for L2Arc? I remember reading about a tool somewhere that can tell you.... Any pointers?

sfcredfox · Aug 17, 2015

Steven Sedory said:
is our current setup a candidate for L2Arc?

I think so, any disagreements from others?

This is a good post to get after:
https://forums.freenas.org/index.php?threads/arc-stats-questions-problems-thread.30339/

HoneyBadger · Aug 17, 2015

sfcredfox said:
I think so, any disagreements from others?

Absolutely a candidate, he's got the 128GB of RAM necessary to back it up. The rule of thumb is "no more than 4-5x your RAM" for L2ARC, as indexing L2ARC costs you main memory - a second Intel 750 400GB would fit into that rule nicely as well as perform very quickly. @kspare is the voice of experience there.

kspare · Aug 17, 2015

You are definitely good for L2arc, i'm running a 400gb l2 arc with 64 gb of ram and it works great! I am planning for another 64gb of ram as it will only make things better.

I attached my tunables.

When I build my next san, I would run 256gb of ram, a 400gb Intel 750nve for Slog and a 1.2TB Intel 750 for my L2Arc.

I've been checking my stats daily and my l2arc hits keep going up daily. about 1% increase per day lately. so it will be interesting to see if I hit over 50% later this week.

I should mention right now I only have 64gb of ram but it work really well!

Steven Sedory · Aug 17, 2015

Awesome. Thanks for the info guys. We're going to get two Intel 750's, one for Slog and one for L2ARC. And thank you for the tunables.

Steven Sedory · Aug 20, 2015

So we ordered three Intel 750's, all 400GB.

Other than the "do not trim" tunable, any other gotchas or perhaps procedures I should follow when setting up the mirrored Slog and the standalone L2ARC?

diehard · Aug 24, 2015

Possibly manually format/partition the SLOG to the size of the ZIL you will need, i believe the GUI still formats the entire drive by default.

wreedps · Sep 3, 2015

Subscribing

Steven Sedory · Sep 3, 2015

The Intel 750's came in. I should be setting them up in the next week or so.

depasseg · Sep 4, 2015

wreedps said:
Subscribing

You can just click the "Watch thread" button at the upper right corner of the page.

Steven Sedory · Sep 26, 2015

So, hoping to hear back from some of you experienced users before I move forward...

I have the three cards. Two are going to be mirrored for the Slog and one is going to be used for L2ARC.

Currently, the box is in production, running a bunch of VMs, and running fine.

My thought is to shut everything down, power of the SAN, install the cards, boot back up, set the trim tunable, restart, and then simply "extend" my current volume by adding the two mirrored 750's as the Slog and then the one 750 as the L2ARC. Then maybe restart for good measure, and turn of the VMs on again.

Is this a good plan? Am I missing anything? Your advice is much appreciated.

jgreco · Sep 27, 2015

diehard said:
Possibly manually format/partition the SLOG to the size of the ZIL you will need, i believe the GUI still formats the entire drive by default.

Yeah, I'm convinced that underprovisioning the SLOG devices is the way to go, simply because you're *guaranteeing* that the controller has a much larger bucket of free pages to work with. I suggested this years ago

https://bugs.freenas.org/issues/2365

but no one's interested in proving or disproving the theory.

Steven Sedory said:
I have the three cards. Two are going to be mirrored for the Slog and one is going to be used for L2ARC.

Any reason for the mirror? You could just keep one installed as spare. It'd hurt performance temporarily if the single SLOG failed but it gives you the option to install it as L2ARC if that were to fail. There's nothing WRONG with mirroring if that's what you're set on doing, but be aware it is slightly slower.

My thought is to shut everything down, power of the SAN, install the cards, boot back up, set the trim tunable, restart, and then simply "extend" my current volume by adding the two mirrored 750's as the Slog and then the one 750 as the L2ARC. Then maybe restart for good measure, and turn of the VMs on again.

Is this a good plan? Am I missing anything? Your advice is much appreciated.

That seems fine to me. Just be damn sure you add the SLOG and the L2ARC devices properly. Too many people end up adding them to the pool as vdevs. Read the manual and read your screen very carefully.

HoneyBadger · Sep 28, 2015

jgreco said:
I suggested this years ago

https://bugs.freenas.org/issues/2365

but no one's interested in proving or disproving the theory.

Guess I'm "no one" because I'm interested as hell in proving/disproving this specific use case. It's already very well understood outside of ZFS that "more spare area equals improved performance and consistency."

For SAS/SATA devices you can undersize the drive by setting the HPA (Host Protected Area) but for NVMe devices I don't believe that's an option. Maybe it's an option on the P-series Intels that isn't on the consumer ones. Edit: "There are similar commands for NVMe drives too, but for now there is no publicly available utility for issuing those commands." Well, crap.

I'll see if I can get a spare box crunching some numbers on this.

jgreco · Sep 28, 2015

I don't think it's actually necessary to set the HPA. For a new SSD, the pages are unallocated and unmapped. If you have a 120GB SSD and create a SLOG partition on the first ~10GB, for example, pages to store that are of course allocated upon use, but it leaves > 110GB of pages unallocated and unmapped. Allocating new pages to update blocks in the SLOG should be instantaneous, with the old pages thrown on the to-be-blanked pile.

I can't think of a downside to that. The current FreeNAS paradigm allocates the whole 120GB, leaving only the drive's normal reserve, which seems to me to be a very small pool compared to my strategy, and of that space, at least 100GB is a totally frickin' useless waste of space.

Now, it could actually turn out that the difference is actually so minimal as not to matter. I also haven't looked into how the TRIM support that has been added since I filed that feature request might play into this. If ZFS is TRIM'ing the transaction groups in the SLOG once they've been committed to the main pool, this probably doesn't matter to a SSD that support TRIM. Still useful on other SSD's, though, I think.

The thing that bothers me here is that this is an optimization that EITHER has no effect, OR is a win, and it is fairly simple to implement.

cyberjock · Sep 29, 2015

jgreco said:
Yeah, I'm convinced that underprovisioning the SLOG devices is the way to go, simply because you're *guaranteeing* that the controller has a much larger bucket of free pages to work with. I suggested this years ago

https://bugs.freenas.org/issues/2365

but no one's interested in proving or disproving the theory.

I don't think anyone's not interested. I think it's more a situation of "why care?" with today's SSDs.

Case and point- I do tech support for lots of TrueNAS servers. They are never under-provisioned. Yes, all disks fail. The SSDs that iXsystem's uses even fail. But, I've only seen maybe a 1/2 dozen or so fail in the 17 months I've worked at iXsystems. None failed because they were out of writeable space. Quite a few have been in heavy workload environments for more than 4 years. Some were slogs, some were L2ARCs.

Now, if I take the theory on why underprovisioning is better as fact, and I accept the fact that SSD lifespans seem to be so excessively long that you aren't going to care about underprovisioning anyway, then what are we gaining by adding the complexity? If we underprovision everyone to 8GB, and someone goes with 40Gb LAN (and TrueNAS users have some of those) then we could, in theory, be bottle-necking them.

I think this is a situation where the theory is sound (and for the record I do agree with the theory) but in practice there's no discernible improvement in longevity of the drive or performance, so there appears to be nothing to gain for the added complexity.

Just my 2 cents though. ;)

HoneyBadger · Sep 29, 2015

I'm actually running some tests on that front right now, but this strikes me as something that should go in another thread. Ahem.

Edit; We talkin' 'bout 'dis over here now:

https://forums.freenas.org/index.php?threads/slog-underprovisioning.38374/

jgreco · Sep 29, 2015

cyberjock said:
Now, if I take the theory on why underprovisioning is better as fact, and I accept the fact that SSD lifespans seem to be so excessively long that you aren't going to care about underprovisioning anyway, then what are we gaining by adding the complexity? If we underprovision everyone to 8GB, and someone goes with 40Gb LAN (and TrueNAS users have some of those) then we could, in theory, be bottle-necking them.

I think this is a situation where the theory is sound (and for the record I do agree with the theory) but in practice there's no discernible improvement in longevity of the drive or performance, so there appears to be nothing to gain for the added complexity.

Just my 2 cents though. ;)

I think it'd be a situation where there's probably an option to do the underprovisioning as part of the device addition. Since a SLOG device can be detached from a pool and reattached at a different size without harm, the case you put forth (40Gb LAN and a fixed 8GB size) is more or less a nonissue. A more realistic sizing strategy might be to by default underprovision the drive down to 50% or 25% of its actual capacity, kind of an "autotune" for SLOG, where you could probably even look to see what the system's total network interface capacity was, to make sure it wasn't stupidly-small.

This is admittedly less of an issue in today's TRIM-enabled era. My guess is that it would still be a win during stressy periods, because the SLOG device would have a larger pool of free pages that it could burn through very rapidly. Since a dual ten gig can theoretically max out the write capacity of a P3700, and a lot of the other SLOG options are substantially slower than that, it seems like it'd be a win.

cyberjock · Sep 29, 2015

If its any consolation, the proper way to handle this situation is to use the manufacturer's tools to underprovision the size of the drive. For example, Samsung has a tool that lets you resize their SSD lines to smaller sizes as specified by you. I know a few iXsystems customer have underprovisioned 400GB SSDs to 20GB to use as an slog. The SSD is detected as a 20GB drive, and you "fully" partition the drive (fully in quotes since we know it's got a buttload of additional space available) like you normally would from the WebGUI.

I feel like that is the better way to go. Besides, AFAIK the only company that has openly admitted to underprovisioning drives by having smaller partitions is Intel. So unless other manufacturer's make the same claim, there's no way to know what they do or don't do. In fact, Intel hasn't even openly admitted to this feature in more than 2 years, so they may not even support it anymore. But, even if its not supported anymore and by anyone, ideally there's nothing lost but nothing gained.

Important Announcement for the TrueNAS Community.

Hyper-V VM Storage Suggestions

Explorer

Guru

Explorer

Patron

actually does care

Guru

Attachments

Explorer

Explorer

Contributor

Patron

Explorer

FreeNAS Replicant

Explorer

Resident Grinch

actually does care

Resident Grinch

Inactive Account

actually does care

Resident Grinch

Inactive Account

Important Announcement for the TrueNAS Community.

Related topics on forums.truenas.com for thread: "Hyper-V VM Storage Suggestions"

Similar threads