Hyper-V VM Storage Suggestions


Steven Sedory

Explorer
Joined
Apr 7, 2014
Messages
96
Y'all, your input is invaluable. Thank you so much for taking the time to respond to my posts.

For the record, I did recreate the pool, so dedup blocks won't be an issue.

I think we're going to go with the Intel 750 from here. Maybe I can talk the money spenders into the P3700. We'll see.

As for setting the "vfs.zfs.vdev.trim_on_init=0" tunable, it sounds like I should apply it right before I shut down the SAN to install the SLOG, or at least before I add it to the pool. Is this correct? Thanks for the heads-up on that, btw.
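
For reference, this is roughly what I plan to run from a shell before adding the devices (the sysctl takes effect immediately on the running system; I gather a System -> Tunables entry is what makes it stick across reboots):

    # check the current value (default is 1, i.e. TRIM the whole device when it's added)
    sysctl vfs.zfs.vdev.trim_on_init
    # disable the initial whole-device TRIM before adding the new SSDs
    sysctl vfs.zfs.vdev.trim_on_init=0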
 

kspare

Guru
Joined
Feb 19, 2015
Messages
507
Yes, otherwise it will take forever to boot up.

Are you using another one for L2Arc as well? I can give you my settings for that if you want.
 

Steven Sedory

Explorer
Joined
Apr 7, 2014
Messages
96
We are not, but it's a fantastic idea. I know the rule is often "just add more RAM" for cache. That being said, is our current setup a candidate for L2ARC? I remember reading about a tool somewhere that can tell you... Any pointers?
 

sfcredfox

Patron
Joined
Aug 26, 2014
Messages
340
I think so, any disagreements from others?

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
I think so, any disagreements from others?

Absolutely a candidate; he's got the 128GB of RAM necessary to back it up. The rule of thumb is "no more than 4-5x your RAM" for L2ARC, as indexing L2ARC costs you main memory, so 128GB of RAM puts the ceiling somewhere around 512-640GB of L2ARC. A second 400GB Intel 750 would fit into that rule nicely as well as perform very quickly. @kspare is the voice of experience there.
 

kspare

Guru
Joined
Feb 19, 2015
Messages
507
You are definitely good for L2ARC; I'm running a 400GB L2ARC with 64GB of RAM and it works great! I am planning on another 64GB of RAM, as it will only make things better.

I attached my tunables.

When I build my next SAN, I would run 256GB of RAM, a 400GB Intel 750 NVMe for SLOG, and a 1.2TB Intel 750 for my L2ARC.

I've been checking my stats daily and my L2ARC hits keep going up, about a 1% increase per day lately, so it will be interesting to see if I go over 50% later this week.

I should mention that right now I only have 64GB of RAM, but it works really well!
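
If you want to watch the same numbers yourself, this is more or less what I look at (the kstat sysctls are the raw counters; the hit rate is just hits over hits plus misses):

    # L2ARC hit/miss counters
    sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses
    # if your build ships arc_summary.py, it prints ARC and L2ARC efficiency as a report
    arc_summary.py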
 

Attachments

  • Capture.JPG (77.3 KB)
  • Capture2.JPG (56.2 KB)

Steven Sedory

Explorer
Joined
Apr 7, 2014
Messages
96
Awesome. Thanks for the info, guys. We're going to get two Intel 750s, one for SLOG and one for L2ARC. And thank you for the tunables.
 

Steven Sedory

Explorer
Joined
Apr 7, 2014
Messages
96
So we ordered three Intel 750s, all 400GB.

Other than the "do not trim" tunable, any other gotchas or perhaps procedures I should follow when setting up the mirrored Slog and the standalone L2ARC?
 

diehard

Contributor
Joined
Mar 21, 2013
Messages
162
Possibly manually format/partition the SLOG to the size of the ZIL you will need; I believe the GUI still formats the entire drive by default.
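
Something along these lines from a shell, as a rough sketch ("tank", "nvd0", and the 16GB size are just placeholders for your pool, NVMe device, and whatever ZIL size you settle on):

    # label the drive and create a small SLOG partition, leaving the rest of the flash unallocated
    gpart create -s gpt nvd0
    gpart add -t freebsd-zfs -a 4k -s 16G -l slog0 nvd0
    # attach the partition to the pool as a log device (not as a regular data vdev)
    zpool add tank log gpt/slog0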
 

wreedps

Patron
Joined
Jul 22, 2015
Messages
225
Subscribing
 


Steven Sedory

Explorer
Joined
Apr 7, 2014
Messages
96
So, hoping to hear back from some of you experienced users before I move forward...

I have the three cards. Two are going to be mirrored for the Slog and one is going to be used for L2ARC.

Currently, the box is in production, running a bunch of VMs, and running fine.

My thought is to shut everything down, power off the SAN, install the cards, boot back up, set the trim tunable, restart, and then simply "extend" my current volume by adding the two mirrored 750s as the SLOG and the one 750 as the L2ARC. Then maybe restart for good measure, and turn the VMs back on.

Is this a good plan? Am I missing anything? Your advice is much appreciated.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
Possibly manually format/partition the SLOG to the size of the ZIL you will need; I believe the GUI still formats the entire drive by default.

Yeah, I'm convinced that underprovisioning the SLOG devices is the way to go, simply because you're *guaranteeing* that the controller has a much larger bucket of free pages to work with. I suggested this years ago

https://bugs.freenas.org/issues/2365

but no one's interested in proving or disproving the theory.

I have the three cards. Two are going to be mirrored for the Slog and one is going to be used for L2ARC.

Any reason for the mirror? You could just keep one installed as spare. It'd hurt performance temporarily if the single SLOG failed but it gives you the option to install it as L2ARC if that were to fail. There's nothing WRONG with mirroring if that's what you're set on doing, but be aware it is slightly slower.

My thought is to shut everything down, power off the SAN, install the cards, boot back up, set the trim tunable, restart, and then simply "extend" my current volume by adding the two mirrored 750s as the SLOG and the one 750 as the L2ARC. Then maybe restart for good measure, and turn the VMs back on.

Is this a good plan? Am I missing anything? Your advice is much appreciated.

That seems fine to me. Just be damn sure you add the SLOG and the L2ARC devices properly. Too many people end up adding them to the pool as vdevs. Read the manual and read your screen very carefully.
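
From the command line the difference looks like this (a sketch; "tank" and the nvd names are placeholders, and the GUI's Log/Cache choices in the volume manager are the equivalent):

    # correct: the 'log' and 'cache' keywords make these SLOG and L2ARC devices
    zpool add tank log mirror nvd0p1 nvd1p1
    zpool add tank cache nvd2p1
    # wrong: without a keyword the SSD becomes a striped data vdev you can't remove
    # zpool add tank nvd0p1
    # verify where everything landed before powering the VMs back on
    zpool status tank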
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,110
I suggested this years ago

https://bugs.freenas.org/issues/2365

but no one's interested in proving or disproving the theory.

Guess I'm "no one" because I'm interested as hell in proving/disproving this specific use case. It's already very well understood outside of ZFS that "more spare area equals improved performance and consistency."

For SAS/SATA devices you can undersize the drive by setting the HPA (Host Protected Area) but for NVMe devices I don't believe that's an option. Maybe it's an option on the P-series Intels that isn't on the consumer ones. Edit: "There are similar commands for NVMe drives too, but for now there is no publicly available utility for issuing those commands." Well, crap.

I'll see if I can get a spare box crunching some numbers on this.
 
Last edited:

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
I don't think it's actually necessary to set the HPA. For a new SSD, the pages are unallocated and unmapped. If you have a 120GB SSD and create a SLOG partition on the first ~10GB, for example, pages to store that are of course allocated upon use, but it leaves > 110GB of pages unallocated and unmapped. Allocating new pages to update blocks in the SLOG should be instantaneous, with the old pages thrown on the to-be-blanked pile.

I can't think of a downside to that. The current FreeNAS paradigm allocates the whole 120GB, leaving only the drive's normal reserve, which seems to me to be a very small pool compared to my strategy, and of that space, at least 100GB is a totally frickin' useless waste of space.

Now, it could turn out that the difference is so minimal as not to matter. I also haven't looked into how the TRIM support that has been added since I filed that feature request might play into this. If ZFS is TRIM'ing the transaction groups in the SLOG once they've been committed to the main pool, this probably doesn't matter to an SSD that supports TRIM. Still useful on other SSDs, though, I think.
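
One quick sanity check, assuming the kstats are present on the build in question: watch whether the TRIM counters move while the pool is taking sync writes.

    # is TRIM enabled at all?
    sysctl vfs.zfs.trim.enabled
    # cumulative TRIM counters; if 'success' keeps climbing, ZFS is issuing TRIMs to the SLOG
    sysctl kstat.zfs.misc.zio_trim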

The thing that bothers me here is that this is an optimization that EITHER has no effect, OR is a win, and it is fairly simple to implement.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yeah, I'm convinced that underprovisioning the SLOG devices is the way to go, simply because you're *guaranteeing* that the controller has a much larger bucket of free pages to work with. I suggested this years ago

https://bugs.freenas.org/issues/2365

but no one's interested in proving or disproving the theory.

I don't think anyone's not interested. I think it's more a situation of "why care?" with today's SSDs.

Case in point: I do tech support for lots of TrueNAS servers. They are never under-provisioned. Yes, all disks fail. The SSDs that iXsystems uses even fail. But I've only seen maybe a half dozen or so fail in the 17 months I've worked at iXsystems. None failed because they were out of writeable space. Quite a few have been in heavy workload environments for more than 4 years. Some were SLOGs, some were L2ARCs.

Now, if I take the theory on why underprovisioning is better as fact, and I accept that SSD lifespans seem to be so excessively long that you aren't going to care about underprovisioning anyway, then what are we gaining by adding the complexity? If we underprovision everyone to 8GB and someone goes with 40Gb LAN (and TrueNAS users have some of those), then we could, in theory, be bottlenecking them.
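
Rough numbers to illustrate the bottleneck case (back-of-the-envelope, assuming the usual ~5 second transaction group interval): 40Gb/s is about 5GB/s of incoming sync writes, so a single txg could in principle be on the order of 25GB, and a fixed 8GB SLOG would become the ceiling long before the wire is.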

I think this is a situation where the theory is sound (and for the record I do agree with the theory) but in practice there's no discernible improvement in longevity of the drive or performance, so there appears to be nothing to gain for the added complexity.

Just my 2 cents though. ;)
 
Last edited:


jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,681
Now, if I take the theory on why underprovisioning is better as fact, and I accept that SSD lifespans seem to be so excessively long that you aren't going to care about underprovisioning anyway, then what are we gaining by adding the complexity? If we underprovision everyone to 8GB and someone goes with 40Gb LAN (and TrueNAS users have some of those), then we could, in theory, be bottlenecking them.

I think this is a situation where the theory is sound (and for the record I do agree with the theory) but in practice there's no discernible improvement in longevity of the drive or performance, so there appears to be nothing to gain for the added complexity.

Just my 2 cents though. ;)

I think it'd be a situation where there's probably an option to do the underprovisioning as part of the device addition. Since a SLOG device can be detached from a pool and reattached at a different size without harm, the case you put forth (40Gb LAN and a fixed 8GB size) is more or less a nonissue. A more realistic sizing strategy might be to underprovision the drive by default to 50% or 25% of its actual capacity, kind of an "autotune" for SLOG, where you could probably even look at the system's total network interface capacity to make sure it wasn't stupidly small.

This is admittedly less of an issue in today's TRIM-enabled era. My guess is that it would still be a win during stressy periods, because the SLOG device would have a larger pool of free pages that it could burn through very rapidly. Since a dual ten gig can theoretically max out the write capacity of a P3700, and a lot of the other SLOG options are substantially slower than that, it seems like it'd be a win.
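
The detach/reattach itself is just a remove and re-add (sketch only; "tank" and "gpt/slog0" stand in for your pool and log partition, and sync writes simply fall back to the in-pool ZIL while the log is out):

    # pull the log device out of the pool (works for log and cache vdevs)
    zpool remove tank gpt/slog0
    # repartition it to the new size with gpart, then put it back
    zpool add tank log gpt/slog0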
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
If it's any consolation, the proper way to handle this situation is to use the manufacturer's tools to underprovision the size of the drive. For example, Samsung has a tool that lets you resize their SSD lines to smaller sizes as specified by you. I know a few iXsystems customers have underprovisioned 400GB SSDs to 20GB to use as a SLOG. The SSD is detected as a 20GB drive, and you "fully" partition the drive (fully in quotes, since we know it's got a buttload of additional space available) like you normally would from the WebGUI.

I feel like that is the better way to go. Besides, AFAIK the only company that has openly admitted to underprovisioning drives by having smaller partitions is Intel. So unless other manufacturers make the same claim, there's no way to know what they do or don't do. In fact, Intel hasn't openly admitted to this feature in more than 2 years, so they may not even support it anymore. But even if it's not supported anymore and by anyone, ideally there's nothing lost, just nothing gained.
 