RAM requirement @ 60+ TB?

Status
Not open for further replies.

Stanri010

Explorer
Joined
Apr 15, 2014
Messages
81
I'm thinking about upgrading my FreeNAS system in the near future. It'll most likely be a 10-drive NAS with 6 TB drives. Following the 1GB of RAM per 1TB of drive space rule, that would mean I would need 64GB of RAM. That's all fine and dandy as LGA1151 Skylake should support up to 64GB of RAM.

However, how "hard" of a rule is it when you've already got oodles of RAM? For example, if I've got 80TB of storage space, is 64GB of RAM enough, or would I really want to move up to 96GB? Considering 64GB is the maximum for the platform, this might be a problem.
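A minimal sketch of that sizing arithmetic, in Python; the round-up to a realistic DIMM total is an assumption for illustration, not part of the rule:

# Rough sizing for the 1GB RAM per 1TB raw storage rule of thumb.
# The round-up to a practical memory size is an assumption, not part of the rule.
def recommended_ram_gb(drive_count: int, drive_size_tb: int) -> int:
    raw_tb = drive_count * drive_size_tb      # raw pool capacity in TB
    guideline_gb = raw_tb                     # 1GB of RAM per 1TB of storage
    practical = 8                             # round up to 8, 16, 32, 64, 128 ...
    while practical < guideline_gb:
        practical *= 2
    return practical

print(recommended_ram_gb(10, 6))   # 10 x 6TB -> 60GB guideline -> 64GB installed

That lands on the 64GB figure above; the rest of the thread is really about how hard that guideline is.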
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Have you looked at my system setup lately? (link in my sig)
 

Stanri010

Explorer
Joined
Apr 15, 2014
Messages
81
So based on what I'm reading, if I'm trying to push 10 GbE, I can probably max out an LGA1151 system at 64GB of RAM and go with an L2ARC SSD, as opposed to spending the extra money on an E5 X79/X99 platform to push to 128GB of RAM, and still get very good 10 GbE speeds.

I see what you've done, Cyberjock, but at the same time you've never truly pushed for performance past 1 GbE.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
So based on what I'm reading, if I'm trying to push 10 GbE, I can probably max out an LGA1151 system at 64GB of RAM and go with an L2ARC SSD, as opposed to spending the extra money on an E5 X79/X99 platform to push to 128GB of RAM, and still get very good 10 GbE speeds.

I see what you've done, Cyberjock, but at the same time you've never truly pushed for performance past 1 GbE.

Ok, in my own defense, I should put my NIC on that system specs list. I've got 10Gb to my server (Chelsio). I definitely push it past 1GbE, regularly.
 

Stanri010

Explorer
Joined
Apr 15, 2014
Messages
81
Ok, in my own defense, I should put my NIC on that system specs list. I've got 10Gb to my server (Chelsio). I definitely push it past 1GbE, regularly.


Interesting. What kind of sequential reads and writes can you get?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
However, how "hard" of a rule is it when you've already got oodles of ram? For example, if I've got 80TB of storage space is 64GB of ram enough or would I really want to move up to 96GB? Considering 64GB of ram for the platform is the max, this might be a problem.

It's actually a very soft rule as the amount of RAM goes up. At 32GB you can support a 64TB or even somewhat larger pool, but performance may not be as good as on a system that has more RAM. There's still an underlying need for you to have enough RAM to maintain a stable system; an 8GB system with 128TB of pool is not going to work out in the end, and the system might even panic and could cause damage to the pool. The real question is where the line between those two lies, and that isn't quantifiable with a hard rule either. Sorry.

But while this sounds like it might be an endorsement of "80TB is fine on 64GB", do note that performance out in this area is much more complex and somewhat subjective; if your workload and the data you store are stressy, you'll want more RAM. If you're just storing archival data, ISOs, or media files for low-volume access, 32GB might be fine for 80TB.
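One practical way to judge whether more RAM would actually help a given workload is to watch the ARC statistics on the running box. A minimal sketch in Python, assuming the stock FreeBSD kstat.zfs.misc.arcstats.* sysctls are present on your FreeNAS build:

# Read ARC size and hit ratio via sysctl on a FreeBSD-based system.
# Assumes the kstat.zfs.misc.arcstats.* OIDs are available; treat the names
# as an assumption for your particular build.
import subprocess

def sysctl(oid: str) -> int:
    return int(subprocess.check_output(["sysctl", "-n", oid]).decode().strip())

arc_size = sysctl("kstat.zfs.misc.arcstats.size")
hits = sysctl("kstat.zfs.misc.arcstats.hits")
misses = sysctl("kstat.zfs.misc.arcstats.misses")

print(f"ARC size: {arc_size / 2**30:.1f} GiB")
print(f"ARC hit ratio: {100.0 * hits / (hits + misses):.1f}%")

A persistently low hit ratio under the real workload is a stronger argument for more RAM (or an L2ARC) than pool size alone.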
 

Tywin

Contributor
Joined
Sep 19, 2014
Messages
163
There's still an underlying need for you to have enough RAM to maintain a stable system; an 8GB system with 128TB of pool is not going to work out in the end, and the system might even panic and could cause damage to the pool.

Modulo jails, with the majority of system memory being used by ARC, FreeNAS should have a pretty good idea of how much memory it can reasonably allocate to various buffers. It's not a desktop system with a frenetic user picking random applications to load intermittently. The preferred behaviour would be that FreeNAS refuse to mount a pool if it doesn't have enough memory to support it.

Regardless, if the ZFS implementation in FreeNAS were actually solid, OOM events shouldn't ever be able to damage the pool. A core tenet of ZFS (and of most modern file systems, really) is that the on-disk structure remain viable at all times. With a solid implementation, the worst case should be loss of in-flight data.
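To make that proposal concrete, a purely hypothetical sketch of such a guard in Python; neither the check nor the threshold exists in FreeNAS, and the 1GB-per-TB floor is just the forum rule of thumb reused as an example:

# Hypothetical guard illustrating the "refuse to import a pool we can't support" idea.
# Neither the check nor the threshold exists in FreeNAS; the numbers are placeholders.
class NotEnoughMemory(Exception):
    pass

def check_pool_import(pool_size_tb: float, system_ram_gb: float,
                      min_gb_per_tb: float = 1.0, floor_gb: float = 8.0) -> None:
    required = max(floor_gb, pool_size_tb * min_gb_per_tb)
    if system_ram_gb < required:
        raise NotEnoughMemory(
            f"a {pool_size_tb} TB pool wants roughly {required:.0f}GB of RAM, "
            f"but only {system_ram_gb:.0f}GB is present")

check_pool_import(pool_size_tb=128, system_ram_gb=8)   # raises NotEnoughMemory

The hard part, as the replies below point out, is that nobody has a defensible formula for that "required" number.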
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Actually, many filers very closely resemble having frenetic clients picking random files to load intermittently.

No one's been able to quantify exactly what "enough memory to support" a pool actually is, in any case, because the reality is so dependent on so many variables. With some of the older ZFS versions, we definitely saw panics if ZFS got pushed too hard on a too-small system. I'm not sure we're actually seeing that sort of thing anymore, but a terribly performing filer is nearly useless, so while it may not be "dangerous" it is still a stupid thing to do. Just like you can run a modern web browser on a 440BX system with a gig of memory, being *able* to do something successfully is different than it being *useful* to do so.

You are cordially invited to be the tester of this sort of thing and identify those boundaries, otherwise, feel free to stop complaining about the failure to do work you're basically implying you'd like other people to do.

And that's a big thing to remember. This is an open source project. Opinions are like butts, everyone has one. Your random opinion of how things should work is not a valuable contribution to the project. When I want to make a contribution, I try to ponder a problem I've identified from the viewpoint of a new(ish) user, and then fill in what I can based on several decades of expertise.

As a relevant example, when we were noticing that 4GB and 6GB systems seemed to be trashing pools due to causes that none of the victims could identify, I took some time and studied what was being seen, and recognized that no reasonable 8GB configurations were suffering in this manner. I eventually modified the FreeNAS docs and handbook to reflect that. It wasn't happening to most users, and I wasn't able to replicate it, but rather than merely sit and whine about what *ought* to be, I took a UNIX old-timer's pragmatic solution to the problem.

My opinion is that we never positively identified what was causing corruption on small memory systems. That's several years behind us, and it may well have been fixed by some upstream code changes, feature additions, etc. I don't recall hearing of anyone getting a corrupted pool merely due to too-small memory in recent years, but that /may/ be because Cyberjock and the rest of the gang pretty much come down real hard on the "that's not expected to work" line. So the conservative thing to do is to assume that it could still be a problem, and I will not advise people to do things that I think could be damaging to their pool or their data. I don't have the time or inclination to study the problem in detail and effectively try to prove a negative.

Again, I cordially invite you to do the work if it bothers you. Otherwise, pointless wishfulness on your part.
 

Tywin

Contributor
Joined
Sep 19, 2014
Messages
163
Actually, many filers very closely resemble having frenetic clients picking random files to load intermittently.

Sure, but that's why there's an ARC eating most of the memory. If your ARC is big enough to support your workload, performance is good; if it's not, performance is poor (yes, as you mention below, potentially so poor as to be not practically useful).

No one's been able to quantify exactly what "enough memory to support" a pool actually is, in any case, because the reality is so dependent on so many variables. With some of the older ZFS versions, we definitely saw panics if ZFS got pushed too hard on a too-small system. I'm not sure we're actually seeing that sort of thing anymore, but a terribly performing filer is nearly useless, so while it may not be "dangerous" it is still a stupid thing to do. Just like you can run a modern web browser on a 440BX system with a gig of memory, being *able* to do something successfully is different than it being *useful* to do so.

Yeah, it was mostly the comment that it might eat your pool to which I was referring. Of course you can't extract more performance from a system than it has to give you. There may be areas for further optimization, but that's neither here nor there.

You are cordially invited to be the tester of this sort of thing and identify those boundaries, otherwise, feel free to stop complaining about the failure to do work you're basically implying you'd like other people to do.

And that's a big thing to remember. This is an open source project. Opinions are like butts, everyone has one. Your random opinion of how things should work is not a valuable contribution to the project. When I want to make a contribution, I try to ponder a problem I've identified from the viewpoint of a new(ish) user, and then fill in what I can based on several decades of expertise.

Please don't invent intentions for me, and please don't confuse the issues that I discuss. The only complaints I have are when esteemed members of this board spread fear, uncertainty, and doubt about things that either no longer exist, or that really should be looked at in detail (or at least have open bugs documenting what exactly it is that we do know).

Everything else* is merely the observations of a systems engineer regarding FreeNAS's rough corners. Yes, these observations are my opinions, but they aren't random opinions. They are opinions derived from a model of the FreeNAS system as a whole, broad-strokes understanding of the underlying subsystems, and critical thinking on the usability of the system and the interfaces between subsystems. Of all the rough edges, any mention of potential pool corruption stands out, as this is the one thing FreeNAS should never do: corrupt your pool. I'm sorry that you seem to take these observations as a personal affront; they certainly aren't intended to be.

* Quick aside: I write tersely not because I am angry or frustrated; I just generally don't care for unnecessary verbiage. It lowers the signal-to-noise ratio. There are far more words in this reply than I believe necessary, but you seem to be -- or at least I have parsed your post to be -- pissed off, which frankly I don't think there's any need to be. I am trying to appease you and get across the idea that, generally speaking, I don't give a shit about the drama; I post almost exclusively from a technical standpoint.

As a relevant example, when we were noticing that 4GB and 6GB systems seemed to be trashing pools due to causes that none of the victims could identify, I took some time and studied what was being seen, and recognized that no reasonable 8GB configurations were suffering in this manner. I eventually modified the FreeNAS docs and handbook to reflect that. It wasn't happening to most users, and I wasn't able to replicate it, but rather than merely sit and whine about what *ought* to be, I took a UNIX old-timer's pragmatic solution to the problem.

My opinion is that we never positively identified what was causing corruption on small memory systems. That's several years behind us, and it may well have been fixed by some upstream code changes, feature additions, etc. I don't recall hearing of anyone getting a corrupted pool merely due to too-small memory in recent years, but that /may/ be because Cyberjock and the rest of the gang pretty much come down real hard on the "that's not expected to work" line. So the conservative thing to do is to assume that it could still be a problem, and I will not advise people to do things that I think could be damaging to their pool or their data. I don't have the time or inclination to study the problem in detail and effectively try to prove a negative.

Please believe me, I understand how frustrating a problem can be when you can't replicate it. That is the worst kind of open-ended engineering problem, and the most annoying type of debugging. That said, as you mention yourself, no configurations running 8 GiB of RAM exhibited the pool corruption issue. Why then suggest that a configuration running 8 GiB of RAM with a 128 TiB pool could corrupt your data? In these reports from several years ago, was the corruption ever correlated with pool size, or was it just on systems with less than 8 GiB of RAM regardless of pool size?

Again, I cordially invite you to do the work if it bothers you. Otherwise, pointless wishfulness on your part.

My only "wish" is that FreeNAS be the best product it can be. It would sadden me if someone such as yourself thought that wish was pointless.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Interesting. What kind of sequential reads and writes can you get?

Depends on many factors...

Windows 7 to/from the server... 350 MB/sec or so over CIFS.
Linux Mint 17 to/from the server with CIFS... 450-550 MB/sec.
Linux Mint 17 to/from the server over NFS... 500-600 MB/sec.
 

Stanri010

Explorer
Joined
Apr 15, 2014
Messages
81
Depends on many factors...

Windows 7 to/from the server... 350 MB/sec or so over CIFS.
Linux Mint 17 to/from the server with CIFS... 450-550 MB/sec.
Linux Mint 17 to/from the server over NFS... 500-600 MB/sec.

That would suggest that you're single-thread bottlenecked?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
That would suggest that you're single-thread bottlenecked?

No, that suggests that there are a lot of variables in play. When you start going for "what is the fastest", there are variables that are outside your control. If I do a ramdrive-to-ramdrive copy with the FreeNAS server on one side, I can do over 900MB/sec on CIFS. I'm guessing that the way Samba tries to read the data off the pool is different from NFS, and that small latency difference makes a big difference.

The real question is "where is the bottleneck?" and the real answer is "I don't care. It's plenty fast enough to make me happy." I never intended to saturate 10Gb LAN. In fact, I didn't add the 10Gb LAN until almost a year after I built this server. I only have an E3-1230v2, and I have only 32GB of RAM. I also have a single vdev of 10x6TB drives. It was never built with the intention of trying to saturate 10Gb. It was for storing my data safely while giving me some respectable performance on Gb LAN. The 10Gb speeds were ultimately just a bonus.

When you are going for maximum performance you really can't simplify things down. You have to look at all aspects and figure out where that one little detail is the bottleneck.
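For context on the numbers in this exchange, a quick back-of-the-envelope comparison in Python against the 10 GbE line rate; the ~5% protocol/framing overhead and the mid-range CIFS/NFS figures are rough assumptions:

# Rough ceiling for a single 10 GbE link versus the throughput figures quoted above.
line_rate_gbps = 10                    # 10 gigabits per second
overhead = 0.05                        # assumed framing/TCP/protocol overhead
ceiling_mb_s = line_rate_gbps * 1000 / 8 * (1 - overhead)   # roughly 1190 MB/s

figures = [("Windows 7 / CIFS", 350),            # numbers from the posts above,
           ("Linux Mint 17 / CIFS", 500),        # using midpoints of the ranges
           ("Linux Mint 17 / NFS", 550),
           ("ramdrive-to-ramdrive / CIFS", 900)]
for label, mb_s in figures:
    print(f"{label}: {mb_s} MB/sec, about {100 * mb_s / ceiling_mb_s:.0f}% of the link")

Even the ramdrive-to-ramdrive figure sits well below the wire, which is consistent with the bottleneck being somewhere in the Samba/storage path rather than the NIC.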
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Everything else* is merely the observations of a systems engineer regarding FreeNAS's rough corners. Yes, these observations are my opinions, but they aren't random opinions. They are opinions derived from a model of the FreeNAS system as a whole, broad-strokes understanding of the underlying subsystems, and critical thinking on the usability of the system and the interfaces between subsystems. Of all the rough edges, any mention of potential pool corruption stands out, as this is the one thing FreeNAS should never do: corrupt your pool. I'm sorry that you seem to take these observations as a personal affront; they certainly aren't intended to be.

I'm a professional UNIX systems engineer by trade, and I've been working with FreeBSD since before 386BSD 0.1.

Your opinion is noted but not deemed meaningful. There /is/ such a thing as understanding appropriateness to a task.

Let me explain in pictures (attached images, referenced by filename).

p1.png can pull p2.jpg.

p3.jpg can pull p4.jpg.

j5.jpg can pull j6.jpg.

Now, in theory you can hitch p4.jpg up to j5.jpg and try to give it a pull, and you might even be able to move it, but ultimately the stresses caused are going to ruin the car engine and you are going to suffer a catastrophic failure.

There is not a hard boundary, and you can slowly ramp up the trailer sizes and probably still be successful up to a point where the car struggles to climb a steep hill, or just a hill, etc. But the increases in trailer size correspond to an increased risk of catastrophic damage to the engine and vehicle.

That said, as you mention yourself, no configurations running 8 GiB of RAM exhibited the pool corruption issue.

I did not say that. I said that no REASONABLE 8GB configuration was exhibiting crashes, etc. We've always considered a reasonable configuration to be one that lives within the general realm of configuration guidelines.

Why then suggest that a configuration running 8 GiB of RAM with a 128 TiB pool could corrupt your data? In these reports from several years ago, was the corruption ever correlated with pool size, or was it just on systems with less than 8 GiB of RAM regardless of pool size?

It's because I recognize that the likely culprit that causes panics would be kernel resource starvation or exhaustion, and that ZFS makes some vague assumptions about sizing of this-and-that as a function of memory size, and requires certain quantities of resources in order to operate effectively.

Put more simply, I just automatically know that j5.jpg cannot pull p2.jpg in any meaningful way, and I am not really interested in continuing this thread. I promise virtually nothing at all if you try to make such a pairing between a tiny server and a huge pool. I don't promise it'll crash. I don't promise it'll corrupt. But, more importantly, I don't promise it'll work, and I know that I wouldn't trust my data to such a pairing.

I can tell you that I've put a 30TB pool on a 6GB system and pounded the crap out of it trying to make bad things happen, but aside from being about 1/3rd the speed of the same system with 32GB RAM, it seemed fine. I'm not really terribly interested in running configurations in production that might be problematic, so basically I'm capable of looking at the underlying issues and saying "I see why ZFS needs memory, the general guidelines are X, I am fairly confident based on experience and observation that I can bend this at least to Y, but that Z represents an unacceptable risk."

I have no problem if you want to take and do a deep analysis on the technical issues underlying this, and you're again encouraged to do so. Many years ago I had such energy and drive. These days I mostly just want our operations to work well and for stuff not to crap all over the place. I don't feel a compelling need to fully understand the exact knife blade point at which ZFS might fail as long as I can design a system that never strays anywhere near that cutoff point. I know how to be certain I'm nowhere near that point.

I'm sure the automotive people have complicated graphs of just how much weight a VW Bug can pull up various inclines, and how failure rates increase as those numbers are manipulated. That's great. I don't need that sort of insight. I just buy a heavier car if I want to pull a bigger load, because I just *know* a VW Bug can only safely pull a light load without significant risk.

Again, I have nothing against it (and indeed welcome it) if you want to dive deeply into things to find a more precise answer to the questions you have. But it is work you'll need to do yourself.
 