Upgrading High End Hardware with P5800X

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Would writing files via NFS put them into ARC implicitly?

obviously.jpg
 

Chris Tobey

Contributor
Joined
Feb 11, 2014
Messages
114
So as an immediate action, I have removed my P4800X SLOG from poolA (SG1), disabled sync on poolA and poolB, and added the P4800X as an L2ARC on poolB - mostly because I could do all of this remotely and with no downtime.

The sync writes were kept as a safety net previously, but various other improvements may have removed the need for them. Power to the building has been upgraded and is more stable, and the server is on 2 x 2000VA UPS and is set to shut down when they reach critical levels. It has never frozen/locked up, nor been shut down unexpectedly. Additionally, the data itself is all small code files, 99.99% of which is from VCS (git/svn) or artifacts of CI (jenkins). Most things should be recoverable or reproducible from the past day with only minimal loss in a failure, as long as the rest of the pool survives.
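For readers who want to see the change spelled out, here is a minimal sketch of the commands it corresponds to. It only builds the command lines and does not run them. The device name `nvd0` is a placeholder (the thread does not give the real one), and the helper name `plan_cache_reshuffle` is made up for illustration; the `zpool remove`, `zfs set sync=disabled`, and `zpool add ... cache` subcommands themselves are standard ZFS.

```python
from typing import List


def plan_cache_reshuffle(
    log_pool: str, cache_pool: str, device: str, sync_pools: List[str]
) -> List[List[str]]:
    """Return the argv lists for the change, without executing anything.

    device is a placeholder name; substitute the real SLOG device.
    """
    # 1. Detach the device from its role as a log (SLOG) vdev.
    commands = [["zpool", "remove", log_pool, device]]
    # 2. Disable synchronous write semantics on each pool's root dataset.
    for pool in sync_pools:
        commands.append(["zfs", "set", "sync=disabled", pool])
    # 3. Re-add the same device as a cache (L2ARC) vdev on the other pool.
    commands.append(["zpool", "add", cache_pool, "cache", device])
    return commands


if __name__ == "__main__":
    # "nvd0" is a hypothetical device name used only for this example.
    for argv in plan_cache_reshuffle("poolA", "poolB", "nvd0", ["poolA", "poolB"]):
        print(" ".join(argv))
```

Keep in mind that `sync=disabled` means writes are acknowledged before they reach stable storage, which is exactly the trade-off weighed in the paragraph above.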
 

jgreco

Seems sensible. Keep an eye on the amount of I/O going to the L2ARC device, please. You've been doing a good job of providing helpful information so far, so I don't want you to think I'm criticizing that when I say that the lack of clarity about the overall pool behaviour has been a little bit of a frustration in this thread. It isn't your fault. It's just that getting in there and doing some tinkering of this sort is actually VERY helpful to further characterize the needs.

If you start seeing a lot of I/O towards the L2ARC (reads being the MOST interesting thing; writes should happen regardless), then you can make a reasonable guess about the benefits of increased ARC size.
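One way to put numbers on "a lot of I/O towards the L2ARC" is to look at the ARC kernel counters. The sketch below parses text in the Linux `/proc/spl/kstat/zfs/arcstats` layout (on FreeBSD-based systems the same counters show up as `kstat.zfs.misc.arcstats.*` sysctls) and computes hit ratios from the `hits`, `misses`, `l2_hits`, and `l2_misses` counters. The function names and the sample values are made up for illustration.

```python
from typing import Dict


def parse_arcstats(text: str) -> Dict[str, int]:
    """Parse 'name type data' rows; header lines are skipped automatically."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields and a numeric third field.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def cache_hit_ratios(stats: Dict[str, int]) -> Dict[str, float]:
    """Fraction of lookups served by ARC, and of ARC misses served by L2ARC."""
    arc_lookups = stats["hits"] + stats["misses"]
    l2_lookups = stats["l2_hits"] + stats["l2_misses"]
    return {
        "arc_hit_ratio": stats["hits"] / arc_lookups if arc_lookups else 0.0,
        "l2_hit_ratio": stats["l2_hits"] / l2_lookups if l2_lookups else 0.0,
    }


# Invented sample in the arcstats layout, for demonstration only.
SAMPLE = """\
13 1 0x01 123 33456 1234567890 9876543210
name                            type data
hits                            4    900
misses                          4    100
l2_hits                         4    60
l2_misses                       4    40
"""

if __name__ == "__main__":
    print(cache_hit_ratios(parse_arcstats(SAMPLE)))
```

The counters are cumulative since boot, so for a view of the current workload, take two snapshots some minutes apart and compute the ratios from the differences. A consistently high L2ARC hit ratio under load suggests that more RAM for ARC would be put to use.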
 

Chris Tobey

I assume the larger the L2ARC, the better, and the faster the L2ARC, the better?
So the 400 GB P4800X is good, but a 1.6 TB P5800X would be better.

It was also not immediately obvious to me that the L2ARC would be pool-specific, so having one for each pool would be a possible upgrade here.
 

Chris Tobey

I think that once I can schedule some downtime, doubling the DDR4 and therefore the ARC will help. Until then, running analysis over the next few days under heavy load will show how this new configuration performs.
If my ARC was only ~146 GiB before, the extra 349 GiB of L2ARC should bring a notable improvement if your analysis holds true.
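As a rough back-of-the-envelope check on those numbers: L2ARC is not free, because every block cached on the device needs a small header that lives in ARC (RAM). The sketch below adds up the cache sizes and estimates that header cost. The average block size (16 KiB) and per-block header size (96 bytes) are assumptions chosen for illustration, not measurements from this system; the real header size depends on the OpenZFS version.

```python
GIB = 1024 ** 3


def cache_footprint(arc_gib, l2arc_gib, avg_block_bytes, header_bytes):
    """Estimate combined cache size and the RAM used by L2ARC headers.

    avg_block_bytes and header_bytes are assumptions supplied by the caller.
    """
    # Number of blocks the L2ARC device can hold at the assumed block size.
    l2_blocks = (l2arc_gib * GIB) / avg_block_bytes
    return {
        "total_cache_gib": arc_gib + l2arc_gib,
        "growth_factor": (arc_gib + l2arc_gib) / arc_gib,
        # RAM taken out of ARC to index the L2ARC contents.
        "l2arc_header_gib": l2_blocks * header_bytes / GIB,
    }


if __name__ == "__main__":
    # 146 GiB ARC and 349 GiB L2ARC are the figures quoted in the post;
    # 16 KiB blocks and 96-byte headers are illustrative assumptions.
    print(cache_footprint(146, 349, 16 * 1024, 96))
```

With small code files the average block size may be well below 16 KiB, in which case the header overhead grows proportionally, so it is worth rerunning the estimate with a figure closer to the actual data.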
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
Any chance there's a thread you'd refer to as a 'gold standard' for the kinds of information that should be provided to ask clear questions?

It always takes me a long time to write out a question, and even then, I'm anxious as to whether I've provided the right information.
If there is a thread you'd suggest (I'm not saying you have this kind of time) knowing why each thing is useful would also be nice.
If there's a sticky on this or something -- I'd be happy to read it.

If nothing else, perhaps you could explain what you mean by providing greater clarity on the pool's behavior.
What kind of steps would you take were you at a terminal window for his (or any) pool you wanted to better characterize?

Thanks in advance
 

jgreco

thread you'd refer to as a 'gold standard' for the kinds of information that should be provided to ask clear questions..?

No. There's really a reason that there are professional storage engineers, and a large part of it is because customers so frequently don't know what information to provide, what things are relevant, or, most often, what their systems are actually really even doing. That's why storage vendors have engineers that evaluate your needs for you, and even then, experienced people sometimes get it wrong.

By comparison, here in the forums, we're doing this, for free, outside of any sort of customer relationship -- aside from those few posters with "iXsystems" badging on their account, we're all community participants -- and most of us are doing it because we enjoy it. I am mostly here to teach, as I'm on the far side of a long career with all sorts of UNIX, storage, and networking. I enjoy helping people make things work. But the fact of the matter is, this tends to be really difficult stuff; from my side it is like looking through the keyhole in a door to see if I can make out what the servers on the other side are doing. This is nobody's fault. It's the nature of the beast. It is why I often generalize and try to look at big picture issues. When things don't seem to be making reasonable forward progress, we just kinda trawl around, maybe hitting on the issue, sometimes not.

It always takes me a long time to write out a question

And it always takes me a long time to write out a response! ;-)

It isn't always clear what the limiting factor is, as @ChrisRJ noted some posts back. Most posters arriving here with issues are struggling from a lack of experience, and we're struggling because we see certain facets of the problem based on what they've figured out to tell us. Some of the remaining facets will not be particularly obvious to ANY of us, just because you're working on a fairly large system with a complex workload. There may be some critical bit, though, where someone here has an "aha!" moment, or, at least, a (possibly random) suggestion that results in us seeing a facet that we weren't seeing.

I've had this experience several times in recent months: I am talking with somebody about poor performance, I ask them to run iperf and check other network stuff, everything seems to come back clean, and nevertheless, in the end, the problem turns out to be an Ethernet switch or a bad cable. Follow your instincts. Ask questions. The regular posters here will redirect you to resources that seem appropriate. Everyone knows there are eye-melting levels of annoying devilish details.

I started writing my collection of stickies, most of which can now be found in "Resources", as a way to redirect posters to hopefully beginner-level introductions to some of these complex topics. Overall analysis and tuning isn't really going to be one of these, I think. From my perspective, there are really only two things ZFS is good at. One is database/block storage/high IOPS, for which you want mirrors, and most of the clues are in the block storage sticky. The other is large archival file storage, such as ISO files, which RAIDZ excels at. Everything else is somewhere in between those two extremes, but you don't really get anything other than those two basic choices for pool design, which drives everything else...
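To make the two extremes concrete, here is a small sketch comparing the same twelve disks laid out as mirrors versus RAIDZ2. It leans on a common rule of thumb, assumed here rather than taken from the thread, that a pool's random IOPS scale roughly with the number of vdevs, each vdev behaving about like one member disk. The disk size (10 TB) and per-disk IOPS (150) are invented example figures, and the capacity math ignores metadata and padding overhead.

```python
def pool_estimate(kind, vdevs, disks_per_vdev, disk_tb, disk_iops, parity=0):
    """Very rough usable capacity and random IOPS for a pool layout.

    Rule-of-thumb model only: one vdev is treated as one disk's worth of IOPS.
    """
    if kind == "mirror":
        # Each mirror vdev stores one disk's worth of data.
        usable_tb = vdevs * disk_tb
    elif kind == "raidz":
        # Each RAIDZ vdev loses `parity` disks' worth of space.
        usable_tb = vdevs * (disks_per_vdev - parity) * disk_tb
    else:
        raise ValueError("kind must be 'mirror' or 'raidz'")
    return {"usable_tb": usable_tb, "approx_random_iops": vdevs * disk_iops}


if __name__ == "__main__":
    # Twelve hypothetical 10 TB disks at an assumed 150 random IOPS each.
    print("6 x 2-way mirror:", pool_estimate("mirror", 6, 2, 10, 150))
    print("2 x 6-wide RAIDZ2:", pool_estimate("raidz", 2, 6, 10, 150, parity=2))
```

Under these assumptions the mirror layout trades capacity for roughly three times the random IOPS, which is the same split between block storage and archival storage described above.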
 