Current SSD recommendations for SLOG?

ZFS Noob

Contributor
Joined: Nov 27, 2013
Messages: 129
My experiment on surplus Dell equipment convinced me that FreeNAS can work well in my environment, but I've been unable to connect an SSD in a way that increases sync write performance. I'm leaning toward it being a hardware issue, and toward copying a known-good configuration for the next step. It costs more, but I'm willing to pay for a system that can do synchronous writes with reasonable performance, and leave my current test platform as an async iSCSI backup.

The problem is going to be deciding which SSD to use for the SLOG. Intel SLC SSDs don't seem to exist any more. ZeusRAM is available but it's priced like Zeus Himself built it. I don't really know where to look.

What are the current suggestions? And am I correct to assume that SLOG mirrors aren't really all that important nowadays (SSD failure = slowness, but that's about it)?

Thanks again, all.
 

cyberjock

Inactive Account
Joined: Mar 25, 2012
Messages: 19,525
Aww, I really thought you had gotten that figured out back in November (I think that's when you were trying to solve the problems...). You do realize there's MUCH more to ZFS than throwing hardware at it? I thought you'd have realized that by now.

SLOG mirrors are still important. If you lose the SLOG, where does the data that was sitting in the failed SLOG go when your system has an improper shutdown? Don't take this as an insult, but it sounds like you don't even understand what data goes to the SLOG and what doesn't. The bottom line: if your SLOG fails suddenly and you need it, you'll lose the data in the SLOG, right? So why would you think that mirrors aren't important?
 

ZFS Noob

Contributor
Joined: Nov 27, 2013
Messages: 129
You do realize there's MUCH more to ZFS than throwing hardware at it? I thought you'd have realized that by now.
I don't know why saying "my SSD implementation appears to be broken and I can't test further on my repurposed, non-ideal hardware so I'm going to bite the bullet and do it right. Which SSDs are considered 'doing it right' nowadays?" gets the "you can't simply throw hardware at the problem, dummy!" response.



Here's what I'm seeing: fio running a 70% read / 30% write workload (rough invocation sketched below) against:
  • Identical shares within FreeNAS, other than the underlying storage
  • One share consists of a single Intel 530 drive
  • The other consists of 4 mirrored ES.3 drives
[Attached chart: iops-2-5.jpg]
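
In case anyone wants to poke at the same thing, the job I'm running is roughly along these lines; the mount point, file size, and runtime are placeholders, and the ioengine is whatever async engine your test box supports (libaio on Linux, posixaio on FreeBSD):

  # approximate fio job: 70/30 random read/write, 4k blocks, queue depth 16
  fio --name=mixed7030 --directory=/mnt/testshare --rw=randrw --rwmixread=70 \
      --bs=4k --size=4g --ioengine=libaio --iodepth=16 \
      --runtime=120 --time_based --group_reporting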


I know what you're going to say: I'm doing it wrong, that test isn't telling me what I think it is, there are better tests to run, and this testing thing is nontrivial, and I should pay someone like you if I want to figure it out. We've gone over that before.

Still, at this point I'm thinking the cheap mount-your-SSD-in-a-PCI-slot card I used as a kludge to get an SLOG in my R710 is probably impacting performance here. If it's not the card, then it's likely the interaction between it and FreeNAS. Since I can't simply insert the SSD into the hot swap bay for diagnostic purposes because my SAS card won't recognize it, either I need a new SAS card, or I should bite the bullet and build a real system using recommended and supported hardware and move forward.

Hence, this post.

(By the way, others have tested the SSD I'm using at ~26,000 IOPS in an 80/20 read/write mix, which is really damn close to the testing I'm doing here. It really looks like there's a compatibility problem here somewhere, doesn't it? Or, put another way, should adding an SSD SLOG ever slow down synchronous performance? Because that's what I'm seeing with my hardware in my environment. Is there a more likely cause for this other than hardware issues?)

Don't take this as an insult, but it sounds like you don't even understand what data goes to the SLOG and what doesn't. The bottom line: if your SLOG fails suddenly and you need it, you'll lose the data in the SLOG, right? So why would you think that mirrors aren't important?
The risk as I see it is minimal. Correct my thinking here:
  1. My data isn't critical. I support web forums like this one.
  2. Loss of an SLOG will result in data loss if the system crashes before the current TXG is written to disk.
  3. The odds of the system failing within 5 seconds of the SLOG crapping out seem pretty minimal to me.
  4. If I do encounter data loss, I've got hourly backups to restore from should fsck fail me.
  5. I've been running asynchronous iSCSI for ~ 5 of the last 6 years without any real issues, even when the iSCSI server failed. I'd prefer to run synchronous, but in case of an SSD failure (and the resulting performance hit) I'm reasonably comfortable setting sync=disabled while a replacement is sent to the datacenter.
Async iSCSI should work just fine in my environment. It's worth the cost to "do it right" for me, though, so I'm willing to spend more for resiliency here, even when it increases complexity like mirrored SLOG seems to do. What I'm not willing to do is buy 2 ZeusRAM to set up as a mirrored SLOG when I haven't proven that I can even run synchronous writes with acceptable performance in my environment.

Hence, the questions.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined: Feb 6, 2014
Messages: 5,112
Still, at this point I'm thinking the cheap mount-your-SSD-in-a-PCI-slot card I used as a kludge to get an SLOG in my R710 is probably impacting performance here. If it's not the card, then it's likely the interaction between it and FreeNAS. Since I can't simply insert the SSD into the hot swap bay for diagnostic purposes because my SAS card won't recognize it, either I need a new SAS card, or I should bite the bullet and build a real system using recommended and supported hardware and move forward.

Can you link to a product page for the aforementioned "mount your SSD in a PCI slot" card? If you're talking about a card that's actually bridging the SATA interface on your SSD to a PCIe slot by means of some chip (I'm going to guess a JMicron controller) then I'll go right ahead and lay the blame on that piece of circuitry right there. Is there any way to tap it for power only and route a data cable to an onboard SATA port? The R710 should at least have some onboard SATA ports designed for boot devices.

The risk as I see it is minimal. Correct my thinking here:
  1. My data isn't critical. I support web forums like this one.
  2. Loss of an SLOG will result in data loss if the system crashes before the current TXG is written to disk.
  3. The odds of the system failing within 5 seconds of the SLOG crapping out seem pretty minimal to me.
  4. If I do encounter data loss, I've got hourly backups to restore from should fsck fail me.
  5. I've been running asynchronous iSCSI for ~ 5 of the last 6 years without any real issues, even when the iSCSI server failed. I'd prefer to run synchronous, but in case of an SSD failure (and the resulting performance hit) I'm reasonably comfortable setting sync=disabled while a replacement is sent to the datacenter.
1. If downtime and an hour of lost data aren't a big risk to you, that's your call. I would say that's a bad idea though.
2+3. Right. You'll still want to schedule some downtime to replace it though.
4. Yay for backups, but ZFS has no fsck. Remember that (quick scrub example below).
5. Again, that's down to your comfort. Having the hourly backups significantly helps your comfort level, I'd imagine.
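
(Since fsck came up: the closest thing ZFS gives you is a scrub, which you can kick off and monitor like this; "tank" is just a placeholder pool name:)

  zpool scrub tank        # walks all data, repairs blocks with bad checksums where redundancy allows
  zpool status -v tank    # shows scrub progress and lists any files with unrecoverable errors

A scrub repairs corruption it can fix from redundancy, but it won't bring back transactions that were never committed to the pool.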
 

ZFS Noob

Contributor
Joined: Nov 27, 2013
Messages: 129
HoneyBadger said:
Can you link to a product page for the aforementioned "mount your SSD in a PCI slot" card? If you're talking about a card that's actually bridging the SATA interface on your SSD to a PCIe slot by means of some chip (I'm going to guess a JMicron controller) then I'll go right ahead and lay the blame on that piece of circuitry right there. Is there any way to tap it for power only and route a data cable to an onboard SATA port? The R710 should at least have some onboard SATA ports designed for boot devices.

Here's the one I went with. There aren't a ton of options.

http://www.amazon.com/gp/product/B0096P62G6

You know, I hadn't considered your option. I've got 1-2 SATA ports on the motherboard but zero power sources. But the card provides some power... Would be cheaper for testing than buying a wholly new system, certainly.

I would say that's a bad idea though.
It doesn't strike me as a "bad idea" necessarily, but I'd argue it's contrary to best practices.

Again, it comes down to pricing on the SSDs, and since Intel doesn't seem to make SLC SSDs any more I'm not sure where I should be looking.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined: Feb 6, 2014
Messages: 5,112
Here's the one I went with. There aren't a ton of options.

http://www.amazon.com/gp/product/B0096P62G6

I'm having a hard time finding out which bridge chipset is used there, but it's either a Silicon Image or a Marvell. I'd still say that's your problem.

You know, I hadn't considered your option. I've got 1-2 SATA ports on the motherboard but zero power sources. But the card provides some power... Would be cheaper for testing than buying a wholly new system, certainly.

The only question is whether the card would supply power only or throw some kind of fit. I'm not totally familiar with the R710, but I thought there was a breakout cable available to get a couple of SATA power plugs; I know I've seen them in other Dell and HP rackmounts with hotswap bays, designed to dangle a mirrored set of boot drives internally.

It doesn't strike me as a "bad idea" necessarily, but I'd argue it's contrary to best practices.

It really depends on your/your client's tolerance for downtime on the forum. If an hour of time/posts lost isn't dangerous, then by all means go ahead. But with the cost of SSDs these days, I'd say why not build a mirrored SLOG and make it a non-issue?
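
(It's a one-liner to add, too; something along these lines, where "tank" and the gptid device names are just placeholders for your pool and your two SSDs:)

  # attach a mirrored log (SLOG) vdev to an existing pool
  zpool add tank log mirror gptid/slog-ssd-0 gptid/slog-ssd-1

Log vdevs can also be removed later with zpool remove if you change your mind.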

Again, it comes down to pricing on the SSDs, and since Intel doesn't seem to make SLC SSDs any more I'm not sure where I should be looking.

Take a look at the Intel DC S3700. Yes, it's MLC, but it's hugely overprovisioned out of the box, has extremely high write endurance, and doesn't cost a small fortune. The "value line" DC S3500 has much less endurance and doesn't make much sense financially vs. other offerings like the Seagate 600 Pro overprovisioned line (100/200/400), which also outperforms it; that would be your less expensive choice. Anything below that I'm reluctant to suggest, but I suppose you could go with offerings like the Crucial M500/Samsung 840 Pro, or even the older Intel 320. All of these drives have some manner of power-failure protection, which is important for an SLOG.
 

ZFS Noob

Contributor
Joined: Nov 27, 2013
Messages: 129
I'll look into the 3700 - thanks. One more round of testing (including a trip to the datacenter), maybe a few months of testing on non-production virtual machines, and then I'll build the production machine. At $250 per SSD, a mirrored vdev for SLOG makes sense, though if I opt for L2ARC I'll probably go single there.

Thanks for your help.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined: Feb 6, 2014
Messages: 5,112
I'll look into the 3700 - thanks. One more round of testing (including a trip to the datacenter), maybe a few months of testing on non-production virtual machines, and then I'll build the production machine. At $250 per SSD, a mirrored vdev for SLOG makes sense, though if I opt for L2ARC I'll probably go single there.

Thanks for your help.

No problem. I'm trying to get some time to redo a DL360 here with 9.2.0 - it's currently running a Solaris brew since I was playing with COMSTAR FC targets, but I've got a couple older Intel 320s I can stick in as mirrored SLOG. Can't recall exact numbers but I was definitely beating the tar out of your SSD results. What queue depth were you running fio at?
 

ZFS Noob

Contributor
Joined: Nov 27, 2013
Messages: 129
Queue depth was sixteen, which seemed high enough to give me a good feel for load. I'm no expert at this sort of testing though.

Maybe once I rebuild I'll do a complete test and graph it out.
 

cyberjock

Inactive Account
Joined: Mar 25, 2012
Messages: 19,525
I know what you're going to say: I'm doing it wrong, that test isn't telling me what I think it is, there are better tests to run, and this testing thing is nontrivial, and I should pay someone like you if I want to figure it out. We've gone over that before.
That is true. The best way to gauge performance is to use it like you intend to use it, then look at info from arcstat and other ZFS and network tools to see what is going on internally. L2ARCs also take time to "heat up", so a freshly booted server will perform very slowly until it gets going. It's like a Mack truck: once it gets up to speed it's a force to be reckoned with. This is why benchmarks are a lie. It's like the saying goes... "there are liars, damn liars, and benchmarks". If you have a server set up with a good L2ARC, it's possible to get your benchmark to give you numbers that are basically what the SSD can do, if it caches the entire benchmark data set. This is one of many reasons why benchmarking hybrid setups like ZFS with ZILs and L2ARCs is an art, and a VERY complex art. You're better off looking at how your server actually performs and doing the research to get it where you want.
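
(To put a command to that: you can literally watch the cache heat up by leaving something like this running while the server does its normal work; the 10-second interval is arbitrary, and the exact columns vary a bit by version:)

  # prints ARC (and L2ARC, if present) size and hit-rate stats every 10 seconds;
  # the hit% climbing over hours/days is the "warm up" I'm talking about
  arcstat.py 10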



(By the way, others have tested the SSD I'm using at ~26,000 IOPS in an 80/20 read/write mix, which is really damn close to the testing I'm doing here. It really looks like there's a compatibility problem here somewhere, doesn't it? Or, put another way, should adding an SSD SLOG ever slow down synchronous performance? Because that's what I'm seeing with my hardware in my environment. Is there a more likely cause for this other than hardware issues?)

It is absolutely possible for an SLOG to slow down pool performance. This is why buying the right hardware matters: the IOPS you need at the write size you need, plus other factors such as the reliability of the drive itself, whether it has a supercapacitor, and so on. There are tradeoffs. You can go with an Intel SSD model that has a supercap, or you can buy one of those ultra-expensive ZeusRAMs. They pretty much smoke SSDs on IOPS, and they have a feature equivalent to a supercap. We've seen people put 7200 RPM drives in as a ZIL and then wonder why performance dropped. Your ZIL can potentially be a limiting factor. If you use your ZIL a whole lot, that can turn into a very big deal when the ZIL can't perform.

The risk as I see it is minimal. Correct my thinking here:
  1. My data isn't critical. I support web forums like this one.
  2. Loss of an SLOG will result in data loss if the system crashes before the current TXG is written to disk.
  3. The odds of the system failing within 5 seconds of the SLOG crapping out seem pretty minimal to me.
  4. If I do encounter data loss, I've got hourly backups to restore from should fsck fail me.
  5. I've been running asynchronous iSCSI for ~ 5 of the last 6 years without any real issues, even when the iSCSI server failed. I'd prefer to run synchronous, but in case of an SSD failure (and the resulting performance hit) I'm reasonably comfortable setting sync=disabled while a replacement is sent to the datacenter.
Async iSCSI should work just fine in my environment. It's worth the cost to "do it right" for me, though, so I'm willing to spend more for resiliency here, even when it increases complexity like mirrored SLOG seems to do. What I'm not willing to do is buy 2 ZeusRAM to set up as a mirrored SLOG when I haven't proven that I can even run synchronous writes with acceptable performance in my environment.

Hence, the questions.

1. Ok
2. Yes.
3. If you are going to make that argument, then just do zfs set sync=disabled and be done with it. There are risks involved, but if you're going to argue that 5 seconds' worth of data is worth losing, then sync=disabled seems like just as logical a choice. Notice I said "just as logical" and not "logical". The logic for the decision will ultimately have to be made by you based on the value of your pool, etc.
4. There is no fsck for ZFS. Never has been, likely never will be. ZFS scrubs will fix data corruption problems. But the lost data from incompletely written txgs will literally be gone and unrecoverable.
5. This tends to go with #3. If you are willing to go with sync=disabled for a time, why not just do it all the time? iSCSI with its asynchronous mode is similar to (but not the same as) doing NFS with sync=disabled. Many people do iSCSI and love it.

From a strictly data-protection standpoint, NFS with sync=standard is the "safest". From a strictly performance angle, iSCSI with sync=standard or sync=disabled is the "fastest". As the admin, it's your decision to choose the best way to handle things.

There is no "do it right" for everyone. There's "do it right for me" and "do it right for you". I might do iSCSI, and you might want that extra protection with NFS. If I were in your shoes, based on what you've just wrote, I'd probably do iSCSI with sync=standard and do hourly snapshots to a backup server. If crap hits the fan, then accept the lost hour of data and just roll back to the snapshot(that takes seconds, literally). If the crap REALLY hits the fan and you lose the pool, then restore from backup server(which will take more time but will recover you back to a safe state). And if you keep only 8 hours worth of hourly snapshots, that's still 7 changes to roll back if the first one fails.
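
(Rough commands for that plan, with "tank/vms" standing in for whatever your dataset is actually called and the snapshot name being arbitrary:)

  # per-dataset write behavior: standard, always, or disabled
  zfs set sync=standard tank/vms

  # hourly snapshot (the FreeNAS GUI can schedule these), and the rollback if things go sideways
  zfs snapshot tank/vms@hourly-0800
  zfs rollback -r tank/vms@hourly-0800   # -r destroys any snapshots newer than the one you roll back to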
 

cyberjock

Inactive Account
Joined: Mar 25, 2012
Messages: 19,525
On a related note, I was discussing issues just like this in IRC a few nights ago, and on Skype last week. ZFS is really getting attention lately. Too many people are upset/pissed/disgusted at the complexity of ZFS. In particular, myself and others were discussing how this is going to affect ZFS' adoption. I've already met a few people that have bought pre-made servers that were improperly set up and lost data as a result. They thought they had redundancy, but because they stacked ZFS on hardware RAID, the whole house of cards came falling down. Others are complaining about performance because there's so much to learn and so much to "getting ZFS right". Overall, both of these barriers have hurt (and will hurt) ZFS' name in the future. I think it's going to turn into a world of 2 outcomes:

1. ZFS becomes more mainstream and those that get it will love it and those that didn't get it will despise it.
2. ZFS never becomes mainstream because the people that don't get it will be very vocal about their trials and tribulations with it.

Let's be honest, right now you are in #2. You've spent at least 3 months trying to get this to work, and you've invested significant money trying to get it to work. And you may turn around and give ZFS the middle finger soon and go with something else. That's completely reasonable and understandable. It would also be completely reasonable if a friend wanted to use ZFS and you were to tell him how much ZFS sucks. I wouldn't fault you for that opinion at all. ZFS does have some serious drawbacks. Either people (or technological advances) will make those problems go away, or ZFS will always have people claiming performance is horrid.

I've seen horribly designed ZFS servers (not just FreeNAS). They aren't fun to deal with. They're usually not easy to fix because fundamental things were done wrong that you can't just run a command to correct. Even when people want me to help tune their server I tell them to use it as they normally would for a day, then I take a look and make a few tweaks. Then we see how it goes for 24 hours and reevaluate. It's about getting that balance between read and write performance, reliability, and data security. It's no joke. And there has been an occasion or two where I was doing homework because even I felt clueless. This crap just isn't easy. As I've said to many people, I feel like until you've got a 4-year degree in it, you'll always feel stupid.
 

ZFS Noob

Contributor
Joined: Nov 27, 2013
Messages: 129
Let's be honest, right now you are in #2. You've spent at least 3 months trying to get this to work, and you've invested significant money trying to get it to work. And you may turn around and give ZFS the middle finger soon and go with something else. That's completely reasonable and understandable. It would also be completely reasonable if a friend wanted to use ZFS and you were to tell him how much ZFS sucks. I wouldn't fault you for that opinion at all. ZFS does have some serious drawbacks. Either people (or technological advances) will make those problems go away, or ZFS will always have people claiming performance is horrid.
I get that you feel that way, but I think you're really off-base. (And I've spent about a month on this - January was nothing but distractions. ;) )

Just pause for a second here. I know your job here on the forums is typically to respond to the same 14 complaints. Standard responses are:
  • You should have used ECC memory
  • You don't have enough memory
  • An SLOG is not a write cache and it doesn't work that way
  • No, you can't expand a RAIDZx vdev - you need to rebuild it, or add another vdev to the pool
  • Etc.
I totally get it. In my case I think you read the first couple of sentences and pegged me as the sort of user who doesn't research, doesn't read the manual, isn't willing to spend a few hours on Google, and so on.

In my case, with all the posts about this, there has been one central issue:

The SSD is not performing as an SSD should.


When I ask about it, the standard CyberJock response every time is "it's not that easy, you're doing it wrong, don't try to test because you can't, it's non-trivial, your expectations are off," and so on. So I've tried to clearly document, with frigging pictures, the problems I'm seeing in a way that would get past the knee-jerk response. Should my SSD be getting fewer IOPS than a hard drive? Should my array, running NFS sync=standard be slower with an SSD SLOG than with no SLOG? Is it reasonable to expect synchronous iSCSI to run at better than 2% the throughput of asynchronous? Is there anything obviously wrong with my setup, or is it time to look at replacing hardware (a difficult task because the server is 8 hours away from me)?

All questions that have yielded the same Cyberjock answer: you don't get it, you're doing it wrong, that test doesn't do what you think it does, you should pay me, any expectations are unreasonable, this is complex stuff with lots of moving parts that don't respond in intuitive ways, etc.

I'm a bit frustrated, to be honest. You certainly don't owe any forum members anything, and you're doing yeoman's work here, but sometimes broken hardware is broken. And questions about broken hardware shouldn't generate the sort of responses you've given here in this thread.

====

I've invested 4 hard drives, an SSD, a $30 SAS card, and a cheap SSD/PCI card into this test. Maybe that's significant. I've spent a month trying to figure out why performance is lower than everyone else seems to document when they run comparable systems. If I'd received a response like HoneyBadger offered earlier -- essentially "yeah dude, that sounds screwed up. I agree it's probably hardware" this would have been resolved a while ago. Instead you were telling me that a 50:1 performance differential on nearly empty iSCSI volumes was perfectly reasonable, and that it wasn't an indicator of a problem. I let you dissuade me from replacing (probably bad) hardware for a long time, and that's on me.

But I don't know why you're making me out to be someone who's going to talk shit about ZFS because I haven't been able to make it work (because I'm running incompatible hardware.)

But then, maybe you're still thinking I've just misconfigured the hell out of everything and that's the cause of the problem. That seems to have been your initial impression, and it's not changed...

In the end, ZFS looks like a good tool. I'm impressed with what it can do, and worst case is I'll run it as an iSCSI target and get the benefits of caching. But it should be usable running in synchronous mode, and that's all I'm looking for - usable. And ZFS can work well with synchronous writes, because lots of people here have successfully made it do so. Hell, I hope to do so myself once I work around the SSD issue.

But that "you're in the #2 category" just comes across as a personal attack from someone who tends toward the abrasive, so maybe I'm misreading things.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined: Feb 6, 2014
Messages: 5,112
In my case, with all the posts about this, there has been one central issue:

The SSD is not performing as an SSD should.

This is the best summary of things; that 530 should be performing a LOT better than you're getting.

So let's circle back here:

Since I can't simply insert the SSD into the hot swap bay for diagnostic purposes because my SAS card won't recognize it

I'm assuming this is a Dell firmware lockout thing. The latest firmware on the PERC H700 (that's what came as the option in the R710, right?) removes the third party disk block, so you should consider updating it. Use a 2.5" to 3.5" enclosure (IcyDock makes some good ones including a SAS-compatible model, WD's IcePack is also good) to line up the ports and whammo, you're in.

Of course, since you're an eight-hour drive from the DC, you don't have the luxury of easy changes, so let's try to determine if something else is the cause here first. Let me know what your network config is (NFS over 1GbE, I assume?) and I'll try to duplicate it. I can probably do that pretty quickly.
 

ZFS Noob

Contributor
Joined: Nov 27, 2013
Messages: 129
I'm assuming this is a Dell firmware lockout thing. The latest firmware on the PERC H700 (that's what came as the option in the R710, right?) removes the third party disk block, so you should consider updating it. Use a 2.5" to 3.5" enclosure (IcyDock makes some good ones including a SAS-compatible model, WD's IcePack is also good) to line up the ports and whammo, you're in.

Of course, since you're an eight-hour drive from the DC, you don't have the luxury of easy changes, so let's try to determine if something else is the cause here first. Let me know what your network config is (NFS over 1GbE, I assume?) and I'll try to duplicate it. I can probably do that pretty quickly.
I appreciate that. I actually bought a SAS/6i due to recommendations about not using PERC RAID cards, and that's what I'm running right now. Even with all drive bays empty except the one holding the SSD, it wouldn't recognize the drive, so I gave up on that option.

I'll be out of the loop until Monday, but I'll be happy to continue this then. Yes, it's GigE, with 4 ports LAGGed together. I'm running XenServer, which makes me unusual here; right now for testing I've just got a few volumes shared and a single VM on the volume, and that VM runs fio for benchmarking. I tried running tests from the command line of the FreeNAS server before, but, well, I documented the response I received above.

If there's a better way to test that completely isolates the ZFS performance from the network and any issues that are popping up there, I'm happy to run it. I'm feeling pretty confident that the issue is with hardware, though.
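
For what it's worth, I was thinking something along these lines, run from the FreeNAS shell against the dataset directly, would take the network and XenServer out of the picture entirely; the path and sizes are placeholders, and this assumes fio is even installed on the box:

  # sync (O_SYNC) random writes straight onto the pool - no NFS, iSCSI, or network involved
  fio --name=localsync --directory=/mnt/tank/testds --rw=randwrite --bs=4k \
      --size=1g --numjobs=1 --iodepth=1 --sync=1 --runtime=60 --time_based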

Have a great weekend.
 

cyberjock

Inactive Account
Joined: Mar 25, 2012
Messages: 19,525
First, this is the last time you will ever get a response from me on this forum. You just don't shut your mouth and listen. Frankly, I probably shouldn't spend the next hour responding to your post, but I will anyway. Just don't expect me to ever respond, so you shouldn't waste your time responding to me.

I get that you feel that way, but I think you're really off-base. (And I've spent about a month on this - January was nothing but distractions. ;) )

Just pause for a second here. I know your job here on the forums is typically to respond to the same 14 complaints. Standard responses are:
  • You should have used ECC memory
  • You don't have enough memory
  • An SLOG is not a write cache and it doesn't work that way
  • No, you can't expand a RAIDZx vdev - you need to rebuild it, or add another vdev to the pool
  • Etc.
I totally get it. In my case I think you read the first couple of sentences and pegged me as the sort of user who doesn't research, doesn't read the manual, isn't willing to spend a few hours on Google, and so on.

I read your entire post, beginning to end. How dare you presume I didn't read your post through when I spent over 45 minutes writing my response.

How the hell you'd get the idea that I think you are someone who hasn't done research is beyond me, when you've asked some rather detailed questions in the past (if I remember correctly) and I took my time to answer them. So no, wrong. By a mile. And it is interesting that you'll tell me I'm not reading your posts when you aren't even reading mine. I even said "you've spent over 3 months on this". I've been there and seen your problems. I've answered your questions with as much detail as I could provide without devoting my life to your server. What the heck else would you want from me instead!?

My answers are in red, while green is stuff I've already said in this thread and literally copied and pasted from above, but you didn't read it or didn't understand it. I'll let you decide which category it was for yourself, because I hadn't presumed much about you until this post except that you were determined to solve the problem. Now I presume you really have spent 3 months on information that wasn't useful to you, or you didn't read anything and just started flipping knobs. Again, not sure which.

In my case, with all the posts about this, there has been one central issue:

The SSD is not performing as an SSD should.

When I ask about it, the standard CyberJock response every time is "it's not that easy, you're doing it wrong, don't try to test because you can't, it's non-trivial, your expectations are off," and so on. So I've tried to clearly document, with frigging pictures (pictures mean nothing to me... I want nose-bleeding detail. Pretty graphs show me you've simplified it down to a few watered-down numbers that could very well mean nothing), the problems I'm seeing in a way that would get past the knee-jerk response. Should my SSD be getting fewer IOPS than a hard drive? That is possible for certain hardware in certain situations with certain ZFS tunables set. Should my array, running NFS sync=standard be slower with an SSD SLOG than with no SLOG? I said this above: "It is absolutely possible for an SLOG to slow down pool performance." I said this above, but you didn't listen to my answer the first time. So I'll say it again. IT IS POSSIBLE TO ADD AN SLOG AND MAKE YOUR SYSTEM SLOWER. Feel free to read that sentence as many times as you want until it sinks in. This is one reason why I have said again and again "you can't throw more hardware at the problem and expect it to work". People don't realize the elegance of my words, but that's okay. I've seen people spend $5000 on ZILs and L2ARCs and their system performs worse than mine just because they didn't buy appropriate hardware and configure it properly. I thought long and hard when I wrote in my noobie presentation not to add them until you know you need them. You can actually make things worse! Get it? MORE HARDWARE CAN MEAN SLOWER SERVER!!!!! Not always, and maybe not even usually (for whatever your definition of usually is). Is it reasonable to expect synchronous iSCSI to run at better than 2% the throughput of asynchronous? That is apples to oranges. It depends on a few factors that you have simplified out of your question. And, for FreeNAS, there are no "synchronous writes for iSCSI". At all. FreeNAS' iSCSI implementation has no sync write support. You can do something kind of like sync writes with sync=always on the pool/dataset. But again, it's not exactly the same. And if you read my post above I said "iSCSI with its asynchronous mode is similar to (but not the same as) doing NFS with sync=disabled", so where you'd think that iSCSI with sync=always is the same as what you call "synchronous writes for iSCSI" is a bit beyond me right now (but it gives me an indicator of what your knowledge level is). And if you had stopped and read my post I even said "Is there anything obviously wrong with my setup, or is it time to look at replacing hardware (a difficult task because the server is 8 hours away from me)?

All questions that have yielded the same Cyberjock answer: you don't get it (and you don't get it... see all that red and green, and my comments about you simplifying things down too much?), you're doing it wrong (you're clearly throwing hardware at it, seeing performance go down, and are shocked... so yes, you have demonstrated to me you are doing it wrong by the fact that you are even asking whether adding an SLOG can slow down performance), that test doesn't do what you think it does (again, I said that above, so good job at saying what I already said), you should pay me (find the post where I said "you should pay me". I offer my services to those interested, but I do NOT ever take the stance that you *SHOULD* pay me. And frankly, if you called me right now I'd charge triple just for the "should" comment. But the fact that you took me offering my services to mean you "should" pay me just demonstrates to me that you likely leap to incorrect conclusions regularly), any expectations are unreasonable (for you, right now, with your level of understanding, very possibly; see the end of this post), this is complex stuff with lots of moving parts that don't respond in intuitive ways (yep, and with clear questions being asked like "Should my array, running NFS sync=standard be slower with an SSD SLOG than with no SLOG?" it is clear that you do not understand these non-intuitive ways), etc.

You see, based on your responses you've validated to me the stuff I'm saying. And you're pissed because I'm not writing a book with the answers. I can tell 2 things from your questions and what you say.

1. You have a very very long way to go to understand this stuff.
2. You're pissed and just want the answer. They don't come easy, and that should have been completely clear when I said above...

Even when people want me to help tune their server I tell them to use it as they normally would for a day, then I take a look and make a few tweaks. Then we see how it goes for 24 hours and reevaluate. It's about getting that balance between read and write performance, reliability, and data security. It's no joke. And there has been an occasion or two where I was doing homework because even I felt clueless.

THIS SHIT ISN'T A F'IN JOKE. I EVEN ADMITTED IT'S NOT A ONE AND DONE THING.

I'm a bit frustrated, to be honest. As I said above, completely normal and expected. Even I get frustrated with it, and I even implied as much when I said I have to go to the books. And you will continue to be frustrated until you stop and actually read what I said and see that I said those things not because everyone does them, but because YOU are doing them, and YOU are demonstrating that you are doing them with what you say and ask. You certainly don't owe any forum members anything, and you're doing yeoman's work here, but sometimes broken hardware is broken. And questions about broken hardware shouldn't generate the sort of responses you've given here in this thread. And there's a difference between broken hardware and an admin that doesn't understand what he is doing (won't cite anything, I think all the red/green above speaks to that).

I've invested 4 hard drives, an SSD, a $30 SAS card, and a cheap SSD/PCI card into this test. Maybe that's significant. I've spent a month trying to figure out why performance is lower than everyone else seems to document when they run comparable systems. If I'd received a response like HoneyBadger offered earlier -- essentially "yeah dude, that sounds screwed up. I agree it's probably hardware" this would have been resolved a while ago. Instead you were telling me that a 50:1 performance differential on nearly empty iSCSI volumes was perfectly reasonable, and that it wasn't an indicator of a problem. I let you dissuade me from replacing (probably bad) hardware for a long time, and that's on me. You know, I can change 2 tunables and make any SSD do 3 IOPS. No joke. I even did it to a friend that had 2x500GB SSDs, and he knew I was joking with him when I said "Your SSD is broken, it's doing 3 IOPS, you should give it to me". I've shown that to people. If you read up on IOPS it's incredibly vague, and one IOPS =/= one IOPS. In your case with NFS, you've got the VM's IOPS, the NFS IOPS, and the zpool's IOPS. And this will blow your mind, but one VM IO =/= 1 NFS IO and 1 NFS IO =/= 1 zpool IO. And it appears you can change what an NFS IO is by changing..... wait for it..... your network packet size. Nothing to do with ZFS at all, but in some rare cases it has a profound effect on performance. So when I say it's more complex than that and other stuff that pisses you off, I'm not talking trash out of my ass. I'm dead serious (or dead wrong). You saying your SSD doesn't get enough IOPS is completely and utterly useless to me. You provided far from enough information to prove that what you are saying is diagnostically valid, at all, or that your benchmark values should be comparable to his. If you want your IO values to mean something, you must provide detail (that stuff you aren't doing). You'd have to show what you are testing, how you are testing it, what the parameters are, and every piece of information that could affect the test result. This is up to and including things such as ALL of your ZFS parameters, tunables, sysctls, hardware versions and settings (BIOS/SAS/SATA settings, etc., etc.), the settings of those hardware versions, your benchmark software version and the settings used. Basically every single little thing that has the propensity to change the outcome of a benchmark. And, if you want to compare your settings to Joe-Blow's settings, you'd need all of his stuff to compare. Suddenly you see that Joe-Blow probably didn't give you all of that information, which is why I knew with 100% certainty back when you first mentioned it here that your numbers meant nothing. And that is precisely why I say things like "This is one of many reasons why benchmarking hybrid setups like ZFS with ZILs and L2ARCs is an art, and a VERY complex art" above! And you clearly haven't grasped enough of the fundamentals to respect the elegance of what I said in just a few words. You may very well have a hardware issue. But you've definitely proven that you don't really understand what you are doing from all the stuff in this post, and I don't just dismiss hardware out of hand, especially when someone doesn't really understand the basics.

And I'm not going to write a book for you. Frankly, it's not worth my 8 hours (if I got lucky, only 8 hours) to try to explain your exact situation to you, as everyone will want that, and I do have a life. So I offer my services to those that want it. It makes me a little money to afford hardware to learn more about ZFS and help even more people, and I don't spend my whole life typing crap from my keyboard for a bunch of people, half of whom will be ungrateful for the amount of knowledge I have and the number of hours I've spent reading ZFS books, blogs, and whatnot on this thing called ZFS.

But I don't know why you're making me out to be someone who's going to talk shit about ZFS because I haven't been able to make it work (because I'm running incompatible hardware.) It was a hypothetical situation about what a possible outcome might be, to demonstrate the uphill battle for ZFS in the future. And I even said it would be completely understandable. And while I'm on the topic, stop with that bullshit throwback to "because I'm running incompatible hardware" because frankly, you've left out details (isn't this the 3rd time I've said this?) like your hardware build for the server, or any information that might be useful. You could have 2GB of RAM for all I know on a Pentium 4. So yet again, and again, and again, you are validating to me (and to those that know this stuff) that your lack of being detail-oriented is almost certainly a problem. DETAILS ARE SUPER IMPORTANT, regardless of what they are for. And when people don't provide every single detail (which most people don't), it's a dead giveaway that they didn't mention some aspects of their setup because they consider them to be "inconsequential". That's wrong pretty close to 100% of the time, including here. When someone shows up and dumps a boatload of details, those details tell me 3 things:

1. They are detail oriented. (very very very very important for ZFS)
2. They are providing their values to show what they are currently using because they know those values are important (although they may not understand all of the relationships).
3. They have done their homework to the point of realizing those values are important.

No details, then I know where the knowledge and detail level currently is.

But then, maybe you're still thinking I've just misconfigured the hell out of everything and that's the cause of the problem. That seems to have been your initial impression, and it's not changed... You're absolutely right. You've provided no ZFS tunables, hardware stats, nothing to even validate that you are doing something remotely reasonable with your system.

In the end, ZFS looks like a good tool. I'm impressed with what it can do, and worst case is I'll run it as an iSCSI target and get the benefits of caching. But it should be usable running in synchronous mode, and that's all I'm looking for - usable. No, it shouldn't "be usable running in synchronous mode" for everyone. The default ZFS tunables are set up to be "good for many people". As soon as you start doing things like NFS with VMware ESXi, you are not in the same ballpark as the people that fall into the "good for many people" category. It was documented so years ago by Sun! They even said that you may need to tune your server depending on the workload. That will never ever change with ZFS (leading back to my discussion of ZFS' uphill battle in the future). Not to mention the devil is in the details, none of which you've provided. "Usable" varies widely from one person to another. I run a single VM, so usable to me is a very low bar. If you ran 30 VMs on my setup you'd be looking for the nearest bridge to jump off of. My server was not designed for heavy I/O, on purpose. I don't need heavy I/O. I want to stream a movie or two in my house and have access to my pictures and documents when I need them. There are also varied definitions of "synchronous", as VMware's NFS is excessively aggressive with the sync writes while Xen isn't. And there's a difference between NFS sync writes and something like iSCSI writes (remember, there are no sync writes for FreeNAS' iSCSI) becoming synchronous because of sync=always. Small (but important) differences that can add up in some situations (again, not explaining details because I don't want to write a book). I know, you've probably tuned out already, but that's the reality of it. And ZFS can work well with synchronous writes, because lots of people here have successfully made it do so. (Because they tuned theirs appropriately or had a low bar... which one was it for them, and which one is it for you? Are you sure your situation and theirs were the same?) Hell, I hope to do so myself once I work around the SSD issue.

But that "you're in the #2 category" just comes across as a personal attack from someone who tends toward the abrasive, so maybe I'm misreading things. You aren't misreading things, but its not a personal attack either. It was simply meant to demonstrate you've been working on this for a while and still have no answer. There's tons of other people just like you out there too. And if the forums are any indicator, 90% of users that have your problem give up and walk away with the answer of "ZFS is sh*t". Guess what, it can certainly be a pile of steaming poop. And as I've said before, its a matter of spending the time to learn how this stuff works. Many won't get it. And if only 1 out of 1000 people gets it then you'll have 999 people complaining while 1 person thinks its great. Guess what message will end up getting around the most? That ZFS sucks. When all but the minority get something, the common opinion is that it sucks.. And if I spent 3 months of my life on a project(any project) and had nothing to show for it at the end, I'd be unhappy. And if someone asked me what I thought about said project, I'd probably tell them it was absolutely horrible and I never want to talk about it again. They'd probably take my opinion and walk away and just say that said project sucks to their friends too. That's not being an ass, that's not unreasonable, that's user reviews of said project. And that's valued, significantly! Surely when you go to buy something online, you read and take the reviewers seriously. Some are people complaining about "UPS was a day late" and give it 1 star,others may give it 1 star and have very valid reasons for complaining.

But to use the "UPS was a day late" analogy above, your complaints about the SSD being "too slow" are analogous to giving the drive 1 star because the UPS guy was a day late. Your argument for why the drive has a problem is not validated by any facts that I can see, your details are non-existent, and your level of knowledge is a bit low for what you are jumping into (hence the frustration).

So here we are at the end of the post. So here's the question for you regarding the whole "expectations" thing. This is rhetorical, by the way. How many IOPS do you *think* you should be getting? 5000? 10000? 50000? Now, how many do you *think* you could get if you were a ZFS wizard? 50000? Now, what if I told you that if you did a zpool entirely in RAM you'd only get to 60k or so (got this number from IRC yesterday)? So think about that in relation to "how fast" you think your pool should be, and the orders of magnitude faster RAM is compared to hard drives and the relatively pathetically slow SATA connection.
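
(If you ever want to see that ceiling for yourself, a throwaway RAM-backed pool is easy enough to stand up on FreeBSD; the size and names here are arbitrary, the md unit number depends on what mdconfig hands back, and obviously don't put real data on it:)

  mdconfig -a -t swap -s 4g     # creates a 4GB memory-backed disk and prints its name, e.g. md0
  zpool create ramtest md0      # scratch pool, mounted at /ramtest by default
  # ...run your benchmark against /ramtest, then tear it down:
  zpool destroy ramtest
  mdconfig -d -u 0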

I have no doubt that given enough time, anyone can figure it out. Some perhaps in a month or two, others might need a few years. The 2 things I had to do when I started reading about ZFS were:

1. Stop jumping to conclusions about the next piece of the puzzle internal to ZFS. Just read the documents and take them for what they are. If they don't say it, then it doesn't happen that way. PERIOD. I made many conclusions at the beginning. You start reading something and immediately try to put the puzzle together. That's probably the worst thing you could do. You gotta read up on all of this stuff, THEN try putting the puzzle together. My guess is 99% of people that give up want the quick and dirty answer to the puzzle, so they start reading and immediately start trying to put the puzzle pieces together. After all, if you get the puzzle together faster you can go do something else, right? Except nobody gets the puzzle together right the first time. Then they get angry because they are lost and confused.
2. Pay serious, serious attention to detail with ZFS. ZFS was written with attention to detail. This is why there is nothing else like ZFS. And if you can't get on the same level as the ZFS developers that made it, you'll never figure it out.

Fair winds and following seas to you sir!

If I had a flowchart for how ZFS works (excluding anything related to networking), it would probably be more complex than something that looks like this... http://img.scoop.co.nz/stories/images/0907/b7913538ef620feb92dd.jpeg

So don't think this is easy; it's not. My answers are intentionally simple (which you and others are no doubt taking the wrong way) because I don't really want to spend my life on questions.

Edit: Now check out this thread, which I saw about 10 minutes after yours: http://forums.freenas.org/threads/bad-performance-with-mirrored-ssds-as-slog.18262/

He provided hardware details, and he even provided some of the more important tunable values. Just based on that one post he's shown me what he considers to be important tunable values, what they are, what he's done, and his performance. And if you watch, I'll ask him some questions too, because he's providing benchmark numbers that aren't apples to oranges. It's not a "great" start, but it's far, far better and at half the length of your first post. He didn't include a picture either. See the contrast between the posts?
 