First, this is the last time you will
ever get a response from me on this forum. You just won't shut your mouth and listen. Frankly, I probably shouldn't spend the next hour responding to your post, but I will anyway. Just don't expect me to ever respond again, so don't waste your time responding to me.
I get that you feel that way, but I think you're really off-base. (And I've spent about a month on this - January was nothing but distractions. ;) )
Just pause for a second here. I know your job here on the forums is typically to respond to the same 14 complaints. Standard responses are:
- You should have used ECC memory
- You don't have enough memory
- An SLOG is not a write cache and it doesn't work that way
- No, you can't expand a RAIDZx vdev - you need to rebuild it, or add another vdev to the pool
- Etc.
I totally get it. In my case I think you read the first couple of sentences and pegged me as the sort of user who doesn't research, doesn't read the manual, isn't willing to spend a few hours on Google, and so on.
I read your entire post, beginning to end. How dare you presume I didn't read your post through when I spent over 45 minutes writing my response.
How the hell you'd get the idea that I think you're someone who hasn't done research is beyond me, when you've asked some rather detailed questions in the past (if I remember correctly) and I took my time to answer them. So no, wrong. By a mile. And it is interesting that you'll tell me I'm not reading your posts when you aren't even reading mine. I even said "you've spent over 3 months on this". I've been there and seen your problems. I've answered your questions with as much detail as I could provide without devoting my life to your server. What the heck else would you want from me!?
My answers are in red, while green is stuff I've already said in this thread and literally copied and pasted from above, but you didn't read it or didn't understand it. I'll let you decide which category it was for yourself, because I hadn't presumed much about you until this post except that you were determined to solve the problem. Now I presume you really have spent 3 months on information that wasn't useful to you, or you didn't read anything and just started flipping knobs. Again, not sure which.
In my case, with all the posts about this, there has been one central issue:
The SSD is not performing as an SSD should.
When I ask about it, the standard CyberJock response every time is "it's not that easy, you're doing it wrong, don't try to test because you can't, it's non-trivial, your expectations are off," and so on. So I've tried to clearly document,
with frigging pictures(pictures mean nothing to me... I want nose-bleeding detail. Pretty graphs show me you've simplified it down to a few watered down numbers that could very well mean nothing), the problems I'm seeing in a way that would get past the knee-jerk response. Should my SSD be getting fewer IOPS than a hard drive?
That is possible for certain hardware in certain situations with certain ZFS tunables set. Should my array, running NFS sync=standard be
slower with an SSD SLOG than with no SLOG?
I said this above: "It is absolutely possible for an SLOG to slow down pool performance." But you didn't listen to my answer the first time, so I'll say it again. IT IS POSSIBLE TO ADD AN SLOG AND MAKE YOUR SYSTEM SLOWER. Feel free to read that sentence as many times as you want until it sinks in. This is one reason why I have said again and again "you can't throw more hardware at the problem and expect it to work". People don't realize the elegance of my words, but that's okay. I've seen people spend $5000 on ZILs and L2ARCs and their system performs worse than mine, just because they didn't buy appropriate hardware and configure it properly. I put serious thought into my noobie presentation when I wrote not to add them until you know you need them. You can actually make things worse! Get it? MORE HARDWARE CAN MEAN SLOWER SERVER!!!!! Not always, and maybe not even usually (for whatever your definition of usually is).
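If you actually want to see whether the SLOG device is hurting you, here's a rough sketch of how I'd check it. The pool/dataset names and the SLOG device label below are made up, substitute your own, and don't leave sync=disabled set on anything you care about:

    # Sanity-check the raw device first, outside of ZFS
    diskinfo -t /dev/ada4          # naive seek/transfer benchmark of the bare SSD
    smartctl -a /dev/ada4          # look for a dying drive before blaming ZFS

    # See what sync setting the dataset is actually using
    zfs get sync tank/vm_nfs

    # Baseline: run your NFS test with the SLOG attached, then pull it and re-test
    zpool remove tank gpt/slog0    # log devices can be removed live
    # ...re-run the same test...
    zpool add tank log gpt/slog0   # put it back when you're done

    # For comparison only (UNSAFE for real data): take sync writes out of the picture
    zfs set sync=disabled tank/vm_nfs
    # ...re-run the same test, then put it back...
    zfs set sync=standard tank/vm_nfs

If sync=disabled is fast, and sync=standard with the SLOG is slower than sync=standard without it, then the SLOG device (or how it's attached) is your suspect. That's the kind of controlled comparison I can actually do something with.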
Is it reasonable to expect synchronous iSCSI to run at better than 2% the throughput of asynchronous?
That is apples to oranges. It depends on a few factors that you have simplified out with your question. And, for FreeNAS, there are no "synchronous writes for iSCSI". At all. FreeNAS' iSCSI implementation has no SYNC write support. You can do something kind of like sync writes with sync=always for the pool/dataset. But again, it's not exactly the same. And if you read my post above I said "iSCSI with its asynchronous mode is similar (but not the same) as doing NFS with sync=disabled", so why you'd think that iSCSI with sync=always is the same as what you call "synchronous writes for iSCSI" is a bit beyond me right now (but it gives me an indicator of what your knowledge level is). And if you had stopped and read my post I even said...
Is there anything obviously wrong with my setup, or is it time to look at replacing hardware (a difficult task because the server is 8 hours away from me)?
All questions that have yielded the same Cyberjock answer: you don't get it
(and you don't get it.. see all that red and green, and my comments like you simplified down too much?), you're doing it wrong
(you're clearly throwing hardware at it and seeing performance go down and are shocked... so yes, you have demonstrated to me you are doing it wrong by the fact that you are even asking the question of whether adding an SLOG can slow down performance), that test doesn't do what you think it does
(again, I said that above, so good job at saying what I already said), you should pay me
(Find the post where I said "you should pay me". I offer my services to those interested, but I do NOT ever take the stance that you *SHOULD* pay me. And frankly, if you called me right now I'd charge triple just for the "should" comment. But the fact that you turned my offering my services into "you should pay me" just demonstrates to me that you likely leap to incorrect conclusions regularly), any expectations are unreasonable
(for you, right now, with your level of understanding, very possible, see the end of this post), this is complex stuff with lots of moving parts that don't respond in intuitive ways
(yep, and with clear questions being asked like "Should my array, running NFS sync=standard be slower with an SSD SLOG than with no SLOG?" it is clear that you do not understand these non-intuitive ways), etc.
You see, your responses have validated the stuff I'm saying. And you're pissed because I'm not writing a book with the answers. I can tell 2 things from your questions and what you say.
1. You have a very very long way to go to understand this stuff.
2. You're pissed and just want the answer. They don't come easy, and that should have been completely clear when I said above...
Even when people want me to help tune their server I tell them to use it as they normally would for a day, then I take a look and make a few tweaks. Then we see how it goes for 24 hours and reevaluate. It's about getting that balance between read and write performance, reliability, and data security. It's no joke. And there has been an occasion or two where I was doing homework because even I felt clueless.
THIS SHIT ISN'T A F'IN JOKE. I EVEN ADMITTED IT'S NOT A ONE AND DONE THING.
I'm a bit frustrated, to be honest.
As I said above, completely normal and expected. Even I get frustrated with it, and I even implied as much when I said I have to go to the books. And you will continue to be frustrated until you stop and actually read what I said, and see that I said those things not because everyone does them, but because YOU are doing them, and YOU are demonstrating that you are doing them with what you say and ask.
You certainly don't owe any forum members anything, and you're doing yeoman's work here, but sometimes broken hardware is broken. And questions about broken hardware shouldn't generate the sort of responses you've given here in this thread.
And there's a difference between broken hardware and an admin that doesn't understand what he is doing (won't cite anything, I think all the red/green above speaks to that).
I've invested 4 hard drives, an SSD, a $30 SAS card, and a cheap SSD/PCI card into this test. Maybe that's significant. I've spent a month trying to figure out why performance is lower than everyone else seems to document when they run comparable systems. If I'd received a response like HoneyBadger offered earlier -- essentially "yeah dude, that sounds screwed up. I agree it's probably hardware" this would have been resolved a while ago. Instead you were telling me that a 50:1 performance differential on nearly empty iSCSI volumes was perfectly reasonable, and that it wasn't an indicator of a problem. I let you dissuade me from replacing (probably bad) hardware for a long time, and that's on me.
You know, I can change 2 tunables and make any SSD do 3 IOPS. No joke. I even did it to a friend that had 2x500GB SSDs, and he knew I was joking with him when I said "Your SSD is broken, it's doing 3 IOPS, you should give it to me". I've shown that to people. If you read up on IOPS, the term is incredibly vague and one IOPS =/= one IOPS. In your case with NFS, you've got the VM's IOPS, the NFS IOPS, and the zpool's IOPS. And this will blow your mind, but one VM IO =/= 1 NFS IO and 1 NFS IO =/= 1 zpool IO. And it appears you can change what an NFS IO is by changing..... wait for it..... your network packet size. Nothing to do with ZFS at all, but in some rare cases it has a profound effect on performance. So when I say it's more complex than that and other stuff that pisses you off, I'm not talking trash out of my ass. I'm dead serious (or dead wrong).

You saying your SSD doesn't get enough IOPS is completely and utterly useless to me. You provided far from enough information to prove that what you are saying is diagnostically valid at all, or that your benchmark values should be comparable to anyone else's. If you want your IO values to mean something, you must provide detail (that stuff you aren't doing). You'd have to show what you are testing, how you are testing it, what the parameters are, and every piece of information that could affect the test result. This is up to and including ALL of your ZFS parameters, tunables, sysctls, hardware versions and their settings (BIOS/SAS/SATA settings, etc., etc.), and your benchmark software version and the settings used. Basically every single little thing that has the propensity to change the outcome of a benchmark. And if you want to compare your settings to Joe-Blow's settings, you'd need all of his stuff to compare. Suddenly you see that Joe-Blow probably didn't give you all of that information, which is why I knew with 100% certainty back when you first mentioned it here that your numbers meant nothing. And that is precisely why I say things like "This is one of many reasons why benchmarking hybrid setups like ZFS with ZILs and L2ARCs is an art, and a VERY complex art" above! And you clearly haven't grasped enough of the fundamentals to respect the elegance of what I said in just a few words.

You may very well have a hardware issue. But you've definitely proven that you don't really understand what you are doing from all the stuff in this post, and I don't just dismiss hardware out of hand, especially when someone doesn't really understand the basics.
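If you want a starting point for seeing how those layers stack up, here's a rough sketch of what I'd watch while a test is running (the pool name is made up, and the exact output varies between FreeNAS versions):

    # What the pool and each vdev/disk are actually doing, refreshed every second
    zpool iostat -v tank 1

    # Per-disk latency and busy% at the GEOM layer (watch the SLOG device's ms/w)
    gstat

    # NFS server operation counters; capture before and after a test and compare,
    # so you can see how many NFS ops turned into how many pool ops
    nfsstat -s

Three different layers, three different sets of numbers, and none of them will match each other one-for-one. That's my point.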
And I'm not going to write a book for you. Frankly, it's not worth my 8 hours (if I got lucky, only 8 hours) to try to explain your exact situation to you, because then everyone will want that, and I do have a life. So I offer my services to those that want it. Makes me a little money to afford hardware to learn more about ZFS and help even more people, and I don't spend my whole life typing crap from my keyboard for a bunch of people, half of whom will be ungrateful for the amount of knowledge I have and the number of hours I've spent reading ZFS books, blogs, and what not on this thing called ZFS.
But I don't know why you're making me out to be someone who's going to talk shit about ZFS because I haven't been able to make it work (because I'm running incompatible hardware.)
It was a hypothetical situation of what a possible outcome might be, to demonstrate the uphill battle ZFS faces in the future. And I even said it would be completely understandable. And while I'm on the topic, stop with that bullshit throwback to "because I'm running incompatible hardware" because frankly, you've failed to provide details (isn't this the 3rd time I've said this?) like your hardware build for the server, or any other information that might be useful. You could have 2GB of RAM on a Pentium 4 for all I know. So yet again, and again, and again, you are validating to me (and to those that know this stuff) that your lack of being detail-oriented is almost certainly a problem. DETAILS ARE SUPER IMPORTANT, regardless of what they are for. And when people don't provide every single detail (which most people don't), it's a dead giveaway that they didn't mention some aspects of their setup because they consider them to be "inconsequential". That's wrong pretty close to 100% of the time, including here. When someone shows up and dumps a boatload of details (I'll sketch the kind of dump I mean right after this list), those details tell me 3 things:
1. They are detail oriented. (very very very very important for ZFS)
2. They are providing their values to show what they are currently using because they know those values are important (although they may not understand all of the relationships)
3. They have done their homework to the point of realizing those values are important.
No details? Then I know exactly where the knowledge and attention to detail currently sit.
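For what it's worth, if someone wants to show up with the kind of detail that actually means something, something along these lines is a decent start (the pool and dataset names are placeholders, and this isn't an exhaustive list):

    zpool status -v tank                 # pool layout, vdev types, errors
    zpool get all tank                   # pool-level properties
    zfs get all tank/vm_nfs              # dataset properties (sync, compression, recordsize, ...)
    sysctl vfs.zfs                       # every ZFS tunable currently in effect
    sysctl kstat.zfs.misc.arcstats.size  # how big the ARC actually is right now
    camcontrol devlist                   # what controllers and drives the OS actually sees
    sysctl hw.physmem hw.model           # RAM and CPU, so nobody has to guess

Plus the exact benchmark command and version you ran, and how full the pool was when you ran it. That's the level of detail I mean.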
But then, maybe you're still thinking I've just misconfigured the hell out of everything and that's the cause of the problem. That seems to have been your initial impression, and it's not changed...
You're absolutely right. You've provided no ZFS tunables, no hardware stats, nothing to even validate that you are doing something remotely reasonable with your system.
In the end, ZFS looks like a good tool. I'm impressed with what it can do, and worst case is I'll run it as an iSCSI target and get the benefits of caching. But it should be usable running in synchronous mode, and that's all I'm looking for - usable.
No, it shouldn't "be usable running in synchronous mode" for everyone. The default ZFS tunables are setup for a "good for many people". As soon as you start doing things like NFS with VMWare ESXi, you are not in the same ballpark as the people that fall into the "good for many people" category. It was documented so years ago by Sun! They even said that you may need to tune your server depending on the workload. That will never ever change with ZFS(leading back to my discussion of ZFS' uphill battle in the future). Not to mention the devil is in the details, none of which you've provided. "usable" varies widely from one person to another. I run a single VM, so usable to me is a very low bar. If you ran 30 VMs on my setup you'd be looking for the nearest bridge to jump off of. My server was not designed for heavy I/O, on purpose. I don't need heavy I/O. I want to stream a movie or two in my house and have access to my pictures and documents when I need it. There's also varied definitions of "synchronous" as VMWare's NFS is excessively agressive with the sync writes while Xen isn't. And there's a difference between NFS sync writes and something like iSCSI writes(remember, there is no SYNC writes for FreeNAS' iSCSI) becoming synchronous because of sync=enabled. Small(but important) differences that can add up in some situations(again, not explaining details because I don't want to write a book). I know, probably tuned out already, but that's the reality of it. And ZFS can work well with synchronous writes, because lots of people here have successfully made it do so.
(Because they tuned theirs appropriately or had a low bar.. which one was it for them, and which one is it for you? Are you sure your situation and theirs were the same?)
Hell, I hope to do so myself once I work around the SSD issue.
But that "you're in the #2 category" just comes across as a personal attack from someone who tends toward the abrasive, so maybe I'm misreading things.
You aren't misreading things, but it's not a personal attack either. It was simply meant to demonstrate that you've been working on this for a while and still have no answer. There's tons of other people just like you out there too. And if the forums are any indicator, 90% of users that have your problem give up and walk away with the answer of "ZFS is sh*t". Guess what, it can certainly be a pile of steaming poop. And as I've said before, it's a matter of spending the time to learn how this stuff works. Many won't get it. And if only 1 out of 1000 people gets it, then you'll have 999 people complaining while 1 person thinks it's great. Guess what message will end up getting around the most? That ZFS sucks. When only a minority gets something, the common opinion is that it sucks.

And if I spent 3 months of my life on a project (any project) and had nothing to show for it at the end, I'd be unhappy. And if someone asked me what I thought about said project, I'd probably tell them it was absolutely horrible and I never want to talk about it again. They'd probably take my opinion, walk away, and tell their friends that said project sucks too. That's not being an ass, that's not unreasonable, that's a user review of said project. And those are valued, significantly! Surely when you go to buy something online, you read and take the reviewers seriously. Some are people complaining that "UPS was a day late" and give it 1 star; others may give it 1 star and have very valid reasons for complaining.
But to use the "UPS was a day late analogy" above, you're complaints about the SSD being "too slow" is analogous to giving the hard drive 1 star because the UPS guy was a day late. Your argument for why the drive has a problem is not validated by any facts that I can see, your details are non-existent, and your level of knowledge is a bit low for what you are jumping into(hence the frustration).
So here we are at the end of the post. Here's the question for you regarding the whole "expectations" thing. This is rhetorical, by the way. How many IOPS do you *think* you should be getting? 5000? 10000? 50000? Now, how many do you *think* you could get if you were a ZFS wizard? 50000? Now, what if I told you that if you did a zpool entirely in RAM you'd only get to 60k or so (got this number from IRC yesterday)? So think about that in relation to "how fast" you think your pool should be, and how many orders of magnitude faster RAM is compared to hard drives and the relatively pathetically slow SATA connection.
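If you don't believe the RAM-pool point, you can get a rough feel for it yourself on FreeBSD with a throwaway memory-backed pool (the pool name and size here are made up, and obviously don't put anything you care about on it):

    # Create a 4GB swap/memory-backed disk; mdconfig prints the device name (e.g. md0)
    mdconfig -a -t swap -s 4g

    # Build a throwaway pool on it and run your benchmark against /ramtest
    zpool create ramtest md0
    # ...run the same test you ran against your real pool...

    # Tear it all down when you're done
    zpool destroy ramtest
    mdconfig -d -u 0

Even with no disks involved at all, you won't see the kind of numbers most people assume they're owed. That's the point of the 60k figure.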
I have no doubt that given enough time, anyone can figure it out. Some perhaps in a month or two, others might need a few years. The 2 things I had to do when I started reading about ZFS were:
1. Stop jumping to conclusions about the next piece of the puzzle internal to ZFS. Just read the documents and take them for what they are. If they don't say it, then it doesn't happen that way. PERIOD. I jumped to many conclusions at the beginning. You start reading something and immediately try to put the puzzle together. That's probably the worst thing you could do. You gotta read up on all of this stuff, THEN try putting the puzzle together. My guess is 99% of people that give up want the quick and dirty answer to the puzzle, so they start reading and immediately start trying to put the puzzle pieces together. After all, if you get the puzzle together faster you can go do something else, right? Except nobody gets the puzzle together right the first time. Then they get angry because they are lost and confused.
2. Better pay serious, serious attention to detail with ZFS. ZFS was written with attention to detail. This is why nothing else out there is like it. And if you can't get on the same level as the ZFS developers that made it, you'll never figure it out.
Fair winds and following seas to you sir!
If I had a flowchart for how ZFS works (excluding anything related to networking), it would probably be more complex than something that looks like this...
http://img.scoop.co.nz/stories/images/0907/b7913538ef620feb92dd.jpeg
So don't think this is easy; it's not. My answers are intentionally simple (which you and others are no doubt taking the wrong way) because I don't really want to spend my life on questions.
Edit: Now check out this thread that I saw about 10 minutes after yours..
http://forums.freenas.org/threads/bad-performance-with-mirrored-ssds-as-slog.18262/
He provided hardware, and he even provided some of the more important tunable values. Just based on that one post he's shown me what he considers to be important tunable values, what they are set to, what he's done, and his performance. And if you watch, I'll ask him some questions too, because he's providing benchmark numbers that aren't apples to oranges. It's not a "great" start, but it's far, far better, at half the length of your first post. He didn't include a picture either. See the contrast between the posts?