Another person banging his head against iSCSI

Status
Not open for further replies.

wintermute000

Explorer
Joined
Aug 8, 2014
Messages
83
Running 9.2.1.7, specs in sig

Bog standard iSCSI config using all defaults
Presenting iSCSI to an ESXi environment (two ESXi 5.5 hosts managed by vCenter).
A dedicated iSCSI pNIC/vmnic goes to its own single vSwitch and a single dedicated physical switch, no jumbo frames. FreeNAS likewise has a dedicated interface tied to the target.

Have tried both a zvol and a device extent.

Speed with a zvol on a dedicated SSD: ~2MB/s. Yes, pathetic.
Speed with a device extent on a dedicated SSD: around the same.
I am observing speeds with systat -ifstat 1 and the reporting graphs. Just deploying a thin-provisioned 500MB disk (thin size) takes something like 15 minutes, which is ridiculous.

I can max gigabit with normal CIFS to ZFS, so I know it's not a general performance/RAM/CPU/NIC driver issue.

Have tried ticking the experimental option and multithreading, no dice.
Tried raising block size to 1024, no dice.

Really reaching out for help, as it's basically unusable in this state. It's only a lab environment, fortunately, but I did spend a lot of $$$ on this FreeNAS build, so if I can't even use it for iSCSI labbing that's a huge black mark.
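
A raw network test on the dedicated iSCSI segment is one way to rule the link itself in or out before touching ZFS. A sketch, assuming iperf is available on both ends; the address is a placeholder for the iSCSI-portal IP:

    # On the FreeNAS box, listen on the dedicated iSCSI interface
    iperf -s -B 192.168.99.10

    # From a machine on the same iSCSI segment (e.g. a VM with a vNIC on that vSwitch)
    iperf -c 192.168.99.10 -t 30 -i 5

    # A clean gigabit path should report roughly 940 Mbit/s; anything far below that
    # points at the NIC/driver/switch layer rather than ZFS or the iSCSI target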
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
Striped mirrors for your pool for max IOPS and one of your SSDs as an under-provisioned SLOG are a start. If it's just a lab, sync=disabled may get you where you need to go; if it's production or real data, not a good idea. It's the forced syncs that are killing you, not the speed of the pool, so moving the extent to SSD doesn't gain all that much.

It's a starting point anyway. It is a shame there isn't a really good answer for this. Even the uber spec'd machines don't really seem to do that great. But you'll do better.
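
A minimal sketch of that layout, with made-up pool, device, and dataset names:

    # Striped mirrors: two mirror vdevs striped together, for IOPS
    zpool create tank mirror da1 da2 mirror da3 da4

    # Add one SSD as a dedicated log device (ideally under-provisioned and power-loss protected)
    zpool add tank log da5

    # Lab-only shortcut: stop honoring sync writes on the iSCSI dataset/zvol
    # (a crash or power loss can throw away in-flight writes, so not for real data)
    zfs set sync=disabled tank/iscsi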
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
iSCSI doesn't have sync... so no, not gonna see a bonus. If the intention is to go iSCSI for VMs, your hardware is wholly inadequate. You're going to need a new CPU (one that supports >32GB of RAM), another motherboard, and a LOT more RAM.

I won't even consider VMs for FreeNAS systems with <64GB of RAM.
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
Meh. Throw a Fusion-io card or a properly set up SLOG (a la a controller with BBU cache) at a 16GB machine and the only difference will be when ZFS hits the cache with 64GB where it can't on 16GB. "Needs more RAM" is oversimplifying. He isn't trying to run 100 VMs, just get reasonable throughput on one. ESXi definitely waits for an ack or sync of sorts as soon as it can't trust controller cache. The new experimental mode should get us around the multi-thread issue with istgt. The guys with uber hardware still screw this up and get dismal results. Set it up well and then start pulling RAM sticks: with a small number of VMs the performance is NOT going to change much. The excess RAM isn't in play until we scale; ZFS only uses it to cache, and these loads are not cacheable. (I'm happy to be corrected. Just what I've seen.)

64GB+ pretty much limits you to E5 Xeons. There are crappy consumer NASes that do better than 2MB/s.

It should be reasonable to expect that there is a standard speed a maxed Haswell on a well-configured machine can hit. If I throw 10 mirrored SSDs at the pool and the fastest SLOG I can easily get, say an Intel S3700, what can I get out of iSCSI on FreeNAS for one measly VM? We can scale from there. I've done full SSD pools, and I can throw way more RAM at a different box (for me only an i7, not an E5), but a maxed E3 shouldn't suck.

I don't mind taking the lashing for the team, cyber. But sometimes an answer beyond "grab an E5 with 128GB" should be considered. I maxed a Haswell, threw fast SSDs at mine, optimized my pool reasonably (not RAID0), fast SLOG... it does OK. What would be great is a solution that does the VERY best we can on this hardware; 1GbE should be the bottleneck. I left the fight for another day and just stuffed SSDs in the ESXi box, but that doesn't scale like this should. I always hope to learn better tweaking and performance from the gurus around here, not just stuff the box with more $.

There are bits and pieces in threads, but no really definitive answers. I don't consider myself qualified to do anything beyond throw hardware at it and tweak my own gear. But even if it's working well, I always hope a specialist could make it do better.
 

wintermute000

Explorer
Joined
Aug 8, 2014
Messages
83
Thanks, yes, my thoughts exactly. I'm trying to run a handful of test VMs just to play with VM failover scenarios (HA, FT, etc.) and labbing. This is not 'prod'; I was expecting at least 40-50MB/s, i.e. something remotely tolerable.
My crappy QNAP with an ARM chip was doing better than this. WAY better; at least it was usable.
1-2MB/s is ridiculous.
Also, I've tried a direct device extent, which takes ZFS out of the picture, and it's still just as bad, so it's something else.

I'll research sync=disabled
 

wintermute000

Explorer
Joined
Aug 8, 2014
Messages
83
did this, no effect.

zfs set sync=disabled iscsi/iscsi-zvol

On the bright side, this suggests the problem is not related to ZFS?
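
For what it's worth, a couple of quick checks (pool/zvol names per the command above): confirm the property actually applied, and watch whether the pool is even being asked to do work during a deploy.

    # Verify the setting took on the zvol
    zfs get sync iscsi/iscsi-zvol

    # Watch pool-level activity while a template deploy runs; if the pool sits
    # nearly idle at 1-2 MB/s, the bottleneck is upstream of ZFS
    zpool iostat -v iscsi 1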
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
I was thinking of sync=standard vs. sync=always. What you are working around is ESXi waiting for confirmation that the write has been committed. It can't see controller cache, but ZFS will return that commit as quickly as it can if there is an SLOG.

Admittedly, I am studying my ass off on this, not a guru. There is a huge difference with a fast SSD SLOG, but still not impressive speeds, IMHO. So there is more to the story.
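
For anyone following along, attaching and detaching an SSD as a dedicated log device is a one-liner each way; the pool and device names here are hypothetical:

    # Add a fast, power-loss-protected SSD (or partition) as the SLOG
    zpool add tank log gpt/slog0

    # Take it out again if it doesn't help
    zpool remove tank gpt/slog0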
 

wintermute000

Explorer
Joined
Aug 8, 2014
Messages
83
I'm leaning towards the hosts. I just figured out that the driver I slipstream-installed to use the onboard NIC overrides the standard e1000 driver, so the Intel dual-port NIC I'm using is also running this bastard driver. Going to do a fresh install without the slipstreamed driver, taking the onboard NIC out of the picture, so the dual-port card should end up on the legitimate driver. It just seems too coincidental that no amount of FreeNAS, ZFS, or block extent tweaking makes any difference and the baseline is so poor.
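
A sketch of how to check which driver each vmnic is actually bound to on an ESXi 5.5 host, and how to pull a slipstreamed VIB back out (the VIB name below is just an example):

    # List physical NICs and the driver each one is using
    esxcli network nic list

    # Driver and firmware details for the dual-port card
    esxcli network nic get -n vmnic1

    # See which e1000-family VIBs are installed, then remove the slipstreamed one
    esxcli software vib list | grep -i e1000
    esxcli software vib remove -n net-e1000e   # example name; reboot the host afterwards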
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
did this, no effect.

zfs set sync=disabled iscsi/iscsi-zvol

On the bright side, this suggests the problem is not related to ZFS?

Isn't that what I said above?

/me scrolls up

Yep.. it IS what I said above! But don't mind me. Go ahead and listen to mjws00. It's not like you're the first person in the world to want to do iSCSI for a single VM and then complain about poor performance (hint: you're about the 1000th this year, because you aren't understanding the basics).
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
So you are going with "impossible" on an E3 for a single VM, cyber? Not one user in 1000, nor any of the experts around here, can pull off reasonable performance? That is sad, and almost unbelievable.

I certainly value your opinion and expertise and always hope you'll enlighten vs. a quick brush-off. It's damn hard to get a nibble :) Your time is yours, but I am always optimistic.

In any case, there has to be a MAX achievable speed on an expertly configured system. At least at that point there would be a reasonable response such as: a 32GB Haswell can get (x) MB/s throughput and (x) IOPS on 6 WD Reds in striped mirrors, with a 300MB/s-or-better SLOG. There is no baseline, no reasonable means for someone to move forward.

I know you've tested and played with this. I can only conclude you decided it wasn't worth the fight? How fast is iSCSI on the Mini? Unbearable? I was hoping to learn something, and I love it when you speak up.

Thanks.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
So you are going with "impossible" on an E3 for a single VM, cyber? Not one user in 1000, nor any of the experts around here, can pull off reasonable performance?

I'm not an expert?

But seriously, the problem is that what *I* might be able to do with that hardware and what you are *capable* of doing in 6 months of dedicated work may not be the same thing (and almost certainly aren't the same thing). So yes, I do write off a lot of "remote possibilities", because it's totally unreasonable to think that you're going to dedicate months of your time to something that could simply be solved with $1500. It's like your ex-wife wanting you dead, but instead of hiring a hitman she's banking on that one-in-a-trillion chance you'll happen to be killed by a falling meteorite. Not likely, and not something you should depend on.

In any case, there has to be a MAX achievable speed on an expertly configured system. At least at that point there would be a reasonable response such as: a 32GB Haswell can get (x) MB/s throughput and (x) IOPS on 6 WD Reds in striped mirrors, with a 300MB/s-or-better SLOG. There is no baseline, no reasonable means for someone to move forward.

The problem is that things like "IOPS" mean absolutely *NOTHING*. Not even trying to be funny. What is an IOP? Can you define it? Hint: you can't. The problem is that the definition requires you to say "IOPS of iSCSI" or "IOPS of a zpool". They are NOT the same thing, so you can't just throw out an IOPS number.

You also can't throw out throughputs, because they depend on factors that are going to be unique to your situation. How full is your pool? How fragmented is your pool? Are you doing kernel-mode iSCSI or the current istgt? Are you doing file-based or zvol extents?

So no, I can't throw out IOPS and I can't throw out a throughput. In fact, *nobody* can, except for that very specific hardware and that very specific setup.

So if you want quantifiable numbers it's easy...

A 32GB Haswell can do up to 1 billion IOPS and up to 1 billion TB/sec. Are these numbers unrealistically high? Not at all, because they mean as much to you as what userX could claim. These "benchmark" units people use, like MB/s and IOPS, are all a farce because they aren't easily transferable.

This is why people that do file servers are professional file server guys. Their job is to handle all these little intricacies that I'm not about to teach and you aren't going to learn in just a weekend of reading. No joke, I had started working on a FreeNAS book. It would certainly be 500+ pages and would only begin to brush on many topics that you might have to understand deeply to be able to "optimize" a given server.

I know you've tested and played with this. I can only conclude you decided it wasn't worth the fight? How fast is iSCSI on the Mini? Unbearable? I was hoping to learn something, and I love it when you speak up.

I didn't even try to do iSCSI. Why? Because of what I said above: "iSCSI performance is crap until you go >64GB of RAM." I didn't have 64GB of RAM, and my purpose wasn't to see just how crappy a Mini is for iSCSI. Not to mention all those "intricacies" that you are possibly unaware of are crucial. Just going from istgt with a file-based extent to kernel-mode iSCSI with a zvol can swing performance anywhere from 20% to 250% or more, depending on a bunch of "intricacies" unique to your setup. Oh yeah, and even if you choose kernel-mode iSCSI and a zvol (which should be the fastest), you can screw up one setting in the WebGUI and make it so incredibly slow you'd swear you had a hardware failure.

So no, when I say things like "64GB of RAM or bust" I'm serious to 99% or so. If you are idiotically stupid and plan to spend a year of your life trying to figure out how to make it work with 16GB of RAM, there's a non-zero probability you'll achieve it. Meanwhile you could have spent $1500 on hardware that would do the job. Guess which one is the logical solution for the problem? ;)

The *real* question is "what is the fastest way to achieve the desired result?" Nobody wants to spend money they don't have to, and nobody wants to believe that they couldn't find all the answers in 2-3 months. Yet there are dozens of users who have spent 4+ months before they gave up. I'm glad those people didn't work for me, because it was easier to buy the new hardware than to pay an employee to do "research" for 4+ months and still end up empty-handed when a solution was required.

So take my advice, or don't. The path of least resistance is very well documented on this forum by users who have spent absurd amounts of time trying to make it work with horribly inappropriate hardware. If you think you're going to beat the odds, feel free to continue down the path. Meanwhile I'll simply cite the graveyard of "my iSCSI sucks balls and I want speed! Please help!" threads that have existed for 2+ years and are still unsolved because the person eventually gave up.

Feel free to reacquaint yourself with slides 49 and 50 of my presentation, where I specifically discuss this exact topic. These bullets sum up this thread nicely...
  • You can expect that the issue will not be resolved quickly by just making a few changes and a reboot.
  • Most users will find that they will spend a month or more of intensive research and testing to resolve performance issues on iSCSI if you have never tuned ZFS before.
  • You can expect to have very high server hardware requirements if you use a lot of iSCSI devices.
Now if you'll excuse me, I will promptly unsubscribe from this thread, because this issue is not something I'm open to debating. The facts are the facts, whether you like them or not. And I discussed this topic plenty in my noobie guide so I wouldn't have to hash this out "yet again" on the forums.
 

wintermute000

Explorer
Joined
Aug 8, 2014
Messages
83
Victory. My hunch re: NIC drivers turned out to be correct.

Below is fully acceptable for me: around 600Mbit/s out of a gigabit link, onto a zvol. I wonder what it's like with a device extent now, lol.


    Interface        Traffic          Peak            Total
    igb1       in    70.035 MB/s      70.764 MB/s     4.351 GB
               out    3.225 MB/s       3.258 MB/s     1.260 GB

So I guess that proves it is possible to get decent performance in a test environment without requiring E5 Xeons or 64GB of RAM.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Yeah... start using it. Just like I said above, benchmark throughput =/= real-world throughput.

You'll see.

Good luck to all of you. I've seen these "alleged" victories before and you'll find out you didn't actually achieve victory over anything except your ability to make some numbers look good but not actually see a change in performance. :)

Oh, and just to ruin your fun... 600MB will fit in your ZFS cache, so you were almost certainly testing ZFS's cache and nothing else. Make it a 1TB test and you'll see much different numbers. Oh, and I hate to break it to you, but raw throughput means nothing without the IOPS to back it up. :)
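
For reference, one rough way to push a test well past the ARC, assuming a Linux guest sitting on the iSCSI-backed datastore (sizes and paths are made up; zeros also compress away if compression is enabled, so treat the numbers as rough):

    # Write ~100 GB with the guest's page cache bypassed
    dd if=/dev/zero of=/testfile bs=1M count=102400 oflag=direct

    # Read it back the same way
    dd if=/testfile of=/dev/null bs=1M iflag=direct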

But do keep going. I enjoy being proven correct. :)
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
Thanks cyberjock, I didn't need that much explanation, but it is useful to see your viewpoint. I wasn't digging deep on iSCSI when I read your presentation. Your pointer to jgreco's bug report leads to a whole WEALTH of goodness and decent documentation on the issues: https://bugs.freenas.org/issues/1531 Follow the links; there is a lot of gold on ZFS/iSCSI tuning. There are good numbers shown on 8GB Haswells and poor numbers on a 128GB E5 with 11 spindles, and vice versa.

As I mentioned earlier, like yourself, I moved on from the fight. I tried to throw enough gear at it to isolate a configuration issue, but did not write it off as impossible. I like to see people succeed, so if I can help on the easy stuff I'll try, and I hope the big guns do as well.

Good luck wintermute.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Oh, and just for the record we've seen two benchmark applications run side-by-side on iSCSI and one wouldn't hit 100MB/sec while the other would do almost 600MB/sec (10Gb LAN). So the way in which the benchmark test runs can be a MAJOR factor. This user was upset because the benchmark they wanted to use gave crap numbers and when we used our benchmark application it was up to snuff. They continued to demand that their program was superior and wanted it tuned to make their benchmark tool provide good numbers. We finally convinced him to do real-world tests... and he had about 600MB/sec, just like we had expected.
 

wintermute000

Explorer
Joined
Aug 8, 2014
Messages
83
Real world? An Ubuntu VM feels not much slower than on a local disk. For the purposes of shuffling VMs around for lab testing, vMotion, FT, etc., that's good enough for me.

Put it this way: I was labbing earlier on NESTED ESXi hosts going to a virtualised FreeNAS, rofl. I just want my setup to be faster than that, and I'm pretty sure it's capable. You're not even reading those numbers correctly; I was saying that the raw counters show I was pushing 600Mbit/s, not that I ran a 600MB test. I actually deployed a 10GB template, then ran it for a bit. Then I deployed another VM and ran that for a bit. In amongst it I glanced at systat and copied the stats into my post above. Previously it would take 15 minutes just to deploy a 500MB template, and then more minutes to even boot the damned thing. Now the performance is not really much worse than the local rust platters.

So no, I'm not aiming high in any way whatsoever. You can save your preaching for someone else. I drive routers, switches, and call managers for a living, so I'm no stranger to enterprise-grade hardware or its problems and solutions.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
Yep...yet you don't share what you "learned" for others....
 

wintermute000

Explorer
Joined
Aug 8, 2014
Messages
83
What are you talking about? I wrote earlier that I suspected a driver issue with a slipstreamed e1000 VIB on my ESXi hosts, and then later confirmed that was the case.
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
The part I find ridiculous is that it is supposedly so hard to establish standards. For the most part we are talking about config files and command-line invocations. It isn't very hard to get those consistent or to share them: show the command you ran for the test, attach the conf, done. No guessing, no grey areas, easy.

Most of the more serious folks are running some pretty consistent variant of a Supermicro Haswell or Avoton. I'm probably the freak running pure Intel gear (no trusted local Supermicro vendor). The fact is the platforms and their performance are well known, and they are a micro-slice of hardware compared to what most OSes have to deal with. New HDDs are consistent in their speeds. SSDs and HBAs are known quantities. We can ignore the other hardware and/or come up with reasonable performance-variance specs.

What I want to know is: if I give iXsystems a Haswell E3 for a month, some Reds, SSDs, and an HBA, how fast can they make iSCSI run serving a bone-stock disk to ESXi, Windows, or any distro or live CD you want? The specific target OS doesn't matter; make it BSD so we have consistent tools on both sides, or Linux for ease. Create a known load with consistent commands or scripts to show IOPS and throughput on the 'real world' system.

With that in place I can reliably and consistently compare my very own setup to the baseline. I can swap in a faster pool and measure, change an SLOG and measure, up the RAM or swap a CPU and measure, tweak a setting and measure. At that point we can look at specific workloads and still have a comparative baseline and a proper rationale for adding hardware, etc. If I know jgreco, cyberjock, or iX can get 80MB/s throughput on that hardware and I'm getting 2MB/s, I have a problem. If I'm getting 79MB/s, I might want to move on. Same for whatever measure of IOPS/latency is selected.
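
As a concrete example of the kind of shareable, repeatable test this would take, something like fio inside a Linux guest on the iSCSI datastore would do; the parameters below are just a starting point, not a blessed baseline:

    # 4k random writes, queue depth 32, 60 seconds, page cache bypassed
    fio --name=iscsi-randwrite --filename=/fio-test --size=20g \
        --rw=randwrite --bs=4k --iodepth=32 --ioengine=libaio \
        --direct=1 --runtime=60 --time_based --group_reporting

    # Sequential reads on the same file, for a throughput figure
    fio --name=iscsi-seqread --filename=/fio-test --size=20g \
        --rw=read --bs=1m --iodepth=8 --ioengine=libaio \
        --direct=1 --runtime=60 --time_based --group_reporting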

Would sure be nicer than flailing in the dark. I realize all that testing and documentation puts some people to sleep, but some of us do optimize and document things; some of us are even pretty good at it. As long as people are derided for trying to optimize systems, this platform will have limited usefulness and scope. I would gladly contribute this had I the expertise: easy for me on Windows servers or hardware solutions, but I'm not comfortable enough on BSD/ZFS. There are lots of professionals like myself who would love to see wide implementation and acceptance of FreeNAS or TrueNAS. Rock-solid baselines and known-good configurations are necessary to even begin evaluation.

I know. Pipe dream. I ain't normal. ;)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
We can ignore the other hardware and/or come up with reasonable performance-variance specs.

Despite what you think, no, you can't. The problem is that the type of workload is a major factor, and with a single tunable I can often make a server appear to be twice as fast on benchmarks (not real-world, mind you) or I can make it do 1MB/sec. So no, when you have that kind of "range" for performance from just one tunable, there is no such thing as a 'reasonable' performance variation, except for what you'd recognize with lots of experience (and guess where I get much of my "that's good" or "that's not good" sense from?). Not to mention you can buy two NICs that are the same model but different firmware and see real-world performance change, sometimes by as much as 40%.

What I want to know is: if I give iXsystems a Haswell E3 for a month, some Reds, SSDs, and an HBA, how fast can they make iSCSI run serving a bone-stock disk to ESXi, Windows, or any distro or live CD you want? The specific target OS doesn't matter; make it BSD so we have consistent tools on both sides, or Linux for ease. Create a known load with consistent commands or scripts to show IOPS and throughput on the 'real world' system.

1. Well, what numbers do you want? In all seriousness, you can make the server provide almost any number you want. You can literally cookbook the benchmarks to whatever value you "want" them to have.
2. The target OS is "hellatiously" important! So when you say the target OS doesn't matter: it matters... bigtime. What you get in Linux will NOT be what you get in Windows, even if you are using the same benchmark tool ported to each.

With that in place I can reliably and consistently compare my very own setup to the baseline. I can swap in a faster pool and measure, change an SLOG and measure, up the RAM or swap a CPU and measure, tweak a setting and measure. At that point we can look at specific workloads and still have a comparative baseline and a proper rationale for adding hardware, etc. If I know jgreco, cyberjock, or iX can get 80MB/s throughput on that hardware and I'm getting 2MB/s, I have a problem. If I'm getting 79MB/s, I might want to move on. Same for whatever measure of IOPS/latency is selected.

Would sure be nicer than flailing in the dark. I realize all that testing and documentation puts some people to sleep, but some of us do optimize and document things; some of us are even pretty good at it. As long as people are derided for trying to optimize systems, this platform will have limited usefulness and scope. I would gladly contribute this had I the expertise: easy for me on Windows servers or hardware solutions, but I'm not comfortable enough on BSD/ZFS. There are lots of professionals like myself who would love to see wide implementation and acceptance of FreeNAS or TrueNAS. Rock-solid baselines and known-good configurations are necessary to even begin evaluation.

I know. Pipe dream. I ain't normal. ;)

No, you aren't "abnormal". You're just failing to realize the complexity of the situation. It is way nastier than you realize, and trying to do a "cookie-cutter" approach will just not work for this. There's a reason why *nobody* in ZFS publishes these kinds of numbers: you can't possibly factor in all aspects of a setup. This is why TrueNAS is a single hardware platform and they only use very specific models of hardware and nothing else; then they *can* account for many of the variables. But you and me... we don't have a snowball's chance in hell of ever accounting for all the variables.
 