High write latency on iSCSI - possibly after 9.3


reqlez

Explorer
Joined
Mar 15, 2014
Messages
84
Hi.

I have been running iSCSI with the old iSCSI stack before 9.3, without an SLOG, and have been fine (because the writes were async).

Now I set up a new rig with 9.3 (using CTL this time, of course), and my ESXi 5.5U2 datastore mounted over iSCSI via a 1 Gbps network (MTU 9000) is showing 169 ms write latency and 6 ms read latency.

Am I missing something? Does CTL do sync writes by default now instead of async? I think the high latency is messing with my backup appliance; that's why I'm asking :)
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
CTL does not change the default. By default the policy is set by ZFS, and writes are still asynchronous. But CTL supports more cache control primitives than istgt did, including the DPO/FUA bits and the Caching mode page. That allows the initiator to request synchronous I/O if it wants to. I am not sure whether VMware uses any of those primitives, but I guess it may pass those flags through from the virtual machine.

If for some reason somebody wishes to override that behavior, they can always set sync=disabled for the specific zvol, but then data consistency is not guaranteed.
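For example, assuming a zvol named tank/vmstore (substitute your own pool and zvol names):

[code]
# Check the current sync policy on the zvol backing the extent
zfs get sync tank/vmstore

# Make all writes asynchronous (faster, but a crash can lose in-flight writes)
zfs set sync=disabled tank/vmstore

# Return to the default behavior (honor whatever the initiator asks for)
zfs set sync=standard tank/vmstore
[/code]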
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Actually, things are much more complex than that.

CTL is kernel-mode iSCSI. It performs better when it is zvol-based and worse when file-based.

The old istgt was userland iSCSI and performed better when file-based.

So if you were doing file-based iSCSI extents, you were seeing the best performance you could get with a pre-9.3 setup. But now that you have upgraded to 9.3 you are in the worst combo, and the best choice is to move to zvol-based extents. Obviously this isn't easy and can't be done without destroying one to create the other.
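As a rough sketch of the zvol side (the name and size here are just placeholders; the GUI does the same thing when you create a zvol and point a device extent at it):

[code]
# Create a sparse (thin) 500G zvol to back a device extent
zfs create -s -V 500G -o volblocksize=16K tank/iscsi-vm01
[/code]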

I'm not saying that's your problem, but there are a bunch of moving parts at work when you go from pre-9.3 to 9.3, so you'll have to narrow them down.

It's very possible your MTUs are causing network problems too. We strongly discourage jumbo frames because they cause far more problems than they deliver in gains. We've even got a sticky where we "stick it to the MTU" because it doesn't matter. It's one of those things where I have to laugh when people use jumbo frames because that was a best practice 10 years ago, but today it's almost a "worst practice".
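If you want to rule the MTU out quickly, don't-fragment pings at jumbo size from both ends will tell you whether the 9000-byte path is actually clean end to end (the addresses below are placeholders); if they fail, put everything back at 1500:

[code]
# From the FreeNAS box: 8972 = 9000 minus 28 bytes of IP/ICMP headers
ping -D -s 8972 192.168.10.20

# From the ESXi host (vmkernel ping with the don't-fragment bit set)
vmkping -d -s 8972 192.168.10.10
[/code]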
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It's one of those things where I have to laugh when people use jumbo frames because that was a best practice 10 years ago, but today it's almost a "worst practice".

That might be a little unfair. However, if I've got 1GbE and want more speed, it is easier to contemplate the jump to 10GbE for about the same amount of annoyance/trouble/etc...
 

reqlez

Explorer
Joined
Mar 15, 2014
Messages
84
Interesting remark about jumbo frames; I will be sure to test. I'm not concerned about speed as in throughput, I'm more concerned about latency, because I never saw it this high with my previous iSCSI arrays.

By the way, I am indeed using zvols. I will test with sync disabled on the dataset just to see if VMware could indeed be passing some extra flags to force sync writes. Of course, maybe an SLOG is the way to go here, but I will test before I consider it.
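My rough plan for watching the log device while the test runs (gstat is built in; zilstat ships with FreeNAS as far as I know, so treat the exact invocation as approximate):

[code]
# Per-disk I/O and latency, refreshed every second (the SLOG shows up as its own device)
gstat -I 1s

# Summary of ZIL traffic: 1-second samples, 10 samples
zilstat 1 10
[/code]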
 

reqlez

Explorer
Joined
Mar 15, 2014
Messages
84
Okay, I did a test with CrystalDiskMark on a volume mounted to a VM via ESXi and I get 95 MB/s writes... that tells me that sync writes are not happening. That is probably a bad thing, since most people who deploy iSCSI with ESXi and FreeNAS are not going to even look into enabling sync=always on a dataset, and their data may get lost. How come CTL still doesn't respect ESXi's requests to write data as sync, like NFS does?

This still doesn't explain why my latency isn't ideal, but maybe that's because I'm using low-RPM WD Red drives...
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
ESXi requests to write sync over iSCSI? Justify that claim please.
 

mav@

iXsystems
iXsystems
Joined
Sep 29, 2011
Messages
1,428
CTL respects all existing SCSI synchronization primitives (the Caching mode page, SYNCHRONIZE CACHE commands, and the FUA bit). If writes are still not synchronous after that, it means the initiator did not ask for them to be.
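If you want to see what the target advertises, one way (from any FreeBSD machine logged into the target; da1 below is just a placeholder for the iSCSI disk) is to dump the Caching mode page:

[code]
# Mode page 0x08 is the Caching mode page; WCE=1 means the write cache is
# reported as enabled, so the initiator has to use SYNCHRONIZE CACHE or FUA
# if it wants its writes made stable immediately.
camcontrol modepage da1 -m 8
[/code]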
 

reqlez

Explorer
Joined
Mar 15, 2014
Messages
84
That's still weird... why would ESXi ask NFS to write sync but not iSCSI? I guess I'll just set sync=always on my important volumes that have an SLOG.
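Concretely, something like this once the SLOG is in (placeholder zvol name):

[code]
# Treat every write on this zvol as synchronous; with a decent SLOG
# the latency hit should be tolerable
zfs set sync=always tank/vmstore
zfs get sync tank/vmstore
[/code]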

About justifying my claim... ESXi doesn't know which writes coming from a VM are supposed to be sync, so it has to treat all writes as sync. Isn't that why you can run Windows on a crappy H300 controller and be fine, but try to run ESXi on it and it's slower than hell without a controller with a BBU cache?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Pretty sure I wrote the sticky on that one.

In any case, my recollection is that a variety of iSCSI gear does all sorts of stupid stuff if an initiator requests sync writes. By comparison, it is hard to screw up sync NFS. There's a tunable to tell ESXi to request sync over iSCSI, I just don't recall the details offhand.
 

reqlez

Explorer
Joined
Mar 15, 2014
Messages
84
Pretty sure I wrote the sticky on that one.

In any case, my recollection is that a variety of iSCSI gear does all sorts of stupid stuff if an initiator requests sync writes. By comparison, it is hard to screw up sync NFS. There's a tunable to tell ESXi to request sync over iSCSI, I just don't recall the details offhand.
Interesting... I've never seen that post; I've only seen the post that explains why iSCSI is fast versus NFS, lol.
 

reqlez

Explorer
Joined
Mar 15, 2014
Messages
84
Oh, here is another question while I'm troubleshooting this... I hear that large sync writes (over 64K) go direct to the storage and bypass the SLOG/ZIL... is there a way to increase this size?
 

reqlez

Explorer
Joined
Mar 15, 2014
Messages
84
Wait, I'm confused. So if I change the block size on the zvol to 256K, for example... then you are saying any writes that are under 256K will still go to the ZIL (SLOG SSD in my case), but write requests that are over 256K will go direct to storage and bypass the ZIL?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Wait, I'm confused. So if I change the block size on the zvol to 256K, for example... then you are saying any writes that are under 256K will still go to the ZIL (SLOG SSD in my case), but write requests that are over 256K will go direct to storage and bypass the ZIL?

I don't even know how you came to that conclusion at all. But not even close (and considering you can't go over 128k block size in ZFS you're already hypothesizing an impossible scenario). :P
 

reqlez

Explorer
Joined
Mar 15, 2014
Messages
84
Well, why is she saying to change the block size then! Anyway, I make the impossible possible!!! I'll just go in and modify the ZFS source code and get a 256KB block size going... not that hard ;-)

No, but for real... is that thing I heard about NFS writes larger than 64KB bypassing the ZIL true, or is that something SunOS-related???
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Well, why is she saying to change the block size then! Anyway, I make the impossible possible!!! I'll just go in and modify the ZFS source code and get a 256KB block size going... not that hard ;-)

No, but for real... is that thing I heard about NFS writes larger than 64KB bypassing the ZIL true, or is that something SunOS-related???

Yes, in certain circumstances, writes that are >32KB and marked as sync writes from NFS will bypass the slog and go straight to the pool.
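If you want to poke at the knobs involved, these are the ones to look at (FreeBSD/FreeNAS 9.3; the exact behavior differs between ZFS versions, so treat this as a pointer rather than gospel, and the zvol name is a placeholder):

[code]
# Size threshold above which ZFS may log a sync write "indirectly":
# the data goes to its final place in the pool and the ZIL only records a pointer
sysctl vfs.zfs.immediate_write_sz

# logbias=latency (the default) favors the SLOG; logbias=throughput
# sends large sync writes straight at the pool disks
zfs get logbias tank/vmstore
[/code]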
 