Slow read speed on iSCSI

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
Oh, and I also get around 5 Gbps, if that matters. That shouldn't be the bottleneck, though.
[attached screenshot]


Oh about those drivers, I never manually installed any drivers. Should I?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I'm using iSCSI as stated in the original post. Yes, it is RAIDZ1. And while I don't know how well OpenZFS scales, I'd assume probably around 500-600 MB/s, which lines up fairly closely with the write speed. The results in the original post were also set to be averaged over 9 tests.

In retrospect, yes, the thread title probably should have tipped me off ... whoops.

Regarding the difference, anything that can be pulled from the read cache (ARC) will come in as fast as your network can shovel it through, but when you have to go to disk it's going to be slower, especially when using block storage on RAIDZ; block storage works much better on mirror vdevs for a variety of reasons covered in the "Path to success for block storage" resource that @jgreco linked previously.

You can band-aid this by putting in a bigger ARC (32GB is better than 16GB, but not as good as 64GB), of course, but that won't help in a scenario where you have to go to spindles. Once you hit DDR4 (probably even DDR3 on SNB or newer), the speed of RAM doesn't matter nearly as much as the quantity, so whatever the least-expensive-but-still-reliable option you have is probably the best.
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
You can band-aid this by putting in a bigger ARC (32GB is better than 16GB, but not as good as 64GB), of course, but that won't help in a scenario where you have to go to spindles. Once you hit DDR4 (probably even DDR3 on SNB or newer), the speed of RAM doesn't matter nearly as much as the quantity, so whatever the least-expensive-but-still-reliable option you have is probably the best.
Oh great. Well in that scenario the best value appears to be a stick of G.Skill 32GB 2666 MHz CL19. I'll probs get one of those to start with then buy another later on. Would that be fine?
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
What you're likely seeing here is a fine illustration of the trade-off between "throughput vs IOPS" that you get by tuning recordsize. Datasets default to 128K (which will be beneficial for the larger sequential I/O of the 1M tests) whereas zvols default to 16K (which is better for the small 4K tests).
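As a rough illustration of that trade-off, here's a back-of-the-envelope sketch (the test sizes are assumptions, and it just counts record-level reads for a cold, uncached pass):

```python
# Back-of-the-envelope sketch of the recordsize trade-off for a cold (uncached) read.
# All sizes here are illustrative assumptions, not measurements from this system.

KiB = 1024

def cost(io_size, total_bytes, record_size):
    """Return (record reads issued, bytes pulled from disk) for a cold read pass."""
    ios = total_bytes // io_size
    records_per_io = max(1, -(-io_size // record_size))  # ceiling division
    record_reads = ios * records_per_io
    return record_reads, record_reads * record_size

total = 1024 * 1024 * KiB  # a 1 GiB test region

for rs in (16 * KiB, 128 * KiB):
    seq_reads, seq_bytes = cost(1024 * KiB, total, rs)  # SEQ1M-style pattern
    rnd_reads, rnd_bytes = cost(4 * KiB, total, rs)     # RND4K-style pattern
    print(f"{rs // KiB:>3}K records: SEQ1M -> {seq_reads:>6,} reads ({seq_bytes / 2**30:.0f} GiB), "
          f"RND4K -> {rnd_reads:,} reads ({rnd_bytes / 2**30:.0f} GiB)")
```

Big records cut the operation count for the 1M sequential pattern, while small records avoid dragging a whole 128K record off disk for every 4K request.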

Also, with sync=disabled that RMS-200 is doing absolutely nothing other than heating the room by 25W. I assume this setting was only used for the tests?

Yeah, I intend to revert to SYNC & re-test. If it remains unimproved from my last results, I'll try that ZIL in another device or get rid of it.

Radian says the RMS-200's performance is...
  • Over 1.3 Mil. IOPS – 4K Random Writes
  • Over 5.5GB/s – 128K Random Writes
I think PCIe 3.0 can do ~1GB/sec per lane or 8GB in the x8 slot ...
But, I've also seen performance vary (probably background tasks running, etc)
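A quick sanity check on that figure (assuming PCIe 3.0's 8 GT/s per lane with 128b/130b encoding, and ignoring protocol overhead):

```python
# Rough PCIe 3.0 bandwidth estimate: 8 GT/s per lane with 128b/130b encoding.
# Ignores protocol/packet overhead, so real-world throughput lands a bit lower.
lanes = 8
raw_gt_per_s = 8.0                # giga-transfers per second, per lane
line_coding = 128 / 130           # 128b/130b encoding efficiency
per_lane_GBps = raw_gt_per_s * line_coding / 8   # 8 bits per transfer -> ~0.985 GB/s
print(f"per lane: {per_lane_GBps:.3f} GB/s, x{lanes} slot: {per_lane_GBps * lanes:.2f} GB/s")
```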

Do record sizes affect IOPS, or just the storage efficiency and the space consumed ..?

(I haven't read the entire thread so I'll likely have additional replies as I read your other posts)
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
I don't know how crap the TX401 is. My boss doesn't have unlimited money either, so I am usually required to research stuff that works well before any purchase orders are signed. Speculative purchases of unproven hardware are relatively rare.

However, I will note that the TX401 is basically the Aquantia junk, which is basically to 10G what Realtek or Marvell is to 1G. It isn't supposed to be tragically unusable, but it is aimed at the PC market, and developers have called its driver poor quality in the past (I think the words may have been "development quality").

This card is probably fine for a client PC where you just want to do a little web browsing or faster-than-1Gbit file transfer, but given that the driver is only about two years old and even its authors don't seem to love it, and it is really intended to be a "cheap" solution for PC clients supporting PCIe x4 and RJ45 copper, it is easy to see that this thing doesn't hold a candle to a used Solarflare SFN6122 that costs a third of the money, or a used Intel X520/710 that costs about the same, sports fiber, and has drivers written and tuned for years by a dedicated driver team at their respective companies. So from my perspective, I could have spent the same money you did and received something known to work very well.

As for memory, iSCSI is a weird thing, and as @HoneyBadger notes, it isn't really an apples-to-apples comparison.

I haven't tried a SolarFlare, but I have tried the
Mellanox ConnectX-3
Chelsio T320 (can't recall the exact version)
Myricom PCIe 2.0 version
And a newer Chelsio T520-SO-CR ...
All of which have yielded over 850MB/s ...

And all of those were via SMB ... using the ABYSMAL CPU that came with the Dell T320,
which is 1.8GHz and has no Hyper-Threading nor Turbo. lol. (E5-2403v2)

Though, now I'm curious as it pertains to my R730xd ...

Should I replace the ...
E5-2640v3 - 8c, 2.6GHz | 3.4GHz (T) ..... with an
E5-2643v3 - 6c, 3.4GHz | 3.7GHz (T) 20MB Cache ($60 ea on eBay..?)

(The E5-2600 v4 chips cost a lot more, and I assume they aren't worth 200-300% of the v3 pricing.)

I ask because I've been told that SMB prefers high clock speeds ... the 2640v3
 
Joined
Dec 29, 2014
Messages
1,135
I ask because I've been told that SMB prefers high clock speeds ... the 2640v3
I believe Samba is still single threaded. Based on that belief, I have always opted for the highest clock speed CPU I can get in the highest class available. My current FreeNAS units are both running dual Intel(R) Xeon(R) CPU E5-2637 v4 (4 cores) @ 3.50GHz with 256GB of memory. My leaning has also taken me towards CPUs with fewer cores, since there aren't that many things that are going to require a lot of cores for me. That is because my FreeNAS units are storage only, no virtualization. I do that in other ESXi hosts. If I were the one choosing, I'd lean towards the 2643v3 for my needs. If you are doing some virtualization, the extra cores might have some value for you. All IMHO, of course.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Single-threaded per connection. Multiple clients will benefit from multiple cores, as will other exotic scenarios.
 
Joined
Dec 29, 2014
Messages
1,135
Single-threaded per connection. Multiple clients will benefit from multiple cores, as will other exotic scenarios.
Gotcha. This is my home/lab network. On the rare occasion that I have bulk transfers going, it is 1-2 hosts max. Use case definitely influences gear selection!
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Oh great. Well in that scenario the best value appears to be a stick of G.Skill 32GB 2666 MHz CL19. I'll probs get one of those to start with then buy another later on. Would that be fine?

Your board doesn't appear to support 32GB in a single stick - best you can do is 2x16GB, so for now buy a single stick of 16GB and run it alongside an 8GB.

Yeah, I intend to revert to SYNC & re-test. If it remains unimproved from my last results, I'll try that ZIL in another device or get rid of it.

sync will never be faster than async - safer, yes, but never outright faster.

Radian says the RMS-200's performance is...
  • Over 1.3 Mil. IOPS – 4K Random Writes
  • Over 5.5GB/s – 128K Random Writes
I think PCIe 3.0 can do ~1GB/sec per lane or 8GB in the x8 slot ...
But, I've also seen performance vary (probably background tasks running, etc)

Those numbers are probably to a raw device (with no filesystem) and with extremely high queue depths. You might see something approaching them if you just fire dd at the device with no limits on queue depth and a huge number of parallel jobs. SLOG, on the other hand, is a worst-case scenario for storage benchmarking in that it's pretty much "single thread, single queue depth".
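A crude way to see why queue depth dominates those headline IOPS numbers (Little's Law arithmetic; the 20 µs latency is an assumed figure for a fast NVRAM device, not a measurement):

```python
# Sustained IOPS is roughly bounded by queue_depth / per-I/O latency (Little's Law).
# The latency below is an assumption for illustration, not a measured value.
latency_s = 20e-6   # assume ~20 microseconds per 4K write, end to end

for queue_depth in (1, 8, 32, 128):
    iops = queue_depth / latency_s
    print(f"QD{queue_depth:>3}: up to ~{iops:,.0f} IOPS")
```

A SLOG lives at the QD1 end of that table, so it will never get anywhere near the marketing numbers.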

Do record sizes affect IOPS, or just the storage efficiency and the space consumed ..?

Both IOPS and efficiency. A large recordsize on ZFS and a small transaction size from the client (e.g. default 128K records on an NFS datastore, and trying to do 4K/8K updates) means a lot of potential for records to need a read-modify-write cycle to be updated. The nature of ZFS to "batch up the changes" into a transaction group helps to offset this, but it still needs to be flushed to the back-end vdevs. Smaller recordsizes lessen the impact of the RMW cycle but lose out on space efficiency and compression (which is applied per-record), and generate more metadata. It's a balancing act. The default of 16K works okay, but I've seen tangible benefits on compression by going to 32K.
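To put a rough number on that RMW cost (a sketch; the 4K client write and the record sizes are assumed values):

```python
# Write amplification when a small client write lands inside an existing record:
# ZFS has to read the old record (if it isn't in ARC), modify it, and write a new one.
KiB = 1024
client_write = 4 * KiB   # e.g. a 4K update arriving over iSCSI/NFS

for record_size in (16 * KiB, 32 * KiB, 128 * KiB):
    touched = record_size + record_size   # read old record + write new record
    amplification = touched / client_write
    print(f"{record_size // KiB:>3}K record: ~{amplification:.0f}x I/O amplification "
          f"for a {client_write // KiB}K update")
```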

Ultimately you need to benchmark your workload, there's no easy or one-size-fits-all answer.
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
Your board doesn't appear to support 32GB in a single stick - best you can do is 2x16GB, so for now buy a single stick of 16GB and run it alongside an 8GB.
Bruh, ugh. Never knew a NAS would be such a money drain. Thank god I didn't buy a 32 GB stick; I completely forgot about checking the motherboard. I think I'm gonna get 16 GB sticks, and if 32 GB is still too slow I'll get a new motherboard and 4 sticks then.
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
You mentioned you're just using it as a torrent target and for some of your Premiere/Blender work. Is there a particular reason you went with iSCSI over SMB? It'll be easier to get performance out of SMB.
I also forgot to mention I have a lot of my games on it too. I have a lot of drives and stuff, but I wanted a big and proper solution that's futureproofed as my main non-boot drive. Hence I wanted iSCSI to be sure it would be compatible with all my current and future use cases.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
At some point, instead of juggling chainsaws while perched atop a ladder precariously balanced over a pit of lava just so that you can maybe, possibly, if the stars align, use the share to run Steam games off of, you should probably just buy a large, cheap SSD for your client and skip the suffering on the server side of things.
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
At some point, instead of juggling chainsaws while perched atop a ladder precariously balanced over a pit of lava just so that you can maybe, possibly, if the stars align, use the share to run Steam games off of, you should probably just buy a large, cheap SSD for your client and skip the suffering on the server side of things.
No, I actively use it for Steam games, and the whole point of this is to avoid SSD costs once I get it up and running properly.
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
Uhh.
[attached screenshot]

Updated earlier today and ran a test. Same results. Now I run the test and consistently get this? Also, my graphs work now, so I'll drop them here in case anyone recognises this behaviour.
Right after the update:
[CrystalDiskMark screenshot]

[reporting graph screenshots]

[attached screenshot]
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
Could it be that the ARC size just increases until it fills all the RAM? When I run the test now I get an ARC hit ratio of almost 100%, and you can see it slowly increasing, as well as being far more bumpy when I start my torrent.
[reporting graph screenshots]
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Could it be that the ARC size just increases until it fills all the RAM? When I run the test now I get an ARC hit ratio of almost 100%, and you can see it slowly increasing, as well as being far more bumpy when I start my torrent.

That's how ARC works, but you said:

Updated earlier today and ran a test.

Did you reboot the TrueNAS VM and then run the test right away? If you restarted, you'll have a whole lot of empty RAM looking for a viable target, and the freshly-created CrystalDiskMark test file will get put there, resulting in the amplified read speeds.
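If you want to check that from the shell rather than the reporting graphs, something along these lines should work on CORE (a sketch; it assumes the usual kstat.zfs.misc.arcstats sysctls are present):

```python
# Peek at ARC size and hit ratio on TrueNAS CORE / FreeBSD via sysctl.
# Assumes the standard kstat.zfs.misc.arcstats counters exist on this system.
import subprocess

def arcstat(name):
    out = subprocess.run(
        ["sysctl", "-n", f"kstat.zfs.misc.arcstats.{name}"],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())

hits, misses, size = arcstat("hits"), arcstat("misses"), arcstat("size")
print(f"ARC size: {size / 2**30:.1f} GiB")
print(f"ARC hit ratio since boot: {100 * hits / (hits + misses):.1f}%")
```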
 