Slow read speed on iSCSI

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
I got my first NAS recently, and I'm getting quite a lot slower read speeds than write speeds. First I bought a U/UTP cable and got 6 MB/s sequential read but 300-700 MB/s write. I then connected it with the 1.5 m Cat6a cable included with the TX401 and got 200 MB/s read. Then I bought an S/FTP cable after someone said it might be interference (my Ethernet cable runs next to an extension cord). Now I get around 200 MB/s read but still 700 MB/s write. Separating the cables so they aren't remotely close only gives a small improvement. Does anyone know what this could be?

Old cable (U/UTP):
unknown.png


New cable (S/FTP), separated from power cable:
newcableaway.png


New cable (S/FTP), routed alongside power cable:
newcabletugged.png


My setup:
Direct 10 Gbps connection on each end (TX401)
15 m Cat6a RJ45 S/FTP cable (my NAS is on my cabinet)
NAS specs:
  • i3-7100
  • Crucial 2x8 GB 2400 MHz CL16
  • 3x4 TB IronWolf
  • ASUS Prime B250M-K
 


TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
I actually had similarly bizarre performance on my system ... I'd have expected read to outperform write.

Reads were perhaps slower than I expected.
Writes were perhaps faster than I expected.

Tested with a 14,000-item library folder (to have something beyond synthetic tests) ...
(the times shown in the pictures were taken from the on-screen clock as timestamps)

SMB vs iSCSI Transfer Test.png


I also have some speed tests, but I have an appointment right now and will have to upload them later.
I thought iSCSI might provide better IOPS, but people don't really mention it much?
(making this an awesome confirmation)
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Write is almost always faster than read, for the simple reason that you can firehose whatever you want at the NAS, and it writes it into the write cache ("your system RAM"), which can be very large depending on how much RAM you have. When the transaction groups fill, the write speeds drop to the actual speeds at which things can be flushed out to the pool.

Read speeds, however, are almost never "full pool speed" because this would require prescience on the NAS side of what you were going to read. ZFS does do some read-ahead, but it does not read-ahead an indefinite amount, so this is almost always slower than write. The time that reads are faster is when the data is already in ARC.
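
You can see this asymmetry from the client side with a minimal sketch in Python (the mount path below is hypothetical, and the client's own page cache will also serve the warm read, so treat the numbers as indicative only):

```python
import os
import time

PATH = "/mnt/nas/testfile.bin"  # hypothetical mount point of the NAS volume
SIZE = 4 * 1024**3              # 4 GiB, big enough to outrun small caches
CHUNK = 1024 * 1024             # 1 MiB per I/O

buf = os.urandom(CHUNK)

def timed_read(path):
    """Sequentially read the whole file; return MB/s."""
    t0 = time.time()
    with open(path, "rb") as f:
        while f.read(CHUNK):
            pass
    return SIZE / (time.time() - t0) / 1e6

# Write: the NAS absorbs this into its RAM-backed transaction groups,
# so throughput looks like cache speed until the groups fill and flush.
t0 = time.time()
with open(PATH, "wb") as f:
    for _ in range(SIZE // CHUNK):
        f.write(buf)
    f.flush()
    os.fsync(f.fileno())
print(f"write:     {SIZE / (time.time() - t0) / 1e6:.0f} MB/s")

# Cold read: data must come off the disks, paced by ZFS read-ahead.
print(f"cold read: {timed_read(PATH):.0f} MB/s")

# Warm read: if the file still fits in ARC (or the client's own page
# cache), this run is served from RAM and should be much faster.
print(f"warm read: {timed_read(PATH):.0f} MB/s")
```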
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
Write is almost always faster than read, for the simple reason that you can firehose whatever you want at the NAS ... The time that reads are faster is when the data is already in ARC.


I dig it, but then why isn't it limited by the SFP+ interface / RAM speed / the CPU's ability to process whatever associated overhead exists?

It's odd, but performance discussions here usually assume the SMB protocol, which leads people to cite IOPS as the performance constraint.
I've yet to see anyone point out the differences between SMB and iSCSI as they pertain to small files (IOPS) ... which I was told is a function of the CPU's ability to "encode" (?) the transfer into the SMB protocol (a term I use loosely, as I'm not sure which terms apply and obviously know SMB isn't a file system), and perhaps other overhead associated with verifying the resource is free (that others aren't using it), etc.

(Actually, I'd like to know what really regulates SMB performance, to better identify which limits I may be able to influence.)

It's unfortunate that whatever limits SMB performance (if it's CPU-bound) hasn't been turned into a dedicated processing module.
GPUs, Bitcoin ASIC miners, encryption processors, HEVC hardware codecs, etc. all earned dedicated silicon for their dedicated tasks ... yet SMB, likely one of the most ubiquitous CPU workloads, has after 25 years still not achieved parity.

That said, I hope my tone doesn't seem accusatory or skeptical...
Which it's not (unless at some point Intel's chief architect reads this). :smile:

Mainly, I'm grateful to have found iSCSI, and I think we need an interface which combines the best of both the SMB and iSCSI worlds.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
why isn't it limited to the SFP+ interface / RAM speed / CPU to process whatever associated overhead exists..?

I could write you a long answer, but, quite frankly, I am very busy right now, and don't have the time.

I'm instead going to refer you to something that talks about latency, but also demonstrates layers.


Please go read that first post, then go down to the "Laaaaaaaatency" section and read that whole section. Let's agree from the start that this isn't literally what happens for each block that is read, because it isn't, for the read path; but nevertheless it is important to think about read as a problem related to this stack of activities.

You're looking at this from the point of view of "why isn't this going as fast as the weakest link in the chain", when in reality, it's quite a bit more like "read performance is the SUM of the weaknesses of all the links in the chain". And each of the chain links has different ways of trying to optimize. For example, the SATA protocol employs NCQ, which allows the controller to issue up to 32 requests to an HDD at a time, but doesn't have the speculative readahead benefits of a hardware RAID controller, which may just decide to read in a lot more just for giggles.
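
To put toy numbers on that (the per-link latencies below are invented for illustration, not measurements from any real system):

```python
# Per-request latency contributed by each link of a hypothetical read
# path, in microseconds. All values are invented for illustration.
layers_us = {
    "client block layer": 20,
    "iSCSI initiator":    30,
    "network round trip": 50,
    "iSCSI target":       30,
    "ZFS (ARC miss)":     100,
    "HDD seek + read":    4000,  # dominates an uncached random read
}

REQUEST = 128 * 1024  # one 128 KiB read request
total_s = sum(layers_us.values()) / 1e6

# With one outstanding request, the links' latencies add up serially,
# so throughput is capped by the SUM of the chain, not the weakest link.
single = REQUEST / total_s / 1e6
print(f"queue depth  1: ~{single:.0f} MB/s")

# Deeper queues (e.g. NCQ's 32 outstanding commands) overlap the links,
# which is how reads claw back throughput despite the summed latency.
for depth in (4, 32):
    print(f"queue depth {depth:2d}: ~{single * depth:.0f} MB/s (ideal overlap)")
```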

I am hoping you have the "aha" moment followed by the "oh $#!+ how does this even manage to work as well as it does" moment. :smile:
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
Read speeds, however, are almost never "full pool speed" ... The time that reads are faster is when the data is already in ARC.
How would you improve this? I just upgraded to 16 GB of RAM, but I think I often max it out. It doesn't make much sense, though, since the cable made such a significant impact. CPU usage is usually only a few percent. Would getting more RAM solve it, or only increase the cache? Oh, and what about an SSD cache? I heard that helps too, but I don't really know.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
How would you improve this?

Well, in YOUR case, I would get rid of the TX401, because the general experience is that crappy network cards tend to yield crappy performance with weird issues. You should follow the hardware recommendations in the 10 Gig Networking Primer for best results.


I just upgraded to 16 GB ram

I will also note, wryly, that 16GB is the minimum memory requirement, as outlined on the download page.


Additionally, we really don't recommend iSCSI with less than 64GB of RAM. Some of the reasons for that are described here:


Others are here:

 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
Synthetic benchmarks via the macOS port of CrystalDiskMark (AmorphousDiskMark)
10TB HGST TrueNAS config.png

(Yes, I know this isn't "pretty" ... I suck at art but it makes the point).

SMB v iSCSI (no sync) MBs + IOs.png
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
You're looking at this from the point of view of "why isn't this going as fast as the weakest link in the chain", when in reality, it's quite a bit more like "read performance is the SUM of the weaknesses of all the links in the chain".

Very clear writing (indicative of clear thinking as well). Thank you. It's impressive that so many mods / experts here have not only deep knowledge, but across a wide breadth of subjects. (I can only imagine the complexity of the projects you've had to hit your head against a wall over.)

Even before reading that suggestion, the distinction between aggregate weaknesses and the weakest link has already reframed my perspective.

Again, thanks.
 

TrumanHW

Contributor
Joined
Apr 17, 2018
Messages
197
CPU usage is usually only a few percent. Would getting more RAM solve it, or only increase the cache?

It was hard for me to believe initially (and this pertained to SMB / NFS shares, whereas we're talking iSCSI) ... but apparently the CPU is under greater demand than is assumed ... and I THINK the reporting of CPU utilization can be misleading as well.
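
For what it's worth, one way the reporting can mislead (hypothetical numbers, just to illustrate): a single-threaded bottleneck pegs one core while the average across all cores stays low.

```python
# Hypothetical per-core utilization on a 4-core box where one worker
# thread (or interrupt queue) is pinned to a single core.
cores = [100, 3, 2, 4]  # core 0 is the bottleneck

avg = sum(cores) / len(cores)
print(f"reported average: {avg:.0f}%  (looks nearly idle)")
print(f"hottest core:     {max(cores)}% (the real constraint)")
```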

Which CPU do you use?
I know I had a processor with no Hyper-Threading and slow clock speeds ... and upon upgrading it I did see improvements.

I have a third unit that's almost identical to another two, in which I've yet to install the CPU upgrades; time permitting, I'll test and report back.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Thank you. It's impressive that so many mods / experts here ..

We aim to please. That's what a community is for. Lots of people with a variety of experience. You wouldn't want to ask me about SMB. I might know less than you! Everyone has their skill set...
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Synthetic benchmarks via the macOS port of CrystalDiskMark (AmorphousDiskMark) ...
(Yes, I know this isn't "pretty" ... I suck at art but it makes the point).
What you're likely seeing here is a fine illustration of the throughput-vs-IOPS trade-off that you get by tuning recordsize. Datasets default to 128K records (which is beneficial to the larger sequential I/O of the 1M tests), whereas zvols default to a 16K volblocksize (which is better for the small 4K tests).

Also, with sync=disabled that RMS-200 is doing absolutely nothing other than heating the room by 25W. I assume this setting was only used for the tests?
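
The arithmetic behind that trade-off, as a simplified sketch (it ignores compression, metadata, and partial-record writes, and is not ZFS's actual read path):

```python
from math import ceil

def amplification(io_bytes, block_bytes):
    """ZFS reads and checksums whole records, so a small random read
    drags in the entire record that contains it."""
    return max(block_bytes, io_bytes) / io_bytes

def blocks_touched(io_bytes, block_bytes):
    """Each block carries its own pointer/checksum bookkeeping, so large
    sequential I/O over small blocks pays more per-block overhead."""
    return ceil(io_bytes / block_bytes)

KIB = 1024
for io in (4 * KIB, 1024 * KIB):      # the 4K and 1M test sizes
    for bs in (16 * KIB, 128 * KIB):  # zvol vs dataset defaults
        print(f"{io // KIB:>5} KiB I/O on {bs // KIB:>3} KiB blocks: "
              f"{amplification(io, bs):>4.0f}x data moved, "
              f"{blocks_touched(io, bs):>3} block(s) of overhead")
```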
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
Well, in YOUR case, I would get rid of the TX401 ... You should follow the hardware recommendations in the 10 Gig Networking Primer for best results.

I will also note, wryly, that 16GB is the minimum memory requirement ... Additionally, we really don't recommend iSCSI with less than 64GB of RAM.
How crap is the TX401? I don't have unlimited money.
64 GB of RAM????? What. Well, if it uses that much RAM, that might be it.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
How crap is the TX401? I don't have unlimited money.
64 GB of RAM????? What. Well, if it uses that much RAM, that might be it.

I don't know how crap the TX401 is. My boss doesn't have unlimited money either, so I am usually required to research stuff that works well before any purchase orders are signed. Speculative purchases of unproven hardware are relatively rare. However, I will note that the TX401 is basically the Aquantia junk, which is to 10G what Realtek or Marvell is to 1G. It isn't supposed to be tragically unusable, but it is aimed at the PC market, and developers have called its driver poor quality in the past (I think the words may have been "development quality"). This card is probably fine for a client PC where you just want to do a little web browsing or faster-than-1Gbit file transfer. But given that the driver is only about two years old and even its authors don't seem to love it, and that it is really intended to be a "cheap" solution for PC clients supporting PCIe x4 and RJ45 copper, it is easy to see that this thing doesn't hold a candle to a used Solarflare SFN6122 that costs a third of the money, or a used Intel X520/710 that costs about the same, sports fiber, and has drivers written and tuned for years by dedicated driver teams at their respective companies. So from my perspective, I could have spent the same money you did and received something known to work very well.

As for memory, iSCSI is a weird thing, and as @HoneyBadger notes, it isn't really an apples-to-apples comparison.
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
However, I will note that the TX401 is basically the Aquantia junk, which is to 10G what Realtek or Marvell is to 1G ...

As for memory, iSCSI is a weird thing, and as @HoneyBadger notes, it isn't really an apples-to-apples comparison.
I see. Well, I didn't really want to have to get SFP+ and the transceivers and stuff. It's only 15 meters away, up on my cabinet. I'm the only one using it, mostly for some torrenting, but also for things like Premiere Pro, Blender, and other heavier tasks at times. The TX401 would be fine then, right? I kinda figured it would be RAM, since this is what the dashboard looks like most of the time:
1643885730530.png

But I just wanted to make extra sure I really need more RAM. Could getting a disk cache help with this? I currently have a 120 GB Apple SSD as my boot drive, but I could maybe get a SATA SSD for the OS and then use the NVMe as cache to work around the RAM limitations?
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Aquantia junk, which is basically to 10G
I wouldn't put it in the Realtek trash bin. Marvell, sure, but they're not Realtek-bad, from what I hear. It's not like Realtek is getting any better either; the RTL8125 seems to be every bit as crappy as the RTL8111 series. And don't get me started on their RTL8153 GbE USB chipset; that thing has cost me way too many hours of my life.
Sidenote: the fact that Realtek calls their NIC lineup a "zoo" should really be a hint that little good is going to come out of it:
zoo/category/network-interface-controllers-10-100-1000m-gigabit-ethernet-usb-3-0-software

could maybe get a sata ssd for the OS then use the nvme as cache to solve the ram limitations?
Not with so little RAM. Maybe if you had 64 GB of RAM, but to use L2ARC you need to keep its pointers in ARC, consuming DRAM.
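
Rough numbers on why (a sketch; the ~70 bytes per record figure is an assumption that varies by OpenZFS version, and the 480 GB SSD is just an example):

```python
HEADER_BYTES = 70  # assumed ARC header cost per L2ARC record; version-dependent

def l2arc_header_ram(l2arc_bytes, record_bytes):
    """Every record cached on the L2ARC SSD needs a header kept in ARC
    (i.e. DRAM), so a big L2ARC full of small records eats the very RAM
    it was supposed to relieve."""
    return l2arc_bytes / record_bytes * HEADER_BYTES

for record_kib in (16, 128):
    cost = l2arc_header_ram(480e9, record_kib * 1024)  # 480 GB example SSD
    print(f"480 GB L2ARC @ {record_kib:>3} KiB records: "
          f"~{cost / 2**30:.1f} GiB of ARC spent on headers")
```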
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Realtek trash bin. Marvell, sure, but they're not Realtek bad,

The card targets PC end users, tries to be a "cheap" card (fsvo "cheap"), has a driver not even its authors respect, and has repeatedly shown up in these forums with various problems or performance issues.

Okay, UNLIKE Realtek, they were non-garbage enough to get bought out (by Marvell), but then again, they were a founding member of the NBASE-T Alliance. Optimists will say "that put their name alongside Cisco and Xilinx". Pessimists will say "it is a massive grift on the networking world to have this stupid 2.5/5G technology".

Complicated call. Possibly an argument that they are not Realtek bad, but worse. I know that's not quite what you meant. :smile:
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
As for memory, iSCSI is a weird thing, and as @HoneyBadger notes, it isn't really an apples-to-apples comparison.

That note was more about the benchmarks posted by @TrumanHW mid-thread.

For the OP @Askejm - I assume they're using SMB - in which case the question of "how do I go faster" is probably answered by "more disks in a faster configuration". My assumption, with 3x4TB drives and a disk of 6656GB "as seen by Windows", is that this is RAIDZ1.
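
A back-of-envelope check on that guess (the reading of the gap is an assumption):

```python
# RAIDZ1 spends one disk's worth of space on parity, so the usable
# ceiling for 3x4TB is roughly two disks' capacity.
disks, disk_bytes = 3, 4e12
usable_gib = (disks - 1) * disk_bytes / 2**30
print(f"~{usable_gib:.0f} GiB ceiling before overhead")  # ~7451 GiB

# The observed 6656 GB is consistent with a zvol deliberately sized
# below that ceiling (6656 GiB is exactly 6.5 TiB), leaving headroom
# for ZFS metadata and free-space slop.
```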
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
Not with so little RAM. Maybe if you had 64 GB of RAM, but to use L2ARC you need to keep its pointers in ARC, consuming DRAM.
Hmmm. Would 64 GB be the minimum then? Or would 32 be enough? Also, will RAM speed be a major factor? I want to cut corners wherever I can get away with it; I don't have a lot of money and this is very much a budget NAS, so what speed would be the best value? (Specs in the original post.)
 

Askejm

Dabbler
Joined
Feb 2, 2022
Messages
34
I assume they're using SMB ... my assumption with 3x4TB drives and a disk of 6656GB "as seen by Windows" is that this is RAIDZ1.
I'm using iSCSI, as stated in the original post. Yes, it is RAIDZ1. And while I don't know how well OpenZFS scales, I'd assume probably around 500-600 MB/s, which lines up fairly closely with the write speed. The results in the original post were also averaged over 9 tests.
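
For what it's worth, a rough sequential ceiling under an assumed per-disk streaming rate (4 TB IronWolf drives are rated somewhere near 180-210 MB/s on outer tracks):

```python
# RAIDZ1 stores one disk's worth of parity, so sequential reads scale
# roughly with the number of data disks, not the total disk count.
data_disks = 3 - 1   # 3-wide RAIDZ1
per_disk_mbs = 200   # assumed streaming rate for a 4 TB IronWolf
print(f"~{data_disks * per_disk_mbs} MB/s best-case sequential from the pool")
```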
 