Bad drive speeds, not sure what I'm missing


Stopsign002

Cadet
Joined
Apr 22, 2018
Messages
8
Specs:
Dell T110 II
Xeon E3-1220v2
16GB RAM
4x Hitachi A7K2000

I have the system set up in what is basically RAID 10, I believe (I made a mirror, then extended the pool with another mirror).
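
For reference, that layout is striped mirrors (ZFS's RAID 10 equivalent). Assuming a pool named tank and drives ada0 through ada3 (placeholder names, not necessarily what my system actually uses), the CLI equivalent of what I did in the GUI would look roughly like:

   zpool create tank mirror ada0 ada1    # create the pool with the first mirror vdev
   zpool add tank mirror ada2 ada3       # extend the pool with a second mirror vdev
   zpool status tank                     # confirm both mirror vdevs are listed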

These disks were being used as an iSCSI target before in the same system, but I was running them in RAID 5 through the RAID controller, with Windows as the host OS and FreeNAS in a VM. I decided to move to FreeNAS on bare hardware after testing it for a while and liking it.

My drive speeds have been pretty abysmal since the move. I have set the BIOS to just pass the drives through to the OS (AHCI mode), so the RAID controller is no longer in the picture. I am getting around 40MB/s read and write in this configuration (as seen from the FreeNAS monitoring tab), whereas before I was getting 90MB/s even with the RAID 5 performance hit. Sometimes it will run faster for a minute or two, at which point it maxes out the 1 Gig connection (around 110MB/s, so pretty great). It looks like a plateau in the FreeNAS monitoring graph, but it doesn't last long.

I've spent a lot of time googling around the forums last night, but I couldn't find any answers. My system appears to be spec'd correctly from what I can see: the CPU is not tapping out at all, and I have the baseline 8 Gigs of RAM plus another 1 Gig per TB on top of that.

The weirdest part to me is that one day I'm running it in a terrible setup (RAID controller, as a VM, etc.) and the next I'm running it the 'correct' way with a performance-oriented RAID layout, and it's far worse. I am guessing I am just missing something somewhere, but I don't know what. Any help would be hugely appreciated. Thanks!
 

kdragon75

Wizard
Joined
Aug 7, 2016
Messages
2,457
What are you using as an HBA? How are you testing? What kind of network card do you have?
 

Stopsign002

Cadet
Joined
Apr 22, 2018
Messages
8
I have the server and the NAS directly attached using their onboard NICs; there is no switch in the middle. I am using the onboard NIC on the T110 II, and the client is an X399-based server using its onboard NIC.

Same setup as I had before, when FreeNAS was virtualized.

I am currently testing using my iSCSI setup, moving large (5-20 gig) files to the disks.
 

Stopsign002

Cadet
Joined
Apr 22, 2018
Messages
8
Ok, so I just did a test with SMB, and that pegs the network at pretty much full line rate (113MB/s). So there may be some changes I need to make with iSCSI in particular. I am going to let this test run a while longer to make sure it stays at that rate, then look into iSCSI options. If anyone has any ideas, let me know. Thanks!
 

Stopsign002

Cadet
Joined
Apr 22, 2018
Messages
8
After doing more reading about ZFS and iSCSI, it sounds like ZFS isn't very good for it. So I will plan to use UFS instead and see if that solves my performance issues. If it doesn't, I may just serve the iSCSI target from another OS.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
After doing more reading about ZFS and iSCSI, it sounds like ZFS isn't very good for it. So I will plan to use UFS instead and see if that solves my performance issues. If it doesn't, I may just serve the iSCSI target from another OS.
You can't do UFS with FreeNAS.
 

toadman

Guru
Joined
Jun 4, 2013
Messages
619
iSCSI is a special case in FreeNAS. It pretty much requires that you have a SLOG (Separate LOG) device...

Testing the benefits of SLOG
https://forums.freenas.org/index.php?threads/testing-the-benefits-of-slog-using-a-ram-disk.56561
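
If you end up adding one, it is a single command from the shell (the GUI can do it as well). This is just a sketch assuming the pool is named tank and the NVMe device shows up as nvd0; substitute your real names:

   zpool add tank log nvd0    # attach the device as a dedicated SLOG
   zpool status tank          # the device should now appear under a separate logs section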


Or you can set sync=disabled if sync writes are what is slowing everything down, which no one would advise in a production system unless one knows the risks and accepts them.

For testing, before getting an SLOG, the OP could set sync=disabled on the iscsi dataset to see if that improves the performance (it likely will). If that setting speeds things up then so will a proper SLOG. :)
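
As a sketch, assuming the zvol or dataset backing the iSCSI extent is called tank/iscsi (substitute the real name):

   zfs get sync tank/iscsi             # check the current setting (standard by default)
   zfs set sync=disabled tank/iscsi    # testing only: writes are acknowledged before they reach stable storage
   zfs set sync=standard tank/iscsi    # revert once the test is done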
 

Stopsign002

Cadet
Joined
Apr 22, 2018
Messages
8
Or you can set sync=disabled if sync writes are what is slowing everything down, which no one would advise in a production system unless one knows the risks and accepts them.

For testing, before getting an SLOG, the OP could set sync=disabled on the iscsi dataset to see if that improves the performance (it likely will). If that setting speeds things up then so will a proper SLOG. :)

Ha, too late. I bought a 16GB NVMe drive for the SLOG (it was only $50 for a PCIe-to-NVMe card and the disk, so whatever).

It helped a great deal, but I am still seeing some drops in performance. It's the same pattern I was seeing before, but the drops are WAY less severe now (ignore the huge drop; I paused a transfer there):

[attached screenshots: FreeNAS reporting graphs of throughput during the transfer]


I am thinking maybe I just need to go up to 24 or 32 gigs of RAM at this point? Any ideas what it means that the SLOG helped a ton but didn't totally fix the issue? Before the SLOG I was seeing lows around 20-30MB/s instead of 40MB/s.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I don't know how much reading you have done on this, but the standard guidance is to max out RAM before adding SLOG or L2ARC because the caching that is done in RAM adds more performance. Additionally, the speed with which the SLOG can respond (latency) is critical to how well the system performs.
Here is a video I share to help illustrate the difference in performance. It is about L2ARC, but similar reasoning applies to SLOG, even though SLOG is not actually a cache.
https://www.youtube.com/watch?v=oDbGj4YJXDw&t
 

toadman

Guru
Joined
Jun 4, 2013
Messages
619
Yes, but maxing out RAM does zilch for sync writes. So if sync writes are the problem, a SLOG is (at least part of) the answer. The fastest SLOG would be NVDIMM with some storage driver. The rest depends on the latency of the HW and SW stack (whatever that ends up being for the chosen SLOG).

Were I the OP I would still want to see the system performance with sync=disabled because that is going to be your high performance mark (all else being equal) assuming sync writes are what is slowing you down. If that's not what's slowing you down, then yes, by all means invest in as much system RAM as you can afford. (But test where your diminishing returns are for your workload.)
 

toadman

Guru
Joined
Jun 4, 2013
Messages
619
Also, what is the average write throughput of the pool with the network out of the equation? That may still be a bottleneck. (Doubtful, but has it been tested?) You mentioned the pool is a striped pair of mirrors, so on your graph half the data is going to one mirror and half to the other. Each half looks to be 40 MB/s on the low end and 60 MB/s on the high end, so 80-120 MB/s for the pool.

That matches what would be flushed from RAM to the pool, as the SLOG is showing 80 MB/s on the low end and 115 MB/s on the high end (which is what you would expect for a GbE network, and matches what you saw with SMB). So are ALL the writes sync writes? It appears that is the case.
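
One way to sanity check that is to watch per-vdev traffic during a transfer; if the log vdev is seeing roughly the same bandwidth as the incoming data, the writes are sync. Assuming the pool is named tank:

   zpool iostat -v tank 5    # per-vdev bandwidth, refreshed every 5 seconds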

What happens when you do some dd testing on the pool (compression off, large block sizes, 5x or more your system ram size for the transfers)? What sustained write and read performance are you getting?
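
Something like this, assuming 16GB of RAM, a pool named tank, and a throwaway dataset created just for the test:

   zfs create -o compression=off tank/ddtest                        # compression off so zeros are not compressed away
   dd if=/dev/zero of=/mnt/tank/ddtest/test.dat bs=1m count=81920   # ~80GiB sequential write, about 5x RAM
   dd if=/mnt/tank/ddtest/test.dat of=/dev/null bs=1m               # sequential read back
   zfs destroy tank/ddtest                                          # clean up afterwards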

-------------------------
As an aside, you can play with vfs.zfs.txg.timeout a bit to help "smooth out" writes in some cases. It may (will) lower overall throughput, but some folks have seen it stop the dramatic pauses or temporary freezes during a transfer. I think FreeNAS now has the default for that tunable at 5; set it to 1 and see how the results compare. (And technically you should destroy the pool between tests so you are not leaving a fragmented pool, which may artificially slow performance. But if you are under 10% utilized, I wouldn't worry about it.)
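
To experiment with it from the shell (it can also be made persistent as a sysctl-type tunable in the FreeNAS GUI):

   sysctl vfs.zfs.txg.timeout      # show the current value (should report 5)
   sysctl vfs.zfs.txg.timeout=1    # flush transaction groups every second for this test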
 

Stopsign002

Cadet
Joined
Apr 22, 2018
Messages
8
After the dip between 10:50 and 11:00 is where I turned sync off. Still not maxing my network connection. I can borrow RAM from work, so I am going to do that and see how it acts with 32 gigs. I'll also do a dd test on the pool and see what I get.

[attached screenshot: FreeNAS reporting graph of throughput before and after disabling sync]
 

Stopsign002

Cadet
Joined
Apr 22, 2018
Messages
8
Double-check my math, but I get about 139MB/s on this test:

[attached screenshot: dd test output]
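
(For anyone checking the math: dd reports total bytes transferred and elapsed seconds, so MB/s is roughly bytes / seconds / 1,048,576; the exact figures are in the screenshot above, and 139MB/s is my rounded result.)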


Seems appropriate for these drives in this configuration
 

toadman

Guru
Joined
Jun 4, 2013
Messages
619
Interesting, I would have expected a bit higher with an empty pool (with no fragmentation) actually. Maybe 200 MB/s or so for the stripe. Either way, 140 MB/s is above the network speed.

I will be interested to see if adding RAM does anything. If sync writes are the bottleneck, I don't think the RAM will help. But it will certainly help reads in a big way! Especially on random I/O.

What is your iscsi client by the way? Just curious.

-----------------

I played with iSCSI for a while, as my FreeNAS box is the shared storage for my ESXi systems. I never could get the performance I wanted out of it, even using multipath to (in theory) eliminate the network as a bottleneck. I never could get above 1 Gbps line speed, even inside the same ESXi host that runs the FreeNAS VM (which serves storage to that host, i.e. the network is all virtual inside that host). NFS got me to 3 Gbps (with sync=disabled, in a home lab with a UPS and nightly backups).
 