Slow iSCSI Read Performance

mrstevemosher

Dabbler
Joined
Dec 21, 2020
Messages
49
That's VM performance, and VM performance is fine. Everything runs and starts really fast. It's the vMotioning and such that I have problems with. That would also probably explain why backups are slow as well - it's mounting the vCenter snapshot.

Almost identical issues to ours. Writes about 250MB then stops ... grabs 30MB then stops ...

We'll keep watching this thread for assistance.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Your disks just seem to be sitting idle during heavy reads. They're not delivering a lot of bandwidth each but they aren't taking a lot of time to do it either; if it was a case of heavy fragmentation I'd expect higher ms/read latencies.

What does the ARC hitrate look like during an svMotion operation? I know that prefetching has big impacts on svMotion under NFS because it's able to accurately pick up on the read pattern from the .vmdk file, but iSCSI is block-level so it's more scattered.
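
If it's easier to watch from a shell than from the reporting graphs, the bundled OpenZFS tools will show it live - a rough sketch, and note the exact tool names vary by release (arcstat.py / arc_summary.py on older FreeNAS builds):

# hit/miss counters and hit% every 5 seconds - run while the svMotion is going
arcstat 5

# one-shot report that also breaks out prefetch efficiency
arc_summary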

What HBA and drives are being used in the host?
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
Your disks just seem to be sitting idle during heavy reads. They're not delivering a lot of bandwidth each but they aren't taking a lot of time to do it either; if it was a case of heavy fragmentation I'd expect higher ms/read latencies.

What does the ARC hitrate look like during an svMotion operation? I know that prefetching has big impacts on svMotion under NFS because it's able to accurately pick up on the read pattern from the .vmdk file, but iSCSI is block-level so it's more scattered.

What HBA and drives are being used in the host?

Attached are screenshots of an svMotion from the iSCSI SAN to local M.2. I am svMotioning my vCenter, which is about 100GB right now. The first screenshots are from the iSCSI mirror pool to local M.2.

I then performed a test svMotion from local M.2 to a test Samsung 850 Pro SSD via iSCSI. This was weird because it transferred at like 50Mb/sec to this test iSCSI share. Not sure if this 850 Pro is a failing drive or what (it used to be my L2ARC). I then svMotioned back to local M.2 and transfers were about 150-300MB/sec.

I then svMotioned the vCenter back to the standard iSCSI SAN mirror pool and transfers were anywhere from 300-700MB/sec.

As for drives, I know you are instantly going to say that's my problem, and I am sure it is contributing to it somewhat, but I don't think it's the full issue.
I have a mix of drives.
I have a Seagate Barracuda - ST2000DM006-2DM164
A couple of wd greens - WD20EARS
And some other Barracudas - ST2000DM001-9YN164

The drives are older and I have plans to replace them in the future with shucked 8TB Elements. I'm currently using these drives because they were all free. I have replaced every drive showing early signs of failure and have had no problems resilvering. I was initially running 2x6 RAIDZ2, but switched to mirrors to try to get extra performance. I didn't really get much.
 

Attachments

  • iSCSI to local m.2 arc hit.png
  • iSCSI to local m.2 arc prefetch.png
  • iSCSI to local m.2 network traffic.png

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
The ARC hitrate diving under svMotion is expected, because it's pulling data that isn't touched normally. The prefetch algorithm seems to take one crack at it and then give up, as if it says "I'm not helping here" and shuts down. However, more RAM isn't an option here, I assume, looking at the hardware.

As for drives, I know you are instantly going to say that's my problem, and I am sure it is contributing to it somewhat, but I don't think it's the full issue.

The drives are a bit of a mixed bag - the Seagates are fine, the WD Greens maybe less so - but as long as the Greens aren't being given enough time to go idle and park their heads it should be okay. Nothing is SMR which is the main bullet point to avoid. Mixing 5400 and 7200rpm will cause the effect of "riding the brakes" a bit for the faster spindles, but if there's no drives or vdevs that are actually giving errors and stalling out it should be okay.
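
If you want to rule out a single sick disk dragging a vdev down, a quick check from the shell - the device name below is just an example, substitute your actual da/ada numbers:

# any read/write/checksum errors or degraded vdevs?
zpool status -v

# per-disk SMART health - reallocated/pending sectors and CRC errors are the usual suspects
smartctl -a /dev/da0 | grep -iE "result|reallocated|pending|crc"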

The only thing I can think of is that your data is highly fragmented. Unfortunately, ZFS only shows free-space fragmentation, in the zpool list FRAG column.
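
For reference, that's this view - the pool name is a placeholder, and FRAG is free-space fragmentation, not data fragmentation:

zpool list -o name,size,allocated,free,fragmentation,capacity YourPool

# per-vdev breakdown
zpool list -v YourPool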
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
The ARC hitrate diving under svMotion is expected, because it's pulling data that isn't touched normally. The prefetch algorithm seems to take one crack at it and then give up, as if it says "I'm not helping here" and shuts down. However, more RAM isn't an option here, I assume, looking at the hardware.



The drives are a bit of a mixed bag - the Seagates are fine, the WD Greens maybe less so - but as long as the Greens aren't being given enough time to go idle and park their heads it should be okay. Nothing is SMR which is the main bullet point to avoid. Mixing 5400 and 7200rpm will cause the effect of "riding the brakes" a bit for the faster spindles, but if there's no drives or vdevs that are actually giving errors and stalling out it should be okay.

The only thing I can think of is that your data is highly fragmented. Unfortunately, ZFS only shows free-space fragmentation, in the zpool list FRAG column.

I forgot to mention that I am using a 9200-8i in IT mode with a SAS expander card.

The SSD test is direct-attached through local SATA ports.

zpool list:

NAME         SIZE   ALLOC  FREE   CKPOINT  EXPANDSZ  FRAG  CAP  DEDUP  HEALTH  ALTROOT
iSCSI-Pool   10.9T  4.48T  6.40T  -        -         20%   41%  1.00x  ONLINE  /mnt

Any thoughts on how I might fix the fragmentation to see if that will help?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
The only way to correct the fragmentation is to migrate the data off and then back on again, hoping that it's written sequentially. Having it be in VMDKs complicates the matter further because if there's fragmentation of the filesystem inside the VMDK (eg: at the NTFS level) then it could copy them "in-order" from the VM perspective but still have to read them "out of order" in the guest.

There's no easy fix to this unfortunately, only throwing massive amounts of ARC and L2ARC at it or having your vdevs themselves be faster (eg: flash) to be able to mitigate the impact of the random I/O.
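
If you ever want to try the off-and-back-on rewrite locally instead of via svMotion, a rough sketch is a zfs send/receive of the zvol into a fresh copy, then repointing the iSCSI extent at it. The names below are hypothetical, the copy is only crash-consistent unless the VMs on it are shut down first, and you need enough free space for a second copy:

zfs snapshot iSCSI-Pool/vm-zvol@rewrite
# the receive writes the data out fresh, which tends to lay it down far more sequentially
zfs send iSCSI-Pool/vm-zvol@rewrite | zfs receive iSCSI-Pool/vm-zvol-copy
# repoint the extent at the copy, verify, then destroy the old zvol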

Thought here - if you do a local test just dd'ing a zvol device into /dev/null (bs=64K should match svMotion granularity) what kind of speeds do you see there?
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
The only way to correct the fragmentation is to migrate the data off and then back on again, hoping that it's written sequentially. Having it be in VMDKs complicates the matter further because if there's fragmentation of the filesystem inside the VMDK (eg: at the NTFS level) then it could copy them "in-order" from the VM perspective but still have to read them "out of order" in the guest.

There's no easy fix to this unfortunately, only throwing massive amounts of ARC and L2ARC at it or having your vdevs themselves be faster (eg: flash) to be able to mitigate the impact of the random I/O.

Thought here - if you do a local test just dd'ing a zvol device into /dev/null (bs=64K should match svMotion granularity) what kind of speeds do you see there?

If I perform that test on my live zvol, will it do anything to my data currently on there? Or should I create a new test zvol?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
If I perform that test on my live zvol, will it do anything to my data currently on there? Or should I create a new test zvol?
You're just reading the contents so it shouldn't have any ill effect. Maybe limit the count parameter if you're worried about it potentially hogging I/O from your VMs, and be ready to Ctrl-C it.

(Just don't reverse the if and of parameters and accidentally dd /dev/zero onto the zvol)
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
You're just reading the contents so it shouldn't have any ill effect. Maybe limit the count parameter if you're worried about it potentially hogging I/O from your VMs, and be ready to Ctrl-C it.

(Just don't reverse the if and of parameters and accidentally dd /dev/zero onto the zvol)

Kind of a beginner, can you give me the full command to run?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Kind of a beginner, can you give me the full command to run?

You'll have to find your zvol name from ls /dev/zvol which should show you all of them, then use the command below:

dd if=/dev/zvol/name_goes_here of=/dev/null bs=64K count=1M status=progress

This will copy the data in 64K chunks (which is the general size svMotion will use) with a count of 1 million chunks, reading a total of 64K*1M=64G of data and dumping it into the /dev/null void while reporting the progress along the way and finally at the end. If some of that first 64G is in ARC it will come from there, otherwise it will pull from the disks. To pull a different amount you can change the count parameter, or if you want to read the whole thing (!) remove it entirely.

Keep an eye on your overall ARC hit rate as well as ensure that your VM performance doesn't crater. If it does, cancel the command with Ctrl-C.
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
You'll have to find your zvol name from ls /dev/zvol which should show you all of them, then use the command below:

dd if=/dev/zvol/name_goes_here of=/dev/null bs=64K count=1M status=progress

This will copy the data in 64K chunks (which is the general size svMotion will use) with a count of 1 million chunks, reading a total of 64K*1M=64G of data and dumping it into the /dev/null void while reporting the progress along the way and finally at the end. If some of that first 64G is in ARC it will come from there, otherwise it will pull from the disks. To pull a different amount you can change the count parameter, or if you want to read the whole thing (!) remove it entirely.

Keep an eye on your overall ARC hit rate as well as ensure that your VM performance doesn't crater. If it does, cancel the command with Ctrl-C.

When trying to run it on the test zvol I have set up on a different pool, I get an error saying the zvol is a directory. Should I be specifying a specific filename or anything?

Also, when doing a vMotion, I don't really experience any VM degradation.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
When trying to run it on the test zvol I have set up on a different pool, I get an error saying the zvol is a directory. Should I be specifying a specific filename or anything?

The first level is a directory, yes - the zvol is nested inside the directory of your pool name. For example: dd if=/dev/zvol/PoolName/ZvolName of=/dev/null bs=64K

Also, when doing a vMotion, I don't really experience any VM degradation.

Understandable, but if the local dd job decides to run at a much faster pace (indicating there's possibly a network or VMware configuration issue in play), then it might pummel your disks hard enough that any ARC misses suffer very slow times. Run a few separate terminal sessions to monitor things like htop and gstat -dp to see how busy your disks get and how high the ms/read times are.
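
A sketch of what I'd have open in those other sessions (the one-second interval is just a suggestion):

# per-disk busy %, queue length, and ms/r / ms/w, refreshed every second
gstat -dp -I 1s

# per-vdev ops and bandwidth for the pool, every second
zpool iostat -v iSCSI-Pool 1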
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
The first level is a directory, yes - the zvol is nested inside the directory of your pool name. For example: dd if=/dev/zvol/PoolName/ZvolName of=/dev/null bs=64K



Understandable, but if the local dd job decides to run at a much faster pace (indicating there's possibly a network or VMware configuration issue in play) then it might pummel your disks hard enough that any ARC misses will suffer very slow times. Run a few separate terminal sessions to monitor things like htop and gstat -dp to see how busy your disks get and how high the ms/read times are.

Lol, not sure why I am having so much trouble with this. SSDTest is the pool name for the test and SSD-zvol is the zvol name. When I run the command below, I get this error: dd: /dev/zvol/SSDTest/SSD-zvol/: Not a directory

dd if=/dev/zvol/SSDTest/SSD-zvol/ of=/dev/null bs=64K count=1M status=progress
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
Drop the trailing slash in your if parameter.

dd if=/dev/zvol/SSDTest/SSD-zvol of=/dev/null bs=64K count=1M status=progress

I ran what you suggested and it seemed to run OK. I didn't notice any performance issues. It started off slow at about 35-50 Mb/sec, but sped up. Disk busy didn't go above 30% or anything like that. htop didn't really show anything out of the ordinary either - some spikes on some cores, but no major usage.

(screenshot attached)
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Your disk is able to deliver decent speed locally, but it's choking remotely.

Are you familiar with esxtop on vSphere/ESXi? I'd start there with the disk panels (start esxtop from an SSH session, then flip to the disk panels with hotkeys v, d, or u for disk stats by VM, adapter, or device).
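
If the interactive view is awkward to capture, esxtop also has a batch mode that dumps everything to a CSV you can dig through afterwards - the delay/iteration values here are arbitrary:

# 5-second samples, 60 iterations (~5 minutes), written to a CSV
esxtop -b -d 5 -n 60 > /tmp/esxtop-svmotion.csv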
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
Not exactly sure what I am looking for, but here are some screenshots when I am doing an svMotion.

(screenshots attached)
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
Looks like you're pulling in about 70MB/s of svMotion speed there. DAVG/cmd (device average latency) values of 26.77ms are pretty rough, though. What's the network configuration/pathing for the ESXi host to TrueNAS?

Network configuration: each host has one of those 10Gb cards, which are two ports each. One port on each host is for the iSCSI SAN connection; the other port is SFP and used for internet (I'm hosting the router on my hosts). Each port connected to the iSCSI network goes into a CRS305-1G-4S+IN switch, and the SAN goes into the switch on a single port. Yeah, I know I should have multipath - maybe in the future. Jumbo frames are enabled.
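
I still need to double-check that the 9000 MTU actually survives end-to-end on the iSCSI network - I believe a don't-fragment ping like this is the usual test (the addresses below are placeholders):

# from the ESXi shell: 8972-byte payload + headers = 9000; add -I vmkN to pin it to the iSCSI vmkernel port
vmkping -d -s 8972 <TrueNAS-iSCSI-IP>

# from the TrueNAS shell back toward the ESXi vmkernel port
ping -D -s 8972 <ESXi-vmk-IP>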

As a test (and something I don't understand, as this is the first time I have ever set up a physical SAN using iSCSI), I noticed ESXi could enable iSCSI on the physical adapter - screenshot below. I was originally configuring iSCSI through the software iSCSI adapter. I turned off software iSCSI and configured it directly on the port itself, and performance got so much worse (other screenshots below). I figured hardware iSCSI would be better than software, but with this config I am getting like 5-10MB/sec and it is abysmal. Transfers back to the SAN are still good, but svMotion off is terrible right now.
(screenshots attached)


I previously had iSCSI configured through the software adapter with no port binding configured. I am probably going to go back to software as hardware is incredibly slow right now.
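
Once I'm back on the software adapter I'll also sanity-check what ESXi sees for adapters and paths from SSH - I believe these are the standard esxcli calls for that:

# list the iSCSI adapters (software vs. hardware) ESXi has configured
esxcli iscsi adapter list

# show every storage path and its state
esxcli storage core path list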

(screenshot attached)
 

clifford64

Explorer
Joined
Aug 18, 2019
Messages
87
Looks like you're pulling in about 70MB/s of svMotion speed there. DAVG/cmd (device average latency) values of 26.77ms are pretty rough, though. What's the network configuration/pathing for the ESXi host to TrueNAS?

I am also running another of those dd tests and I am only getting 100MB/sec speeds and only 30% disk usage.
 