Less than ideal performance over iSCSI

Status
Not open for further replies.

Chris LaDuke

Dabbler
Joined
Jan 17, 2014
Messages
18
Hey all. I am a relative n00b when it comes to ZFS. I have a Supermicro 6025B-3 server with 12 GB RAM, two dual-port Intel Gb NICs, six 300 GB 15k SAS drives, and two Intel DC S3700s. I am running it as a datastore for ESXi 5.1, connecting via iSCSI using MPIO. I have the six SAS drives in three mirrors, and I added the two S3700s as an SLOG. Each of the network interfaces is on its own VLAN.
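For reference, this is roughly what that layout would look like from the command line (I actually built it through the FreeNAS GUI; da0 through da7 are placeholder device names, and I am showing the two S3700s as a striped log):

# three mirrored vdevs of 15k SAS drives plus the two S3700s as the log
zpool create tank mirror da0 da1 mirror da2 da3 mirror da4 da5 log da6 da7
zpool status tank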

I am performing my test within a VM on an ESXi host that also has two dual-port Intel Gb NICs, each port on a corresponding VLAN. I am attaching a network diagram. My test consists of an IOmeter 64 KB block-size sequential write.

My IOPS are consistently around 2,600 and my transfer rate is 168 MB/s. When I look at the individual network interfaces, each is getting about 350 Mbit/s. What I have tested so far:

Disabled MPIO and tested the individual network interfaces - I see approximately 950 Mbit/s on each interface. Each pushes about 116 MB/s and about 1,870 IOPS. No interface stood out as a poor performer.

I then created a new pool of the two SSD drives in a stripe (gstat indicated the SAS drives were only 40% utilized at the peak, so I didn't think the pool was the bottleneck, but I wanted to check it off the list). I wanted to see if maybe I was hitting a drive bottleneck. I saw slightly higher IOPS, 2,750, but the same transfer rate, 169 MB/s.

I then ran the test from a different ESXi host, thinking maybe the problem was host specific. Same results. It should be said the hosts are identical IBM x3650s with dual E5345s, so if it's a problem specific to that host platform, that doesn't really rule anything out. I also increased the vCPUs on one of the VMs to four to see whether it was a CPU issue on the ESXi side (though Windows only reported 20% CPU utilization). While the test is running, I see about 50% total CPU utilization on the FreeNAS server.

I have also attached iozone results, a snapshot of my pool, and a snapshot of gstat when it peaks.

Somewhere I am hitting this 116 MB/s bottleneck, and I am stumped on how to better isolate the problem. Or maybe, as Jack Nicholson asked, "What if this is as good as it gets?"

Any advice is appreciated

Chris
 

Attachments

  • freenasconfig.pdf
    25.3 KB · Views: 328
  • IOResults.txt
    7.1 KB · Views: 199

Got2GoLV

Dabbler
Joined
Jun 2, 2011
Messages
26
That sounds about right... 116 MB/s is what you'd expect on GigE connections.

MPIO is not link aggregation.
So you will not get more bandwidth than the equivalent of one of the MPIO links per stream.

MPIO does much better at saturating all links in systems with many VMs, where I/Os are distributed across the MPIO links (if round robin is configured).

But no single stream will saturate more than the equivalent of one MPIO link.
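For what it's worth, if round robin is not already enabled, it can be set per LUN from the ESXi shell with something along these lines (the naa identifier below is a placeholder for your iSCSI LUN):

# list iSCSI devices and their current path selection policy
esxcli storage nmp device list
# switch the LUN to round robin
esxcli storage nmp device set --device naa.xxxxxxxx --psp VMW_PSP_RR
# optionally rotate paths after every I/O instead of the default of 1000
esxcli storage nmp psp roundrobin deviceconfig set --device naa.xxxxxxxx --type iops --iops 1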
 

daimi

Dabbler
Joined
Nov 30, 2013
Messages
26
Hi Chris,
Could you perform the same test LOCALLY WITHIN FreeNAS with "sync=always" for the two setups below:
1) the 15k SAS HDDs as a zpool with the S3700s as SLOG
2) the S3700s as a zpool on their own

And post the transfer rate and IOPS for each.

I am planning to set up an all-in-one (FreeNAS + ESXi) server, which would avoid network latency, and I want to know whether setup (2) would perform a lot better than (1).

Thanks in advance
 

Chris LaDuke

Dabbler
Joined
Jan 17, 2014
Messages
18
Daimi: OK, granted I haven't had any coffee yet this morning, but as I staged the test environment I realized I had no idea how to do what you are asking. I don't see how I could test locally within FreeNAS. Since FreeNAS is installed on bare metal, I don't see how it's possible to run a VM within it. Even if I stored the VMDK files on that datastore, I don't think it would matter: the I/O is being produced by the VM sitting on my ESXi host and written to the FreeNAS datastore. I am unclear how the bits for the test could be generated by the physical processor on the FreeNAS server the way you describe, unless I were running FreeNAS within an ESXi host as a VM, which I am not. Please let me know if you see another way at this.
 

daimi

Dabbler
Joined
Nov 30, 2013
Messages
26
Hi Chris,
Thanks for your response.

Is it possible for you to run the dd and iozone commands below WITHIN FreeNAS
(dd test) time dd if=/dev/zero of=dd_output bs=1M count=30720
(IO test) iozone -s100M -r4K -I -t32 -T -i0 -i2

against a LOCAL ZFS pool (with "sync=always") created under each setup below (rough sketch follows):
(A) the 15k SAS HDDs as a zpool with the S3700 as SLOG
(B) the S3700 as a zpool only
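In case it helps, this is roughly what I have in mind; the pool names tank and ssdtest are only examples, so adjust them to your actual pools:

# (A) SAS mirrors with the S3700 SLOG, forcing synchronous writes
zfs set sync=always tank
cd /mnt/tank
time dd if=/dev/zero of=dd_output bs=1M count=30720
iozone -s100M -r4K -I -t32 -T -i0 -i2
# (B) pool built from the S3700s only
zfs set sync=always ssdtest
cd /mnt/ssdtest
time dd if=/dev/zero of=dd_output bs=1M count=30720
iozone -s100M -r4K -I -t32 -T -i0 -i2
# note: if compression is enabled on the dataset, /dev/zero will inflate the dd numbers
# revert to the default when done
zfs set sync=standard tank
zfs set sync=standard ssdtest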

Please advise whether I should spend my $600 (please don't laugh at this budget; it might seem small to you... lol) on the SLOG (A) or the zpool (B):
(A) Use my existing WD Black 500GB HDD and add a fast SLOG device such as:
- $249: Intel S3700 100GB (MLC SSD)
- $499: Intel S3700 200GB (MLC SSD)
- €129: Winkom Powerdrive SLX-8 32GB (SLC SSD)
- €189: Winkom Powerdrive SLX-8 64GB (SLC SSD)
- €289: Winkom Powerdrive SLX-8 128GB (SLC SSD)

(B) Replace the HDD with a good SSD such as:
- $604: Intel DC S3500 Series 480GB MLC SSD (endurance: 275 TB)
- $589: Seagate 600 Pro Series 400GB 2.5" MLC SSD (endurance: 1,080 TB)
- $?: Samsung SM843 (endurance: 1 PB)
- $?: Samsung SM843T (endurance: 2.4 PB)


If you are interested, I have done some tests on my side...
-- My Setup --
(A) WD Black 7.2k SATA 500GB HDD as zpool with Kingston SSDNow V300 (3GB partitioned out of 60GB) as SLOG
(B) Kingston SSDNow V300 (35GB partitioned out of 60GB) without SLOG
-- DD test--
(A) 4:56
(B) 4:29
-- IO test--
(A) see attached "HDD_with_SLOG.txt"
(B) see attached "SSD_no_SLOG.txt"
 

Attachments

  • HDD_with_SLOG.txt
    2.5 KB · Views: 197
  • SSD_no_SLOG.txt
    2.5 KB · Views: 197

Chris LaDuke

Dabbler
Joined
Jan 17, 2014
Messages
18
OK. I have attached the results. Given the significant difference between our platforms, I am not sure how useful you will find them. Why don't we back up... what are you trying to accomplish? I mean, I know you want to go fast... but why? Do you only have one SSD and one WD Black? What are you using it for? Oh, and I have read that using a partition on an SSD as a ZIL is a no-no and hurts performance. From what I have read, you should use the whole drive.
 

Attachments

  • ssdtest.txt
    5.9 KB · Views: 176

daimi

Dabbler
Joined
Nov 30, 2013
Messages
26
Thank you very much for your data.
For test (B), your run took 214 seconds and mine took 269 seconds. Based on that, I am not going to spend my money on a pure-SSD zpool.
For test (A), your run took 60 seconds and mine took 296 seconds. Together with your IO test data, I think I will focus my money on an SSD for the SLOG device.

The reason I partition an SSD for use as a SLOG device can be found in this post:
http://forums.freenas.org/threads/how-to-partition-zil-ssd-drive-to-underprovision.11824/#post-54426

I have also done a bit of reading and came up with the following sizing calculation for the SLOG device (a rough command sketch follows below):
- 300 MB/s pool transfer rate (2 x HDD, each able to write 150 MB/s)
- 1.5 GB single txg group size (300 MB/s x 5 sec, since a txg by default flushes data every 5 seconds)
- 12 GB system memory (1.5 GB / (1/8), since a txg by default uses up to 1/8 of system memory)
- 6 GB* partitioned from the SSD for the ZIL/SLOG device (the maximum size of a log device should be approximately 1/2 the size of physical memory, because that is the maximum amount of potential in-play data that can be stored)
- Leave the rest of the SSD unallocated to improve wear-leveling of the SSD.

*The maximum size can be up to 6 GB x 2 (i.e. calculated based on a maximum of two txg groups).
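Here is a rough sketch of how I would partition and attach it on FreeNAS, assuming the SSD shows up as da6 and the pool is called tank (both are placeholder names):

# create a GPT scheme and a 6 GB partition, leaving the rest of the SSD unallocated
gpart create -s gpt da6
gpart add -t freebsd-zfs -s 6G -l slog0 da6
# attach the partition to the pool as a log device and verify
zpool add tank log gpt/slog0
zpool status tank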
 

Chris LaDuke

Dabbler
Joined
Jan 17, 2014
Messages
18
Well, I think a lot has to do with how you use the pool (reads vs. writes, etc.). Nothing beats trying it both ways and measuring the results. Good luck!
 