Determining Bottleneck - IBM x3630 M3 as FreeNAS box

Hello! I am Gabriel, new to this forum, and kind of new to FreeNAS.

First of all, thanks for all your effort. I have been using FreeNAS for two months, and the experience has been very positive so far ;).

I want to ask you guys to help me understand how to determine bottlenecks when using FreeNAS with iSCSI, 10 Gb Intel and Broadcom NICs, and Cisco Nexus switches. I would like to determine my current bottleneck, but more importantly, to learn how to determine iSCSI/FreeNAS bottlenecks in general.

Here is my setup:
STORAGE
- IBM x3630 M3, with 24 2.5" HDD bays
- IBM M1015 as RAID/HBA card, configured as JBOD (I still haven't flashed it to LSI IT mode to turn it into a pure HBA)
- 96 GB RAM
RaidZ1 HDD slots 1,2,3,4: IBM SAS 300 GB drives
RaidZ1 HDD slots 5,6,7,8: IBM SAS 600 GB drives
RaidZ1 HDD slots 9,10,11,12: Samsung SSD EVO 850, 500 GB SATA drives
RaidZ1 HDD slots 13,14,15,16: Seagate 2TB Laptop HDD SATA, Model ST2000LM007
Other drives occupy the remaining bays but are still unconfigured.
- For iSCSI: Intel X520 Dual Port 10GbE SFP+
- built-in NICs for MGMT
- iSCSI working; the portal is defined with 2 different FreeNAS IPs so multipath can be configured on the VMware side

SERVER
- BladeCenter H - Chassis
- HS22 and HS23 Blades
- Blades configured with IBM Broadcom 10GB Gen 2 4-Port Ethernet Blade Card (gives connectivity to Chassis Switches)
- 2x Switch, Cisco Nexus 4001: SFP+ 10 Gb

So, in summary:
- Each Blade has 10 Gb connectivity to each of the 2 Cisco Nexus switches
- The switches are connected to the STORAGE using 2x SFP+ cables
- Each Blade is virtualized using VMware ESXi 6.0
- link speed forced to 10 Gb at the Cisco Nexus, for the ports connected to the Blades and to the FreeNAS box
- multiple VMs running on the Blades, stored on the FreeNAS
- MTU size is the TCP/IP default of 1500 (I still haven't tried jumbo frames; as far as I can tell the performance difference shouldn't be considerable, or maybe I am wrong? See the rough estimate after this list.)
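
To put a rough number on the jumbo-frame question, here is a minimal back-of-the-envelope sketch (my own assumptions: plain IPv4/TCP headers and standard Ethernet framing, ignoring iSCSI PDU headers and TCP options) of how much of a 10 GbE link is left for payload at MTU 1500 vs 9000:

```
# Usable payload fraction of a 10 GbE link at MTU 1500 vs 9000.
# Assumes plain IPv4 + TCP headers and standard Ethernet framing; iSCSI PDU
# headers and TCP options are ignored, so real numbers will be slightly lower.

LINK_GBPS = 10.0

def payload_fraction(mtu, eth_overhead=38, ip_tcp_headers=40):
    # eth_overhead: preamble (8) + Ethernet header/FCS (18) + inter-frame gap (12)
    # ip_tcp_headers: IPv4 (20) + TCP (20), no options
    payload = mtu - ip_tcp_headers
    wire = mtu + eth_overhead
    return payload / wire

for mtu in (1500, 9000):
    frac = payload_fraction(mtu)
    print(f"MTU {mtu}: ~{frac * 100:.1f}% payload -> ~{LINK_GBPS * frac:.2f} Gb/s usable")
```

That gives roughly 9.5 Gb/s usable at MTU 1500 versus roughly 9.9 Gb/s at 9000, so for raw bandwidth the gain is only a few percent; the bigger benefit of jumbo frames is usually fewer packets (and interrupts) to process per second.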

The situation:
- I would like to learn how to diagnose bottlenecks in this kind of configuration

I have Windows Server 2012 R2 running on a zvol over the RaidZ1 of Samsung SSDs. The LUN has round-robin multipath configured at the ESXi level.
Using Crystal Disk Mark 5.2, I get the following results:

-----------------------------------------------------------------------
CrystalDiskMark 5.2.0 x64 (C) 2007-2016 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

Sequential Read (Q= 32,T= 1) : 601.955 MB/s
Sequential Write (Q= 32,T= 1) : 816.517 MB/s
Random Read 4KiB (Q= 32,T= 1) : 166.639 MB/s [ 40683.3 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 199.028 MB/s [ 48590.8 IOPS]
Sequential Read (T= 1) : 398.928 MB/s
Sequential Write (T= 1) : 650.615 MB/s
Random Read 4KiB (Q= 1,T= 1) : 10.388 MB/s [ 2536.1 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 18.991 MB/s [ 4636.5 IOPS]

Test : 8192 MiB [C: 53.1% (26.4/49.7 GiB)] (x5) [Interval=5 sec]
Date : 2016/11/14 15:36:05
OS : Windows Server 2012 R2 Server Standard (full installation) [6.3 Build 9600] (x64)
-----------------------------------------------------------------------

The results look good so far. So here begins the investigation of bottlenecks: while running the benchmark, I can verify that:
- inside Windows, HDD active time is 100% (so, at Windows level, the OS is using all it can)
- at FreeNAS, CPU usage stays low (<25%): is the iSCSI functionality CPU-dependent? Does it support multi-threading?
- at FreeNAS, the disk throughput for each SSD stays below 150 MB/s (SATA II should get these SSDs to around 250 MB/s each, at least); a similar benchmark over a zvol on 4 SAS drives in RaidZ1 reaches similar levels
- at FreeNAS, the NICs (the ix interfaces) get to 2.5 Gb/s each (they are 10 Gb NICs, so they should at least be able to achieve more than that)
- at FreeNAS, the SCSI target port gets to ~500-600 MB/s (see the sanity check after this list)
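
One thing that helps me line these numbers up is converting them all to the same unit (MB/s, decimal, like CrystalDiskMark). This is a quick sanity-check sketch using the figures observed above; the "3 data disks" number is my assumption for a 4-disk RaidZ1 (roughly 3 disks of data plus 1 of parity per stripe):

```
# Convert the observations above into one common unit and compare.

def gbit_to_mbyte(gbit_per_s):
    return gbit_per_s * 1000 / 8  # Gbit/s -> MB/s (decimal)

nic_observed_gbps = 2.5   # per 10 GbE interface, from the FreeNAS reporting graphs
nics_in_use = 2           # two portal IPs, round-robin multipath from ESXi
ssd_observed_mbps = 150   # per SSD, from the FreeNAS disk graphs
data_disks = 3            # 4-disk RaidZ1, assumed ~3 disks of data + 1 of parity

network_total = gbit_to_mbyte(nic_observed_gbps) * nics_in_use
disk_total = ssd_observed_mbps * data_disks

print(f"Network path total: ~{network_total:.0f} MB/s")  # ~625 MB/s
print(f"Disk path total:    ~{disk_total:.0f} MB/s")     # ~450 MB/s
print("SCSI target port:   ~500-600 MB/s (observed)")
```

All three totals land in the same 450-650 MB/s band, so they look like one data stream measured at three points rather than three independent limits; the open question is which of the three hits its own hard ceiling first.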

I don't know what the bottleneck is! There is always a bottleneck, and I am trying to understand how to recognize it, because as far as I can tell:
- the storage has enough CPU (dual Xeons)
- 10 Gb NICs are not being saturated
- I doubt the Cisco Nexus switches are being saturated (though I should still validate that)
- FreeNAS has plenty of RAM
- I think the SAS/SATA bus is not being saturated
- I think PCIe 2.0 (for the Intel NIC inside the x3630 used for FreeNAS) is not being saturated (rough ceilings sketched after this list)
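
For reference, these are the nominal ceilings I am comparing against (a rough sketch only; the x8 PCIe slot widths are my assumption about this particular x3630, and none of the figures account for protocol overhead):

```
# Nominal ceilings (before protocol overhead) for the components listed above,
# to compare against the observed ~500-600 MB/s.

ceilings_mb_s = {
    "Single 10 GbE port":           10_000 / 8,      # ~1250 MB/s
    "Two 10 GbE ports (multipath)": 2 * 10_000 / 8,  # ~2500 MB/s
    "SATA II link, per drive":      300,             # 3 Gb/s after 8b/10b encoding
    "4x SATA II drives aggregate":  4 * 300,
    "PCIe 2.0 x8 (assumed, X520)":  8 * 500,         # ~500 MB/s per lane
    "PCIe 2.0 x8 (assumed, M1015)": 8 * 500,
}

for name, mb_s in ceilings_mb_s.items():
    print(f"{name:30s} ~{mb_s:>6.0f} MB/s")
```

None of these nominal ceilings is close to the observed 500-600 MB/s, which matches my feeling that the individual hardware links are not the limit.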

Any ideas? How should I proceed? Many thanks!!

PS: I am pretty happy with the performance; I just want to learn how to diagnose bottlenecks.
 