LSI 9200-8e, DS4246, SSD Speeds and ses: transfer speeds

xnaron

Explorer
Joined
Dec 11, 2014
Messages
98
I'm running some tests on different configurations and am puzzled by some of the results I am getting.

My setup:
IBM x3650 M3, 2x Xeon X5650 CPUs, 128GB RAM
LSI 9200-8e
2 x NetApp DS4246 with IOM6 (6Gbps)
8 x 4TB Seagate IronWolf (SATA) in RAIDZ2
1 x Samsung 850 EVO 120GB (SATA) in a single-disk stripe
FreeNAS 11.1-U6

test6.out = 42GB file generated with dd if=/dev/random; the file is on the 8x4TB RAIDZ2 pool

/mnt/ssd = SSD stripe (single Samsung 850 EVO 120GB)

dd if=./test6.out of=/mnt/ssd/test1.dd bs=1024K

If I monitor the above command with zpool iostat 1, I see that the SSD pool tops out around 152MB/s for the transfer. The specs for the SSD show sustained writes at ~500MB/s. I have tried this test on two different FreeNAS servers with the same results.
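For reference, this is roughly how the copy and the monitoring ran side by side (assuming the SSD pool is literally named "ssd"; adjust for your pool name):

Code:
# shell 1: copy the 42GB test file from the RAIDZ2 pool to the SSD pool
dd if=./test6.out of=/mnt/ssd/test1.dd bs=1024K

# shell 2: watch write throughput on the SSD pool, refreshed every second
zpool iostat ssd 1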

This is my first time working with SAS enclosures, specifically the DS4246. They have IOM6 modules (6Gbps) in them and the 9200-8e is 6Gbps.
Looking at the output below, the disks show up as 600MB/s transfers but the enclosures show up as 300MB/s transfers. I'm unclear why there is a discrepancy.

I ran a test doing a dd directly to the raw SSD and got 381MB/s, which is greater than 300MB/s, so I must be at 6Gbps per channel on the SAS.

root@hood:/mnt/vol1test/share # dd if=/dev/zero of=/dev/da11 bs=1024K count=820
820+0 records in
820+0 records out
859832320 bytes transferred in 2.255991 secs (381132791 bytes/sec)


Code:
da4: <ATA ST4000VN008-2DR1 SC60> Fixed Direct Access SPC-4 SCSI device
da4: Serial Number ZXXXXXX
da4: 600.000MB/s transfers
da4: Command Queueing enabled
da4: 3815447MB (7814037168 512 byte sectors)
da7: <ATA ST4000VN008-2DR1 SC60> Fixed Direct Access SPC-4 SCSI device
da7: Serial Number ZXXXXXX
da7: 600.000MB/s transfers
da7: Command Queueing enabled
da7: 3815447MB (7814037168 512 byte sectors)
ses0 at mps0 bus 0 scbus2 target 12 lun 0
ses0: <NETAPP DS424IOM6 0162> Fixed Enclosure Services SPC-3 SCSI device
ses0: Serial Number SHJ
ses0: 300.000MB/s transfers
ses0: Command Queueing enabled
ses0: SCSI-3 ENC Device
ses1 at mps0 bus 0 scbus2 target 19 lun 0
ses1: <NETAPP DS424IOM6 0162> Fixed Enclosure Services SPC-3 SCSI device
ses1: Serial Number SHU
ses1: 300.000MB/s transfers
ses1: Command Queueing enabled
ses1: SCSI-3 ENC Device
ses2 at mps0 bus 0 scbus2 target 21 lun 0
ses2: <NETAPP DS424IOM6 0173> Fixed Enclosure Services SPC-3 SCSI device
ses2: Serial Number SHU
ses2: 300.000MB/s transfers
ses2: Command Queueing enabled
ses2: SCSI-3 ENC Device
ses3 at mps0 bus 0 scbus2 target 22 lun 0
ses3: <NETAPP DS424IOM6 0162> Fixed Enclosure Services SPC-3 SCSI device
ses3: Serial Number SHJ
ses3: 300.000MB/s transfers
ses3: Command Queueing enabled
ses3: SCSI-3 ENC Device

 
Joined
May 10, 2017
Messages
838
The 120GB 850 EVO can only write at ~500MB/s while it's using the SLC buffer (aka TurboWrite), which on that model is only 3GB; once that is exhausted, sequential write speed drops to around 150MB/s.
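Rough math, assuming the buffer doesn't get a chance to flush mid-copy, lines up with the ~152MB/s you saw (and your raw-device dd only wrote ~820MB, which fits inside the 3GB buffer, hence the 381MB/s there):

Code:
 3GB @ 500MB/s ≈   6s
39GB @ 150MB/s ≈ 260s
42GB / 266s    ≈ 158MB/s average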
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Looking at the output below, the disks show up as 600MB/s transfers but the enclosures show up as 300MB/s transfers. I'm unclear why there is a discrepancy.
The enclosure may need a firmware update, but that might not be possible since they are NetApp. Many SAS enclosures that are rated for 600MB/s with SAS disks drop to 300MB/s with SATA disks. It is the price you pay for using older hardware like that, but it doesn't matter for spinning disks, which are the intended purpose of those enclosures, because even the fastest spinning media is slower than that.
https://mysupport.netapp.com/documentation/docweb/index.html?productID=61469
I didn't see any indication in the documentation that they were intended for use with SSDs.
1 x Samsung 850 EVO 120GB (SATA) in a single-disk stripe
What is the intended purpose of this solitary disk? Are you installing it in one of the NetApp disk shelves?
2 x NetApp DS4246 with IOM6 (6Gbps)
Why so many enclosures for so few disks?
 

xnaron

Explorer
Joined
Dec 11, 2014
Messages
98
The 120GB 850 EVO can only write at ~500MB/s while it's using the SLC buffer (aka TurboWrite), which on that model is only 3GB; once that is exhausted, sequential write speed drops to around 150MB/s.

Thanks. I did a read test with dd and was able to sustain ~460MB/s over a 42GB read.
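The read test was something like this (exact source path assumed; I read the 42GB copy back and discarded the output):

Code:
# read the 42GB test file back off the SSD pool and throw it away
dd if=/mnt/ssd/test1.dd of=/dev/null bs=1024K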
 

xnaron

Explorer
Joined
Dec 11, 2014
Messages
98
Hi Chris,

I found out that the ses device is only for talking to the IOM backplane and is used for state monitoring and the like, so it isn't in the data path for the disks.
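(For anyone curious, the stock sesutil tool will talk to those ses devices; a minimal example, nothing DS4246-specific:)

Code:
# map enclosure slots to da devices
sesutil map
# overall enclosure health
sesutil status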

I've only got a few devices in the enclosure right now for testing. Once I have things working and have confirmed the expected performance, I will back up my other FreeNAS machine and move all the disks over. The single SSD was installed to see if I could read/write at > 3Gbps. I confirmed that by reading/writing directly to the device with dd.

I did another test where I read concurrently from each of the 8 x 4TB drives and the single SSD, using pv to monitor. I was able to sustain ~190MB/s from each of the 8 x 4TB drives and ~270MB/s from the SSD, roughly 1790MB/s aggregate. This puts me above the 1200MB/s a 3Gb SAS wide port could carry and into 2400MB/s 6Gb SAS territory.

dd if=/dev/da5 bs=1024K count=120000 | pv | dd of=/dev/null bs=1024K
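The concurrent version was effectively several of those running in parallel, each through its own pv so I could see per-device rates (device names below are assumed):

Code:
# da3-da10 assumed to be the eight 4TB spinners, da11 the SSD
for d in da3 da4 da5 da6 da7 da8 da9 da10 da11; do
    dd if=/dev/$d bs=1024K count=120000 2>/dev/null | pv -N $d > /dev/null &
done
wait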


Thanks
 

mjt5282

Contributor
Joined
Mar 19, 2013
Messages
139
I also bought a used NetApp DS4246 on eBay a couple of weeks ago and put a couple of SATA 8x8TB RAIDZ2 pools in it. I have a flashed M1015 (IT mode, P20) SAS HBA in my main chassis. I notice there are some SAS negotiation errors on boot-up. I have both IOM6 adapters engaged. I watched a video on YouTube by Morten Hjorth, who has the same or a similar chassis with one IOM6 module unplugged; I think unless you have SAS multipath drives you only need one IOM module engaged.

The throughput I observed so far with scrubbing was good, although something locked up once; I believe my HBA card overheated during the scrub and I needed to power cycle the DS4246 to recover. I turned the fans up on my Supermicro chassis, though this may have been unnecessary.

It would be helpful if we could pool our knowledge on this device.
 

xnaron

Explorer
Joined
Dec 11, 2014
Messages
98
It would be helpful if we could pool our knowledge on this device.

Sure. Also, there are some good posts in reddit.com/r/homelab about the DS4243/6. The M1015 has internal connectors, right? Are you connecting to the IOM6 with an SFF-8087 to QSFP cable? The 9200-8e (P20) has external SFF-8088 connectors. I picked up two expensive :( SFF-8088 to QSFP cables for it. Previously I was experimenting with the NetApp PM8003 card. The driver for it was a bit buggy; it had problems reading SMART when the disks were under even a small load.

These enclosures are made by Xyratex and are compatible with the controllers used in the Dell HB-1235. I bought some cheap ones on eBay ("COMPELLENT 0952913-07 6GBPS SAS CONTROLLER HB-SBB2-E601-COMP") and will try those as well. The benefit is that they work with cheaper SFF-8088 cables. However, I want to be sure the fans operate correctly; I read a post where someone said they wouldn't kick down to low RPM.

Can you post a snippet of what your negotiation errors looked like? I will check my logs for anything similar. I haven't had any lockups with mine. I also have both IOM6s inserted but only one cabled. I have also heard reports of issues with two IOM6s engaged. Yes, you only need one IOM6 module; the other is for redundancy. Also, with SATA drives the second module can't see the drives unless you have SAS-to-SATA interposers.

I have a 9207-8e coming to replace the 9200-8e but it won't be here for a few weeks.
 

mjt5282

Contributor
Joined
Mar 19, 2013
Messages
139
I have been purchasing these for all my chassis: CableDeconn Dual Mini SAS SFF-8088 to SAS36P SFF-8087 Adapter in PCI Bracket.
They convert the external SFF-8088 to internal SFF-8087. The QSFP+ to SFF-8088 cable was expensive on Amazon, ~80 USD; there was a cheaper source on eBay, but it ships slowly from Shenzhen, China. I tried to boot last night with one of the IOM6 modules unplugged and none of the disks were found. I'm also wondering what the Cat5/6 ports on the back of the DS4246 are for.
As a homelabber aside, the rack is situated in my garage and the fan noise is acceptable there (it only really blows loudly when power cycled). It seems to work pretty well with the standard IT-flashed M1015 HBA, so I am pretty happy. Sometimes on reboot it doesn't seem to negotiate full speed with all the caddies, but I wrote a little script (roughly the sketch below) that shows the current SATA version and throughput, and all 16 disks in the chassis are at 6.0 Gb/s, most SATA 3.3 and a couple SATA 3.2.
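Something along these lines; it loops over the da devices and pulls the SATA version / link speed line out of smartctl, which ships with FreeNAS (depending on the HBA you may need to add -d sat):

Code:
#!/bin/sh
# print the negotiated SATA version and link speed for every da* disk
for disk in $(sysctl -n kern.disks); do
    case "$disk" in
    da*)
        ver=$(smartctl -i /dev/"$disk" | grep 'SATA Version')
        printf '%s: %s\n' "$disk" "${ver#*: }"
        ;;
    esac
done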
 

xnaron

Explorer
Joined
Dec 11, 2014
Messages
98
I have seen it happen too, where occasionally on reboot the SATA drives negotiate at 300MB/s rather than 600MB/s. I can boot and see the disks with one or two controllers attached. The Ethernet ports on the back of the controllers are for connecting to the NetApp appliance; I don't know if the controller can be accessed through a standalone web GUI, but I doubt it.

Are you running 11.1 or 11.2?
 

mjt5282

Contributor
Joined
Mar 19, 2013
Messages
139
I'll probably never buy a NetApp appliance :) ... I have been running FreeNAS 11.2-RC1 for several weeks but yesterday upgraded to the new 11.2-RC2. I power cycled the disk shelf after I installed new mirrored SATA DOMs, and every disk was identified with no SAS negotiation errors. I have, however, had single-disk dropouts from both of my 8x8TB RAIDZ2 arrays in the last couple of days. I also bought some NetApp disk caddies that seem newer for one of my pools (they say 600GB on them instead of 450GB). I am not using interposers (none of my caddies came with them).

Do you run an interconnect cable between the IOMs, or do you run two QSFP+ to SFF-8088 cables to your HBA?
 

xnaron

Explorer
Joined
Dec 11, 2014
Messages
98
I have never had a disk dropout (knocking on wood). I am connected with two cables, one from each DS4246 to the 9200-8e. This way I get 24 disks over each cable vs. 48 over one with an interconnect.
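Rough numbers behind that choice: each QSFP/SFF-8088 cable is a 4-lane 6Gb wide port, so roughly 4 x 600MB/s = 2400MB/s of usable bandwidth per cable. Split across 24 spinners that is about 100MB/s each, versus roughly half that if all 48 disks shared a single cable through the interconnect.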

The 600GB/450GB markings are just inserts that go into the caddies.

The two HB-1235 controllers are almost here (Canada Post is on strike) and I will test with them when they arrive.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I notice there are some SAS negotiation errors on boot-up. I have both IOM6 adapters engaged. I watched a video on YouTube by Morten Hjorth, who has the same or a similar chassis with one IOM6 module unplugged; I think unless you have SAS multipath drives you only need one IOM module engaged.
The multipath configuration can be done many ways. We have a system at work with two head units, one on standby and the other active; if the active one fails, the standby can pick up where the active unit left off. These two systems are each connected to the same set of disk shelves, one head to each IOM.
If you only have one server and one SAS controller, there is no advantage to using both IOM devices: it does not make the connection faster, and it can cause some negotiation errors, as you have seen.
PS. Morten makes some great videos.
 

xnaron

Explorer
Joined
Dec 11, 2014
Messages
98
I upgraded to 11.2-RC2 from RC1 on my test instance (within the last two days). Today I woke up to find the FreeNAS web GUI and SSH unresponsive. I went to the console and noticed it repeating errors like:
swap_pager: indefinite wait buffer: bufobj: 0, blkno:2085785, size: 4096
The blkno and size were different on each line.

I couldn't get it to respond on the console; I even tried alternate consoles and it wouldn't let me type the username to log in. What is weird is that I had a VM running via iSCSI off this FreeNAS test instance and it was still operating. I was able to shut down the OS.

I am still debugging. I am not sure what happened, and hopefully it was just coincidental with the 11.2-RC2 upgrade.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Definitely file a bug report. That's the second time in a day I have seen a similar report.

Sent from my SAMSUNG-SGH-I537 using Tapatalk
 

xnaron

Explorer
Joined
Dec 11, 2014
Messages
98
Something weird is going on for sure. I enabled Netdata and got this email. I have 128GB of RAM in this server.

30min ram swapped out = 63.1% of RAM
Alarm: the amount of memory swapped in the last 30 minutes, as a percentage of the system RAM
Family: swap
Severity: CRITICAL
Time: Sun Nov 18 13:50:46 MST 2018
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
I upgraded to 11.2-RC2 (within the last two days). Today I woke up to find the FreeNAS web GUI and SSH unresponsive. I went to the console and noticed it repeating errors like:
swap_pager: indefinite wait buffer: bufobj: 0, blkno:2085785, size: 4096
The blkno and size were different on each line.

I couldn't get it to respond on the console; I even tried alternate consoles and it wouldn't let me type the username to log in. What is weird is that I had a VM running via iSCSI off this FreeNAS test instance and it was still operating. I was able to shut down the OS.

I am still debugging. I am not sure what happened, and hopefully it was just coincidental with the 11.2-RC2 upgrade.
I have seen this issue a few times on 11.2-RC1 as well, and my system was brought to a crawl.
However, as I was doing some work on it at the time, mostly manual (script-based) replication and various iocage jail installs, I had several SSH sessions open with my commands running inside "screen" so that I don't lose the connection when I shut down my regular PC.

As it happened, I had lost web GUI access and keyboard entry on the server itself, but my "screen" sessions over SSH were still active.
I didn't want to go through the hassle of rebooting the system, so instead I restarted the iocage jails via SSH.
I think the error came from FreeNAS's syslog-ng complaining it was not able to connect to the server.
After restarting the jails and disabling syslog-ng, my system finally stabilized and is no longer having those issues.

I suspect these are iocage-jail-related issues.
Under those circumstances, my swap would increase in size to the point that Netdata was not able to record anything during the event. I can't say if swap was maxed out, but when Netdata became responsive again the swap usage went back to about 50%.
 

Apollo

Wizard
Joined
Jun 13, 2013
Messages
1,458
Something weird going on for sure. I enabled netdata and got this email. I have 128GB ram in this server.

30min ram swapped out = 63.1% of RAM
Alarm: the amount of memory swapped in the last 30 minutes, as a percentage of the system RAM
Family: swap
Severity: CRITICAL
Time: Sun Nov 18 13:50:46 MST 2018
Nothing to really be concerned about; this is just Netdata flagging it as a possible issue (more of an informational message, I would say).
It is just letting you know that FreeNAS had to get rid of cached data stored in RAM to make room for a different set of data which is not yet in RAM and is being accessed.
You will get lots of those warnings, especially if you run replication, where the amount of data updated can be significant.
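If you want to check it by hand rather than rely on the Netdata alarm, something like this over SSH shows actual swap usage and the biggest memory consumers (plain FreeBSD tools, nothing FreeNAS-specific):

Code:
# swap devices and current usage, human readable
swapinfo -h
# one-shot top sorted by resident memory, first 20 lines
top -b -o res | head -n 20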
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
[attached screenshot: upload_2018-11-18_1-58-33.png]
 

xnaron

Explorer
Joined
Dec 11, 2014
Messages
98
When I said I upgraded to RC2, I meant I upgraded my test instance from RC1 to RC2. Yes, I wouldn't dream of moving my production FreeNAS to 11.2. I have nothing of consequence on this test instance.
 