Supermicro X10SRH-CLN4F Server Performance Tests - Weird Results?

Status
Not open for further replies.

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Just a note: I don't know *anyone* that has come to the forum using a 3008 and said it worked fine and performed as expected. So clearly this isn't something you should even think is going to happen. If you search around for 3008, I'm sure you'll see what I mean. ;)

Mine works fine. Mine is a little different than some of the others who had trouble with a 3008 paired with a SAS2 backplane. :smile: And mine is the exact same setup as the OP except the drive type.
 

leonroy

Explorer
Joined
Jun 15, 2012
Messages
77
@HeloJunkie - couldn't help noticing your server is called plexnas. I just have to know, you planning on using it exclusively for Plex? :D
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
@leonroy

Yep, that is its sole use for me. We have about 100TB deployed on FreeNAS currently for work, and I was running a little QNAP 670-Pro at home for my Plex server, but at 12TB I am out of space and need more room, so I decided to use FreeNAS for it as well.

:cool:
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
OK, tried again this morning with a couple of 128GB 6Gb/s SSDs:

Code:
=== START OF INFORMATION SECTION ===
Model Family:     Marvell based SanDisk SSDs
Device Model:     SanDisk SD6SB1M128G1022I
Serial Number:    143874401733
LU WWN Device Id: 5 001b44 c8b2a21c5
Firmware Version: X231600
User Capacity:    128,035,676,160 bytes [128 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      Unknown (0x000a)
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Apr 14 06:24:20 2015 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


Code:
[root@plexnas] ~# iostat -C -w 2 -d -t da /dev/da0
             da0              da1              da2             cpu
  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
127.96 466 58.19  55.75   0  0.02  11.44  12  0.13   1  0  0  0 99
128.00 3322 415.21   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3315 414.36   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3318 414.73   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3319 414.92   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 3322 415.23   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3328 415.98   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3320 415.04   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3329 416.10   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3330 416.29   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3329 416.17   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3324 415.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100

Started the second DD:

128.00 3320 414.98   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 3327 415.85   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 3324 415.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3323 415.35   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 3316 414.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3316 414.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 3292 411.55   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3327 415.85   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 3329 416.10   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3329 416.17   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 3333 416.60   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
128.00 3327 415.85   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99


This time there was almost no change at all in the performance of the drive once I started the second dd command.

Here is the graph. The first long read was a single dd command; the second was two dd commands running at the same time (dd if=/dev/da0 of=/dev/null bs=8M).
The slight gap in the first one was me stopping the process by accident.
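For anyone who wants to repeat the one-reader versus two-reader comparison without typing dd twice, here is a minimal Python sketch of the same idea. It is not what was run in this thread: the function names are made up for illustration, and the demo reads a small temporary file rather than /dev/da0, so the numbers it prints mostly reflect the OS cache. Pointing `run_readers` at a raw device (as root) would be the closer equivalent of the dd test.

```python
import os
import tempfile
import threading
import time


def read_sequential(path, block_size, totals, index):
    """Read the target from offset 0 to the end in block_size chunks,
    roughly what `dd if=... of=/dev/null bs=...` does."""
    count = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            count += len(chunk)
    totals[index] = count


def run_readers(path, n_readers, block_size=8 * 1024 * 1024):
    """Start n_readers concurrent sequential reads of the same target.
    Returns (total bytes read by all readers, elapsed seconds)."""
    totals = [0] * n_readers
    threads = [
        threading.Thread(target=read_sequential, args=(path, block_size, totals, i))
        for i in range(n_readers)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return sum(totals), elapsed


if __name__ == "__main__":
    # Stand-in target (an assumption for the demo): a 16 MiB temp file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(16 * 1024 * 1024))
        target = tmp.name
    try:
        for n in (1, 2):
            total, secs = run_readers(target, n)
            print(f"{n} reader(s): {total / secs / 1e6:.1f} MB/s aggregate")
    finally:
        os.unlink(target)
```

Unlike the dd test, both readers here start at the same moment, so on a real disk you may want to start the second one a few seconds late to put the two streams at different offsets.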

da0_2xSSD.png



So now I am beginning to wonder. The SSDs are 6 Gb/s, the LSI3008 is a 6 Gb/s HBA, the SAS expander is 6 Gb/s and my HGST drives are SATA3 6 Gb/s, but obviously something is not right. So I decided to pull all my drives, go with just two drives plugged into the SAS expander (like the SSDs), and try again. Maybe it has something to do with the number of drives plugged into the expander (I know, I know, I'm reaching here... but worth a test):

Here is the smartctl output for the drives being tested:

Code:
=== START OF INFORMATION SECTION ===
Model Family:     HGST Deskstar NAS
Device Model:     HGST HDN724040ALE640
Serial Number:    PK2338P4H9LXSC
LU WWN Device Id: 5 000cca 249d275e5
Firmware Version: MJAOA5E0
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Apr 14 07:01:07 2015 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


The only interesting thing I note is the sector size (512 logical but 4096 physical). Since I am seeing the problem while addressing the drives directly, I don't think forcing 4K sector sizes (which I think is the default in FreeNAS now when you create a new pool) should be an issue.

and...FAIL....

Code:
[root@plexnas] ~# iostat -C -w 2 -d -t da /dev/da0
             da0              da1              da2             cpu
  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
128.00 1313 164.15   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 1325 165.60   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 1278 159.73   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 1304 162.98   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 1296 162.04   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 1303 162.92   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 1316 164.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 1323 165.42   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 988 123.50   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100

Second DD started:

128.00 436 54.47   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 351 43.85   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 313 39.17   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 321 40.17   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 435 54.35   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 358 44.73   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 321 40.17   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 316 39.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
128.00 251 31.42   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100



da0_1xHGST.png



So I grabbed a Seagate SATA3 6 Gb/s drive we had lying around, still wrapped, and tried again:

Code:
=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST250DM000-1BD141
Serial Number:    Z3TS6YMX
LU WWN Device Id: 5 000c50 06508eab2
Firmware Version: KC48
User Capacity:    250,059,350,016 bytes [250 GB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Apr 14 07:45:41 2015 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


Code:
[root@plexnas] ~# iostat -C -w 2 -d -t da /dev/da0
             da0              da1             cpu
  KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
127.70  35  4.33  11.56  17  0.19   1  0  0  0 99
128.00 1092 136.55   0.00   0  0.00   0  0  0  0 100
128.00 1091 136.37   0.00   0  0.00   0  0  0  0 100
128.00 1096 137.06   0.00   0  0.00   0  0  0  0 100
128.00 1092 136.56   0.00   0  0.00   0  0  0  0 100
128.00 1096 137.06   0.00   0  0.00   0  0  0  0 100
128.00 1091 136.43   0.00   0  0.00   0  0  0  0 100
128.00 1097 137.12   0.00   0  0.00   0  0  0  0 100

Second DD Command started:

128.00 242 30.30   0.00   0  0.00   0  0  0  0 100
128.00 209 26.17   0.00   0  0.00   0  0  0  0 100
128.00 229 28.67   0.00   0  0.00   0  0  0  0 100
128.00 225 28.17   0.00   0  0.00   0  0  0  0 100
128.00 188 23.55   0.00   0  0.00   0  0  0  0 100
128.00 223 27.92   0.00   0  0.00   0  0  0  0 100


And here is the graph:

da0_1xSeagate6gbs.png


So I now know it is not specific to the HGST drives, although it appears (as of right now) to be specific to spinning media.
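To put numbers on the comparison, here is a small Python sketch that takes a few of the iostat rows posted above (three rows before and three after the second dd, for each drive) and computes the ratio of the mean MB/s. The helper names are hypothetical and only the first device's KB/t, tps and MB/s columns are used. The posted figures are also consistent with MB/s being roughly KB/t x tps / 1024, which makes a handy sanity check on the parsing.

```python
from statistics import mean


def parse_first_device(text):
    """Return (KB/t, tps, MB/s) for the first device column of each iostat row."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        rows.append((float(fields[0]), int(fields[1]), float(fields[2])))
    return rows


def mean_mbps(text):
    """Mean MB/s of the first device over the given rows."""
    return mean(mbps for _kbt, _tps, mbps in parse_first_device(text))


def slowdown(single, dual):
    """Throughput with two dd readers as a fraction of the single-reader rate."""
    return mean_mbps(dual) / mean_mbps(single)


# Rows copied from the iostat output in this post: (one dd, two dd).
SAMPLES = {
    "SanDisk SSD": (
        """128.00 3322 415.21   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
           128.00 3315 414.36   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
           128.00 3318 414.73   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99""",
        """128.00 3320 414.98   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
           128.00 3327 415.85   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
           128.00 3324 415.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99""",
    ),
    "HGST Deskstar NAS": (
        """128.00 1313 164.15   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
           128.00 1325 165.60   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
           128.00 1278 159.73   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100""",
        """128.00 436 54.47   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
           128.00 351 43.85   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
           128.00 313 39.17   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100""",
    ),
    "Seagate Barracuda": (
        """128.00 1092 136.55   0.00   0  0.00   0  0  0  0 100
           128.00 1091 136.37   0.00   0  0.00   0  0  0  0 100
           128.00 1096 137.06   0.00   0  0.00   0  0  0  0 100""",
        """128.00 242 30.30   0.00   0  0.00   0  0  0  0 100
           128.00 209 26.17   0.00   0  0.00   0  0  0  0 100
           128.00 229 28.67   0.00   0  0.00   0  0  0  0 100""",
    ),
}

if __name__ == "__main__":
    for drive, (single, dual) in SAMPLES.items():
        print(f"{drive}: {mean_mbps(single):.1f} -> {mean_mbps(dual):.1f} MB/s "
              f"(ratio {slowdown(single, dual):.2f})")
```

Only three rows per phase are used here to keep the sketch short; pasting in the full runs would give steadier averages.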

Tomorrow I will try a couple of 15K RPM SAS drives that I have left over from another project and see what happens with those drives.
 

leonroy

Explorer
Joined
Jun 15, 2012
Messages
77
Nope - only when I hit the same drive more than once....

I thought that if you hit spinning media with two or more concurrent operations, you reduce effective performance, since the heads are seeking back and forth to service both operations?
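The seek hypothesis can be sketched as a toy model. This is only an illustration of the idea, not a diagnosis: it assumes the two streams sit at different positions on the platter, that the head services one chunk for a stream and then pays one full seek to reach the other, and that nothing (drive cache, read-ahead, command queueing) hides that cost. The function name, the 10 ms seek time and the chunk sizes are all assumptions picked for illustration.

```python
def interleaved_throughput(seq_mb_s, seek_ms, chunk_kib):
    """Aggregate MB/s when the head must seek after every chunk.

    Toy model: time per chunk = transfer time at the sequential rate
    plus one seek. Real drives are far more complicated than this.
    """
    chunk_mb = chunk_kib / 1024.0
    transfer_s = chunk_mb / seq_mb_s
    seek_s = seek_ms / 1000.0
    return chunk_mb / (transfer_s + seek_s)


if __name__ == "__main__":
    SEQ_MB_S = 160.0  # roughly the single-dd rate of the HGST drive in this thread
    SEEK_MS = 10.0    # ASSUMED seek plus rotational delay, not a measured value
    for chunk_kib in (128, 1024, 8192):
        rate = interleaved_throughput(SEQ_MB_S, SEEK_MS, chunk_kib)
        print(f"{chunk_kib:>5} KiB between seeks: {rate:6.1f} MB/s")
```

In this model the drop depends heavily on how much data is read between seeks, and a device with near-zero seek time (an SSD) shows almost no drop at all.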
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
I was thinking the same thing, but this does not happen on my other FreeNAS servers.

This is from another Supermicro server running WD Reds and the M1015:

Code:
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red (AF)
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4EFFXUNFU
LU WWN Device Id: 5 0014ee 260582bfe
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Apr 14 08:14:28 2015 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


Code:
[root@scruffy] ~# iostat -C -w 2 -d -t da /dev/da0
             da0              da1              da2             cpu
  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
 115.61  14  1.63  17.77   2  0.03  17.93   2  0.03   0  0  0  0 100
 128.00 1224 153.03   0.00   0  0.00   0.00   0  0.00   0  0  1  0 99
 124.46 1171 142.38   8.55  40  0.33   8.66  39  0.33   0  0  0  0 99
 128.00 1220 152.49   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
 128.00 1201 150.16   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1155 144.36   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1230 153.78   0.00   0  0.00   0.00   0  0.00   0  0  1  0 99
 128.00 1228 153.49   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1241 155.17   0.00   0  0.00   0.00   0  0.00   0  0  1  0 99
 128.00 1222 152.72   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 865 108.08   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
 
Start of second DD:

128.00 146 18.29   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1133 141.62   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1236 154.48   0.00   0  0.00   0.00   0  0.00   0  0  1  0 99
 127.07 1160 143.99   5.86  14  0.08   6.76  14  0.10   0  0  1  0 99
 126.42 1119 138.14   5.78  13  0.08   5.17  12  0.06   0  0  1  0 99
 128.00 1202 150.30   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1221 152.66   0.00   0  0.00   0.00   0  0.00   0  0  1  0 99
 128.00 1219 152.35   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
 128.00 1195 149.43   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1205 150.67   0.00   0  0.00   0.00   0  0.00   0  0  1  0 99


And the graph:
scruffy_da0.png
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
I bought the system in Dec. I don't have the Board Rev (it's running and I don't have enough slack in my cabling to pull it out while it's running to check). :)

From IPMI:
Firmware Revision : 01.51
Firmware Build Time : 06/28/2014
BIOS Version : 1.0
BIOS Build Time : 07/02/2014


Well, I am running a tad bit newer BIOS, 1.0a from 09/19/2014...
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
OK, I think I have hit a wall here with the hardware that I have on hand. I did a few network performance tests and posted them here, but here are the next steps I have planned to troubleshoot the hardware:

1) Install M1015 HBA. Received the cables today necessary to connect it to the LSI SAS3 backplane (SFF-8087 to SFF-8643). Since I can install without worrying about the zpool, I will do this first and then rerun my network tests against the box while running the 1015 as well as my dd tests against the zpool.

2) Run dd tests against individual drives to see if I can repeat the problems I am seeing above with multiple dds to the same drive.

3) If #2 is a repeat of above, then I will install a couple of 500GB 15k SAS drives I have and try #2 again. Based on the fact that the SSDs did not show the same problem as the spinning media, it will be interesting to see the results of tomorrow's tests!
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
Well....The M1015 card is acting exactly like the LSI3008 onboard controller as it relates to multiple dd operations to the same dev at the same time. The only time this is not the case is with the SSD drives that I tried.

2xddM1015.png


Code:
[root@plexnas] ~# iostat -C -w 2 -d -t da /dev/da0
             da0              da1              da2             cpu
  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
 124.25 405 49.12  90.19  39  3.43  93.44  52  4.77   0  0  1  1 98
 128.00 1259 157.34   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1265 158.17   0.00   0  0.00   0.00   0  0.00   0  0  0  0 99
 128.00 1287 160.92   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1302 162.73   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1259 157.36   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1252 156.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1254 156.73   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1268 158.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100

Second dd started:

 128.00 420 52.54   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 424 53.04   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 430 53.79   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 429 53.66   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 429 53.60   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 437 54.66   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 431 53.85   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 421 52.60   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 428 53.47   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100



I do not know what to make of this information. If it were the controller, I would not expect the M1015 to show the same results; but if it were the backplane, I would have assumed we would see the same results for all drives, SSDs included, not just spinning media.

Does anyone have any idea how to troubleshoot further...?
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Try a spinning SAS (vs SATA).
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
I will try that next and report back.

I wonder if anyone has this server with SATA?
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
OK, so here are the SAS drive results while connected to the M1015 HBA:

Code:
[root@plexnas] ~# smartctl -a /dev/da11
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p13 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               FUJITSU
Product:              MBA3300RC
Revision:             D305
User Capacity:        300,000,000,000 bytes [300 GB]
Logical block size:   512 bytes
Rotation Rate:        15000 rpm
Logical Unit id:      0x500000e01a254b10
Serial number:        BJ03P82012N4
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Thu Apr 16 08:06:08 2015 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK



And here are the results:

Code:
[root@plexnas] ~# iostat -w 2 -d  da11
            da11
  KB/t tps  MB/s
 128.00 918 114.69
 128.00 917 114.63
 128.00 915 114.32
 128.00 920 115.00
 128.00 913 114.13
 128.00 921 115.07
 128.00 916 114.44
 128.00 918 114.76
 128.00 916 114.44
 128.00 920 115.01
 128.00 834 104.26
 
Start of second dd:

128.00 503 62.91
 128.00 495 61.84 
 128.00 495 61.84 
 128.00 519 64.91 
 128.00 491 61.41 
 128.00 498 62.28 
 128.00 495 61.84 
 128.00 492 61.47


And the graph:

sas_drive_2xdd.png




And just to make sure, I removed the M1015, hooked the LSI3008 back up, and saw the exact same thing:


Code:
[root@plexnas] ~# iostat -C -w 2 -d -t da /dev/da0
             da0              da1              da2             cpu
  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
 128.00 665 83.07  44.37   0  0.01  44.83   0  0.01   0  0  0  0 100
 128.00 881 110.10   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 877 109.63   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 883 110.32   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 875 109.32   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 882 110.26   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 877 109.63   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 879 109.82   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100

Start of second dd:

 128.00 634 79.27   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 464 58.03   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 480 59.97   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 471 58.85   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 471 58.85   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 470 58.72   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 466 58.22   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 468 58.47   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 465 58.10   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100


And the graph:

sas_drive_2xdd_3008.png






So, it seems like all drives are affected in the same manner except the SSD drives that I tested.

Since I am using multiple HBAs (3008 & 1015) and multiple drives, it would seem to point (now) to the SAS expander as the obvious issue, but why did I not see this behaviour on the SSDs?
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
So I figured out a few more tests that I could do to try and narrow down the problem. The SAS expander has four ports on the back of it, and the controllers were hooked up to the bottom two ports the whole time, so I used new cables and connected the onboard LSI3008 to the upper two ports instead. This rules out the cables (which I had already done by using new cables on the M1015 HBA) and also rules out a bad port on the expander.

Of course, I did not see any change, nor did I expect any change:

Code:
[root@plexnas] ~# iostat -C -w 2 -d -t da /dev/da0
             da0              da1              da2             cpu
  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
 128.00 1293 161.67   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1303 162.92   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1311 163.86   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1328 165.98   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1276 159.48   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1305 163.17   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1293 161.61   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1304 162.98   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 1312 163.98   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 821 102.57   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100

Start of second dd:

 128.00 598 74.71   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 598 74.71   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 599 74.84   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 451 56.35   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 448 55.97   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 443 55.35   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 555 69.40   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 606 75.71   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100
 128.00 602 75.21   0.00   0  0.00   0.00   0  0.00   0  0  0  0 100


My next test will be to use a totally separate machine with the M1015 card installed, run the cables from the original SAS expander to this new machine (thereby bypassing the entire Supermicro server), install a fresh copy of FreeNAS on the new system and test again. This will completely isolate the SAS expander from the rest of the server/MB/CPU/RAM and let me see if the problem still exists. If it does, then I will get a new SAS expander sent over!

Stay Tuned.
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
Next test completed -

Took the M1015 and installed it into another stand-alone box. Disconnected the SAS expander backplane from the Supermicro server and connected it to the M1015 in the new box. Installed the latest version of FreeNAS on the new server and booted it. Ran the same dd tests against a couple of the drives still attached to the SAS expander, but driven from the M1015 and the new server.

Same exact problem. As soon as a second dd is started, tps crashes to 1/3 to 1/4 of normal. So with the backplane 100% isolated from the Supermicro motherboard (new MB was an Asus) and the onboard LSI3008 controller, everything points to the expander/backplane.

The only question that still remains is why did I not see this behaviour with SSDs, only spinning media?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Can you try without the expander (drives directly connected to the controller)? Even if everything seems to point to the expander, you can't be sure unless you test without it and everything works OK :)
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
You probably did this, but I didn't see where you connected drives directly to the onboard SATA ports, or the SAS ports directly. Is there a baseline somewhere to suggest that this isn't normal when you add the second operation to the queue? This seems to be a latency issue that the SSDs are fast enough to minimize. The physical drives are waiting on an ack or sync or something that is crushing them, or switching contexts between the processes somehow. Hard to say what it is exactly... but so far it is hard to differentiate between what might be normal due to pool config or your specific settings, and what might be an anomaly.

<I see I am not alone in this idea. Hitting enter anyway..>
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
Yes I did that already - as soon as the expander is out of the picture, the problem goes away.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Ok, then you can be sure it IS the expander :)

Now I can't say what happens exactly that leads to this problem but at least you know what to do to solve the problem.
 

HeloJunkie

Patron
Joined
Oct 15, 2014
Messages
300
Yes, as soon as the expander is out of the picture, the problem does go away. I have not been able to test from the onboard LSI3008 controller directly to the hard drives, as I do not have the correct cable. However, @depasseg has the identical motherboard, controller and SAS expander and does not see the problem at all. I have never seen this problem on any other build that I have (about 100TB on FreeNAS now), including several Supermicro boards. I use @jgreco's scripts for HD testing, and that is when I first noticed the problem. I pulled my baselines from my last Supermicro server and the problem did not exist. I have tested the exact M1015 controller I had in that box, attached to that expander in another box (no other common parts), and still see the problem. As soon as I pull one of the drives from the expander and plug it into the M1015, the problem goes away.
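The elimination logic across all these tests can be written down as a small Python sketch. The run list is a simplified reading of the results posted in this thread, and the component labels and function names are made up for illustration. The motherboard is left out of the direct-attach run because the post does not say which box it was in.

```python
# Each run: (set of components in the data path, observed result).
# Simplified from the tests described in this thread.
RUNS = [
    ({"Supermicro board", "LSI3008", "expander", "spinning drive"}, "slow"),
    ({"Supermicro board", "M1015", "expander", "spinning drive"}, "slow"),
    ({"Asus board", "M1015", "expander", "spinning drive"}, "slow"),
    ({"Supermicro board", "LSI3008", "expander", "SSD"}, "ok"),
    ({"M1015", "spinning drive"}, "ok"),  # direct attach, board not stated
]


def common_to_failures(runs):
    """Components present in every run that showed the slowdown."""
    slow = [parts for parts, result in runs if result == "slow"]
    return set.intersection(*slow)


def explains_all(suspects, runs):
    """True if every slow run contains all suspects and every ok run
    is missing at least one of them."""
    for parts, result in runs:
        has_all = suspects <= parts
        if (result == "slow") != has_all:
            return False
    return True


if __name__ == "__main__":
    suspects = common_to_failures(RUNS)
    print("common to all slow runs:", sorted(suspects))
    print("combination explains every run:", explains_all(suspects, RUNS))
    print("expander alone explains every run:", explains_all({"expander"}, RUNS))
```

On this reading the expander alone does not account for every result, because the SSD run went through the same expander without slowing down; only the combination of expander and spinning drive fits all five runs.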
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
At this point, I'd contact SuperMicro. I don't think I applied any updates that would have affected the expander. But I guess that could be a difference. Mine is at 4.1 (0401)

Out of curiosity, what is the output of dmesg | grep ses0?

Code:
[root@freenas1] ~# dmesg |grep ses0
ses0 at mpr0 bus 0 scbus0 target 20 lun 0
ses0: <LSI SAS3x28 0401> Fixed Enclosure Services SCSI-5 device
ses0: Serial Number        
ses0: 1200.000MB/s transfers
ses0: Command Queueing enabled
ses0: SCSI-3 ENC Device
ses0: da0: Element descriptor: 'Slot00'
ses0: da0: SAS Device Slot Element: 1 Phys at Slot 0
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f012b10d2
ses0: da1: Element descriptor: 'Slot01'
ses0: da1: SAS Device Slot Element: 1 Phys at Slot 1
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f01e1273e
ses0: da2: Element descriptor: 'Slot02'
ses0: da2: SAS Device Slot Element: 1 Phys at Slot 2
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f01d2248e
ses0: da3: Element descriptor: 'Slot03'
ses0: da3: SAS Device Slot Element: 1 Phys at Slot 3
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f012b109e
ses0: da4: Element descriptor: 'Slot04'
ses0: da4: SAS Device Slot Element: 1 Phys at Slot 4
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f01d20026
ses0: da5: Element descriptor: 'Slot05'
ses0: da5: SAS Device Slot Element: 1 Phys at Slot 5
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f012b101a
ses0: da6: Element descriptor: 'Slot06'
ses0: da6: SAS Device Slot Element: 1 Phys at Slot 6
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f01d21c12
ses0: da7: Element descriptor: 'Slot07'
ses0: da7: SAS Device Slot Element: 1 Phys at Slot 7
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f01e12766
ses0: da8: Element descriptor: 'Slot08'
ses0: da8: SAS Device Slot Element: 1 Phys at Slot 8
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f0123baea
ses0: da9: Element descriptor: 'Slot09'
ses0: da9: SAS Device Slot Element: 1 Phys at Slot 9
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f01237196
ses0: (none): Element descriptor: 'Slot10'
ses0: (none): SAS Device Slot Element: 1 Phys at Slot 10
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f012b111a
ses0: (none): Element descriptor: 'Slot11'
ses0: (none): SAS Device Slot Element: 1 Phys at Slot 11
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 500304800157e43f addr 50000c0f01e12206
[root@freenas1] ~# 
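To compare expander firmware between two boxes without eyeballing the whole dump, something like the following Python sketch could pull out the identity line and the slot-to-device mapping. The regular expressions are written against the output format shown above and are an assumption about how other systems print it; the function names are hypothetical.

```python
import re

# Matches e.g.: ses0: <LSI SAS3x28 0401> Fixed Enclosure Services SCSI-5 device
SES_IDENT = re.compile(r"^ses\d+: <(?P<vendor>\S+) (?P<product>.+) (?P<rev>\S+)>")
# Matches e.g.: ses0: da0: Element descriptor: 'Slot00'
SES_SLOT = re.compile(r"^ses\d+: (?P<dev>\S+): Element descriptor: '(?P<slot>Slot\d+)'")


def expander_identity(dmesg_text):
    """Return (vendor, product, revision) from the first ses identity line, or None."""
    for line in dmesg_text.splitlines():
        m = SES_IDENT.match(line)
        if m:
            return m.group("vendor"), m.group("product"), m.group("rev")
    return None


def slot_map(dmesg_text):
    """Map slot name to device name; slots reported as (none) map to None."""
    slots = {}
    for line in dmesg_text.splitlines():
        m = SES_SLOT.match(line)
        if m:
            dev = m.group("dev")
            slots[m.group("slot")] = None if dev == "(none)" else dev
    return slots


# A few lines copied from the dmesg output above.
SAMPLE = """\
ses0 at mpr0 bus 0 scbus0 target 20 lun 0
ses0: <LSI SAS3x28 0401> Fixed Enclosure Services SCSI-5 device
ses0: da0: Element descriptor: 'Slot00'
ses0: da0: SAS Device Slot Element: 1 Phys at Slot 0
ses0:  phy 0: SAS device type 1 id 0
ses0: (none): Element descriptor: 'Slot10'
"""

if __name__ == "__main__":
    print(expander_identity(SAMPLE))
    print(slot_map(SAMPLE))
```

Feeding it the saved output of dmesg | grep ses0 from each server would show at a glance whether the revision field (0401 here) differs.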
 