Slow transfer speeds

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
Running RAIDZ2 via a Fujitsu D2607-A21 8i controller flashed to IT mode (HBA).

I'm currently transferring approximately 8TB over a 10Gb connection using SMB. I started off at around 170MB/s (I don't have any cache SSDs, just HDDs) and after about 10 minutes this has dropped to around 25-35MB/s. Is there anything that could be causing this? I'm not sure if cache is enabled on the HBA controller, or whether you can even use cache now that it's in IT mode; I can't seem to access any firmware on it during boot. If I pause the transfer and start it up a few minutes later, I get good transfer speeds again until it drops off shortly after.

I have just run an iperf test and am getting 2.31 Gbits/sec, using two Emulex OneConnect OCe 1102-N-X cards. No drivers have been installed for the network cards on either machine; one is on Windows 10 and one on TrueNAS.

No other machine is transferring anything to the NAS.

Pool is set to lz4 compression, RAIDZ2, currently with 4× 8TB drives (more to be added). Properties (checked as shown below):
checksum on
aclmode passthrough
sync standard
atime off
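These came from something like the following in the shell ("tank" is just a stand-in for my pool name):

zfs get compression,checksum,aclmode,sync,atime tank    # the dataset properties listed above
zpool status tank                                       # vdev layout (RAIDZ2 with the four drives)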

Any help would be appreciated.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Please post full details of your source and destination system, especially the model of drives being used.

Edit: I found your other thread about the Dell T320 system being used here, but it's still missing some of the critical information such as drive model.


The symptoms you're describing - "speed drops, but recovers if the transfer is paused" - usually result from either the pool devices being too slow (if they're SMR and need to rewrite shingled zones, this could be the case) or thermal throttling from an overheating component (your network card, or the HBA itself).

Testing the link itself using iperf will also ensure that you're able to get the full network speed, as 170MB/s seems quite slow for a 10Gbps network; it should at least begin the transfer at a much higher rate before being write throttled.
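If it helps, something along these lines should exercise the raw link (iperf3 assumed to be available on both ends; the IP address is a placeholder for your NAS):

# On the TrueNAS machine - start a listener
iperf3 -s
# On the Windows 10 client - run several parallel streams for 30 seconds
iperf3.exe -c 192.168.1.100 -P 4 -t 30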
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
Source is Windows 10, quad-core i5, using the same drives as the destination NAS.

They are SMR, not CMR, but copying from one drive to the other on a local machine I do not get the issue.

I don't understand why the network speed is slow either. I can't find any drivers for the Emulex OneConnect OCe 1102-N-X cards; however, the iperf result of 2.3 Gbit/s is enough to do a decent copy. If I run iperf while the transfer is happening I still get 1.3 Gbit/s, so I guess it can't be the card throttling?

The drives are Seagate 8TB ST8000DM004.
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
Destination PC is a Dell T320 with a Fujitsu D2607-A21 8i controller flashed to IT mode.

I do have a couple of other RAID controllers but never figured out how to flash them to IT mode.
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
I have done some more testing.

I plugged an SSD into the second onboard SATA slot of the Dell T320 and set up a pool the same way I did the first pool.

From a separate Windows machine with a 1-gigabit connection going to the onboard LAN of the Dell (confirmed via iperf), I transferred 20GB to the SSD pool and got the same transfer speeds of around 30MB/s.

This eliminates the HBA, the network card, and the spinning disks.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I don't really see a complete list of specifications as per forum rules.
How much RAM do you have? Do you suffer these issues only during writes?
What NIC are you using?
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
I don't really see a complete list of specifications as per forum rules.
How much RAM do you have? Do you suffer these issues only during writes?
What NIC are you using?
  • Motherboard make and model - Dell PowerEdge T320, iDRAC7 Basic
  • CPU make and model - Intel(R) Xeon(R) CPU E5-2407 v2 @ 2.40GHz
  • RAM quantity - 80GB ECC
  • Hard drives - 4× Seagate 8TB ST8000DM004 in RAIDZ2, 500GB SSD for boot
  • Hard disk controllers - Fujitsu D2607-A21 8i flashed to IT mode (unknown firmware or settings)
  • Network cards - Emulex OneConnect OCe 1102-N-X (I assumed plug-and-play drivers on TrueNAS); iperf shows 2.5Gbit/sec
  • Network cards - the built-in 1Gbit port. I have tried both

Read speeds back to the same model of 8TB drive in a Windows 10 machine are the same at around 35MB/s. This doesn't even start high and drop.
Same speeds going from the NAS to an SSD on a Windows 10 machine.

I have also tried a higher-spec Windows 10 machine and that gets the same read/write speeds to the NAS.

An SSD on the onboard SATA slot of the Dell T320, in a pool of its own, also has these speed issues.

I have tried the same files across the 1-gigabit network that the Dell is connected to, but copying from Windows to Windows; that maxes out the bandwidth of the network and consistently copies at 112MB/s. I don't have an additional 10-gigabit card to show those results.

When I copy to the NAS, I get good write speeds of about 120MB/s for up to around 100GB before it drops to the 25-35MB/s range.

Thank you
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
2.3Gbps works out to just under 290MB/s (2.3 Gbit/s ÷ 8 bits per byte), which is above your "starting" transfer speed, so while the network cards certainly aren't capable of a full 10Gbps, they should be able to transfer faster than your disks.

Hard drives - 4× Seagate 8TB ST8000DM004 in RAIDZ2, 500GB SSD for boot

Those SMR drives will very likely be a problem later on. Were they previously used in any other system, or were they newly installed into TrueNAS?

Given that your boot device is a 500G SSD, I would also relocate your system dataset to the boot-pool, in order to keep the minor logging/updates to that pool from causing more reshingles on the SMR drives - especially if the SMR drives have had previous use in another system.
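If you want to confirm where the system dataset currently sits before moving it, a quick check from the shell would be something like this (the dataset itself is named .system):

zfs list -o name,used | grep -i "\.system"    # shows which pool currently holds the .system dataset

The move itself is done in the web UI - on CORE it's normally under System > System Dataset.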

Hard disk controllers - Fujitsu D2607-A21 8i flashed to IT mode (unknown firmware or settings)

Please post the output of sas2flash -list inside of CODE tags; this will tell us whether it's in proper IT mode and what firmware it's running.

Your test with the SSD is interesting as well, since as you mentioned it is over 1Gbps and does not involve the HBA, but you're still seeing throttling down to 30MB/s rates. It's possible that the onboard SATA controller isn't using the drive cache either. Are you receiving any alerts in the iDRAC interface of the Dell machine about hardware health?
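One way to check the write cache state from the TrueNAS shell might be something like this (ada0 is a placeholder for whichever device the onboard SSD shows up as):

camcontrol identify ada0 | grep -i "write cache"    # check the Support / Enabled columns for the write cache feature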
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
Thanks for the reply.

Those drives were originally from external storage, where I started storing all the CCTV data; it's only recently that I've decided to try to do things properly. I was planning on slowly adding CMR Western Digital Red Plus drives to the array.

I have ordered another 10-gigabit card and another SFP+ cable to try to fault-find the network speeds; however, that's hopefully not a concern for now.

I haven't done anything with iDRAC; I will look at this now. Other than switching the Dell to UEFI, I've just installed TrueNAS and not touched it.

Where am I running sas2flash -list? In the TrueNAS shell, or is that to do with iDRAC as well?

The front LCD on the Dell is blue with an ST number, though; apparently blue means normal operation, no errors to report.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Those drives were originally from external storage, where I started storing all the CCTV data; it's only recently that I've decided to try to do things properly. I was planning on slowly adding CMR Western Digital Red Plus drives to the array.

The goal should be to eventually replace all of the SMR drives with CMR. As they were used before, and then reformatted, the drives themselves may still need to "reshingle" their zones.

Where am I running sas2flash -list? In the TrueNAS shell, or is that to do with iDRAC as well?

This would be in the TrueNAS shell - I recommend you set up SSH access to the system and connect that way, as it's easier to copy and paste. The line for "Firmware Product ID" should tell you if you are in IT mode though.
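For example, from another machine on the network (the IP is a placeholder, and the SSH service needs to be enabled first):

ssh root@192.168.1.100    # log in to the TrueNAS shell
sas2flash -list           # prints controller/firmware details, including the Firmware Product ID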
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
The goal should be to eventually replace all of the SMR drives with CMR. As they were used before, and then reformatted, the drives themselves may still need to "reshingle" their zones.



This would be in the TrueNAS shell - I recommend you set up SSH access to the system and connect that way, as it's easier to copy and paste. The line for "Firmware Product ID" should tell you if you are in IT mode though.
OK thanks. I'm home shortly so will check.

Can the array run some SMR and some CMR drives without causing issues?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
OK thanks. I'm home shortly so will check.

Can the array run some SMR and some CMR drives without causing issues?

The SMR drives will cause slow write performance in the vdevs - especially in a RAIDZ2 where there is a larger number of devices per vdev that have to "agree" or "work together" in a sense.

Some users have been able to get SMR to provide "adequate" performance in a WORM (write once, read many) scenario for archive workloads, but care needs to be taken to avoid overwhelming the "cache" area on the SMR HDDs (often only a few dozen GB) - if they are to be used, they should be in their own vdev, preferably mirrored.
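Purely to illustrate the layout I mean - device names are hypothetical, and in practice you'd build this through the TrueNAS UI rather than at the shell - a pair of SMR disks kept to their own mirrored vdev would look something like:

zpool create archive mirror da4 da5    # two SMR drives as a single mirrored vdev in their own pool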
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
Can the array run some SMR and some CMR drives without causing issues?
Let me rephrase that question to point out the issue: Can I run a mix of CMR and SMR drives without problems showing up?

Yes, that is possible. On the other hand, there are two issues with the approach:
  1. No problems showing up does not mean that nothing is brewing under the surface. It often takes not one thing going wrong but several, so the absence of an alert does not mean that everything is OK.
  2. That is of course also the case for a CMR-only setup, but using SMR drives (exclusively or mixed with CMR ones) means pushing your luck more than necessary.
Using TrueNAS typically means that you truly value your data. In that case, the strong recommendation is to not use SMR drives at all. Heavier load will increase your chance of seeing problems.
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
The SMR drives will cause slow write performance in the vdevs - especially in a RAIDZ2 where there is a larger number of devices per vdev that have to "agree" or "work together" in a sense.

Some users have been able to get SMR to provide "adequate" performance in a WORM (write once, read many) scenario for archive workloads, but care needs to be taken to avoid overwhelming the "cache" area on the SMR HDDs (often only a few dozen GB) - if they are to be used, they should be in their own vdev, preferably mirrored.
OK thanks. A few dozen GB for cache? The Seagate 8TB ST8000DM004 is showing 256MB of cache, or am I missing something? Is this where my write speed issues lie? Write speeds are fine for anything up to about 100GB.

Here is the info on the card. I see it says UEFI BSD Version; does it matter that I installed TrueNAS in UEFI mode?

Adapter Selected is a LSI SAS: SAS2008(B2)

Controller Number : 0
Controller : SAS2008(B2)
PCI Address : 00:08:00:00
SAS Address : 5003005-7-02db-f648
NVDATA Version (Default) : 14.01.00.08
NVDATA Version (Persistent) : 14.01.00.08
Firmware Product ID : 0x2213 (IT)
Firmware Version : 20.00.07.00
NVDATA Vendor : LSI
NVDATA Product ID : SAS9211-8i
BIOS Version : 07.39.02.00
UEFI BSD Version : N/A
FCODE Version

Thanks, ChrisRJ, for the response too.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
OK thanks. A few dozen GB for cache? The Seagate 8TB ST8000DM004 is showing 256MB of cache, or am I missing something? Is this where my write speed issues lie? Write speeds are fine for anything up to about 100GB.

There's 256MB of RAM on the disks, but the platters themselves are divided into the overlapping SMR (Shingled) portions and the CMR (conventional, non-overlapping) area - the latter is usually a few dozen GB in size, and designed to let the drive take in a batch of writes quickly, and then copy it to the higher-density SMR zones. The cache area is also used when the SMR zones need to be rewritten.

If you open gstat -dp on an SSH session during the performance degradation, do you see a high value for the ms/w (write times)?
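Something like this, left running in a second SSH session while the copy is going (the interval flag just slows the refresh to once a second):

gstat -dp -I 1s    # -d adds delete/TRIM stats, -p hides non-physical providers; watch ms/w and %busy per disk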

Firmware Product ID : 0x2213 (IT)
Your HBA is in IT mode, which is correct. Does the SSD still have poor performance when connected to the HBA? (If possible - you might need a 2.5" to 3.5" bay adapter for this.)
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
I tried iDRAC and got an SSL error; Google says to update the firmware. I tried that in the Lifecycle Controller, but it didn't seem to like the path I gave it on the USB. I then found a test-hardware button in the Lifecycle Controller, so that's handy.

The only error I got was this:
Msg: PCIE - training error PCI tag-0A00 VendorID-1000 DeviceID-0072 SVid-1734 SDid-1177 Bus 0A: Link Degraded, maxWidth = x8, negotiated Width = x4, slot 3.

Slot 3 is 16-lane and has the HBA in it. It had previously been in Slot 5, which is 4-lane PCIe.


On to your gstat -dp request:
SSD start: 1.5-2.0 ms/w and write speeds of around 160MB/s
SSD about 60GB in: peaked at around 8.4 ms/w, and speeds dropped to around 105MB/s and didn't recover. The ms/w did bounce around a lot, though; it didn't stay high.

%b was always around 99-101

I shall test the main array in the morning
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Msg: PCIE - training error PCI tag-0A00 VendorID-1000 DeviceID-0072 SVid-1734 SDid-1177 Bus 0A: Link Degraded, maxWidth = x8, negotiated Width = x4, slot 3.

Slot 3 is 16-lane and has the HBA in it. It had previously been in Slot 5, which is 4-lane PCIe.

The HBA should be negotiating x8 lanes. I'd consider removing the HBA and checking both the far edge of the card and slot for corrosion/debris, and reseating the HBA (the x16 "Slot 3" is the correct one to use, as the other slots are x4 or lower electrically) - your 10GbE NIC should be in slot 4 or 6, as the lowest two slots run off the PCH.
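After reseating you can also confirm the negotiated width from inside TrueNAS - something along these lines should do it (the SAS2008 normally attaches as mps0, so that device name is an assumption):

pciconf -lcv | grep -A 10 mps0    # look for the "link xN(xM)" entry: negotiated lanes vs. the card's maximum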

It's odd that speeds seem to be limited to the 160-170MB/s mark regardless of the pool media. Have you tested the pool speed locally using fio?

SSH in, change directory to your pool, and run:

fio --name=testrun --ioengine=posixaio --bs=1M --numjobs=1 --size=2G --iodepth=1 --runtime=30 --time_based --rw=write --eta-newline=10s

I'd expect this run to blaze along at an unrealistically fast speed (at or over 1GB/s) because it's going to fit entirely inside of your async/txg group (--size=2G), and therefore prevent write throttling from ever kicking in - but if this still caps out at the same 170MB/s then there's something else funny happening.
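If that first run does come back looking unrealistically quick, a larger pass that can't hide in RAM - something like the below, with the size picked arbitrarily - should show the pool's steady-state write speed instead:

fio --name=steadystate --ioengine=posixaio --bs=1M --numjobs=1 --size=32G --iodepth=1 --rw=write --end_fsync=1 --eta-newline=10s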
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
The HBA should be negotiating x8 lanes. I'd consider removing the HBA and checking both the far edge of the card and slot for corrosion/debris, and reseating the HBA (the x16 "Slot 3" is the correct one to use, as the other slots are x4 or lower electrically) - your 10GbE NIC should be in slot 4 or 6, as the lowest two slots run off the PCH.

It's odd that speeds seem to be limited to the 160-170MB/s mark regardless of the pool media. Have you tested the pool speed locally using fio?

SSH in, change directory to your pool, and run:

fio --name=testrun --ioengine=posixaio --bs=1M --numjobs=1 --size=2G --iodepth=1 --runtime=30 --time_based --rw=write --eta-newline=10s

I'd expect this run to blaze along at an unrealistically fast speed (at or over 1GB/s) because it's going to fit entirely inside of your async/txg group (--size=2G), and therefore prevent write throttling from ever kicking in - but if this still caps out at the same 170MB/s then there's something else funny happening.
Reseated the HBA, no signs of damage.


Not sure if this is quite right; I wasn't sure how to change directory to the pool. I currently only had one pool, which was the SSD. I got this info by typing in your line above:

root@truenas[~]# /mnt/SSD
zsh: permission denied: /mnt/SSD
root@truenas[~]# /SSD
zsh: no such file or directory: /SSD

root@truenas[~]# fio --name=testrun --ioengine=posixaio --bs=1M --numjobs=1 --size=2G --iodepth=1 --runtime=30 --time_based --rw=write --eta-newline=10s
testrun: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
testrun: Laying out IO file (1 file / 2048MiB)
Jobs: 1 (f=1): [W(1)][38.7%][w=2268MiB/s][w=2267 IOPS][eta 00m:19s]
Jobs: 1 (f=1): [W(1)][73.3%][w=1581MiB/s][w=1581 IOPS][eta 00m:08s]
Jobs: 1 (f=1): [W(1)][100.0%][w=2182MiB/s][w=2182 IOPS][eta 00m:00s]
testrun: (groupid=0, jobs=1): err= 0: pid=11741: Tue Oct 4 12:41:15 2022
write: IOPS=1423, BW=1424MiB/s (1493MB/s)(41.7GiB/30004msec); 0 zone resets
slat (usec): min=10, max=4130, avg=52.87, stdev=95.16
clat (usec): min=2, max=20173, avg=647.70, stdev=1215.89
lat (usec): min=175, max=20208, avg=700.58, stdev=1214.32
clat percentiles (usec):
| 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 169], 20.00th=[ 172],
| 30.00th=[ 172], 40.00th=[ 174], 50.00th=[ 174], 60.00th=[ 176],
| 70.00th=[ 180], 80.00th=[ 196], 90.00th=[ 3523], 95.00th=[ 3654],
| 99.00th=[ 4752], 99.50th=[ 4752], 99.90th=[ 4948], 99.95th=[ 8029],
| 99.99th=[13435]
bw ( MiB/s): min= 249, max= 5044, per=100.00%, avg=1440.37, stdev=1873.76,samples=59
iops : min= 249, max= 5044, avg=1440.02, stdev=1873.83, samples=59
lat (usec) : 4=7.67%, 10=0.01%, 100=0.01%, 250=75.00%, 500=1.29%
lat (usec) : 750=0.01%, 1000=0.20%
lat (msec) : 2=3.32%, 4=10.11%, 10=2.36%, 20=0.04%, 50=0.01%
cpu : usr=3.57%, sys=0.68%, ctx=44689, majf=1, minf=1
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,42718,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
WRITE: bw=1424MiB/s (1493MB/s), 1424MiB/s-1424MiB/s (1493MB/s-1493MB/s), io=41.7GiB (44.8GB), run=30004-30004msec
root@truenas[~]#


I then set the RAIDZ2 pool back up with the 4× 8TBs and did the above:


root@truenas[~]# fio --name=testrun --ioengine=posixaio --bs=1M --numjobs=1 --size=2G --iodepth=1 --runtime=30 --time_based --rw=write --eta-newline=10s
testrun: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
Jobs: 1 (f=1): [W(1)][38.7%][w=2346MiB/s][w=2345 IOPS][eta 00m:19s]
Jobs: 1 (f=1): [W(1)][71.0%][w=4637MiB/s][w=4636 IOPS][eta 00m:09s]
Jobs: 1 (f=1): [W(1)][100.0%][w=5142MiB/s][w=5142 IOPS][eta 00m:00s]
testrun: (groupid=0, jobs=1): err= 0: pid=2004: Tue Oct 4 12:57:43 2022
write: IOPS=1432, BW=1433MiB/s (1502MB/s)(42.0GiB/30001msec); 0 zone resets
slat (usec): min=10, max=5818, avg=49.21, stdev=93.82
clat (nsec): min=1995, max=868003k, avg=646808.67, stdev=4770305.55
lat (usec): min=171, max=868021, avg=696.02, stdev=4770.07
clat percentiles (usec):
| 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 165], 20.00th=[ 167],
| 30.00th=[ 169], 40.00th=[ 169], 50.00th=[ 172], 60.00th=[ 172],
| 70.00th=[ 174], 80.00th=[ 178], 90.00th=[ 2704], 95.00th=[ 3621],
| 99.00th=[ 4752], 99.50th=[ 4752], 99.90th=[ 6915], 99.95th=[ 9765],
| 99.99th=[24773]
bw ( MiB/s): min= 63, max= 5167, per=98.73%, avg=1414.62, stdev=1898.29, samples=58
iops : min= 63, max= 5167, avg=1414.10, stdev=1898.25, samples=58
lat (usec) : 2=0.01%, 4=6.66%, 10=0.02%, 20=0.01%, 50=0.03%
lat (usec) : 100=0.14%, 250=75.79%, 500=1.98%, 750=0.01%, 1000=0.20%
lat (msec) : 2=3.23%, 4=9.51%, 10=2.39%, 20=0.03%, 50=0.01%
lat (msec) : 500=0.01%, 1000=0.01%
cpu : usr=3.39%, sys=0.94%, ctx=45622, majf=1, minf=1
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,42985,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
WRITE: bw=1433MiB/s (1502MB/s), 1433MiB/s-1433MiB/s (1502MB/s-1502MB/s), io=42.0GiB (45.1GB),

But again I didn't point it at a pool, so no idea if that's right.

On to your gstat -dp request for the RAIDZ2 pool:
At start:
up to 30 ms/w

Once write speeds drop to around 25-35MB/s:
up to 269 ms/w (not all drives at once; one drive recovers back down to 3-4 ms/w while another spikes). %b goes up to 110% on drives, but again drops and recovers.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Not sure if this is quite right; I wasn't sure how to change directory to the pool. I currently only had one pool, which was the SSD. I got this info by typing in your line above:

cd /mnt/SSD with the cd standing for Change Directory

You've been benchmarking your boot SSD, although your ZFS subsystem is able to pipe 1.5GB/s of throughput, at least briefly, so the throttle lies elsewhere.
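In other words, the intended sequence is along these lines (swap in whichever pool you want to test):

cd /mnt/SSD    # change into the pool's mountpoint so fio writes to that pool
fio --name=testrun --ioengine=posixaio --bs=1M --numjobs=1 --size=2G --iodepth=1 --runtime=30 --time_based --rw=write --eta-newline=10s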

On to your gstat -dp request for the RAIDZ2 pool:
At start:
up to 30 ms/w

Once write speeds drop to around 25-35MB/s:
up to 269 ms/w (not all drives at once; one drive recovers back down to 3-4 ms/w while another spikes). %b goes up to 110% on drives, but again drops and recovers.

I'd suspect the drives are still in the process of reshingling themselves. I don't believe the Seagate SMR drives support TRIM functionality to "clean the slate" so to speak.

Is it possible to connect the drives to the SATA ports directly?
 

Anthony-1

Dabbler
Joined
Sep 30, 2022
Messages
17
cd /mnt/SSD with the cd standing for Change Directory

You've been benchmarking your boot SSD, although your ZFS subsystem is able to pipe 1.5GB/s of throughput, at least briefly, so the throttle lies elsewhere.



I'd suspect the drives are still in the process of reshingling themselves. I don't believe the Seagate SMR drives support TRIM functionality to "clean the slate" so to speak.

Is it possible to connect the drives to the SATA ports directly?
Ah OK,

I will run it again when I'm home with the correct pool.

I can take one drive out and plug it into the single spare onboard SATA port and create a new pool. I saw an option for TRIM; I'll see if it lets me do it. Are you thinking that speeds will improve once they have finished reshingling themselves?

I'll post both test results for the single HDD on the onboard SATA later on. Unless a basic PCIe-to-4-port SATA card is any good? I have a spare one of those.
 