Throughput not scaling up with added VDEV

FalconX

Cadet
Joined
Jun 15, 2022
Messages
3
I originally set the server up with one RAIDZ2 vdev of 6 drives; throughput over SMB was about 180MB/s. I was happy with that, considering it's about a single drive's performance. After much reading and testing around increasing performance, I understood the logical route was that additional vdevs would increase throughput, since a pool's data is striped across vdevs. So, I added another vdev.

The shocker to me was that there was no discernible increase in throughput. I'm seeing 180 – 200MB/s now, and I don't believe it ever went above 180MB/s previously. However, I expected something a little closer to 360ish (understanding it may not be exactly double due to overhead). My use case is transferring very large files (20GB – 80GB) for archive. Files sit there and will be read, but rarely deleted, if ever. I'd prefer to retain the 2-vdev RAIDZ2 setup and track down the root issue rather than moving to something like striped mirrors. I do believe there's an issue somewhere, I just can't figure out where. I have also searched online extensively, but I'm still learning TrueNAS, and posting in a forum is always my last resort so as not to waste anyone's time. Any insight would be greatly appreciated.


I've also tested an SSD cache drive; it made no difference at all, so I've since removed it.

  • Note: Currently running as a Proxmox VM, but tested with the same results on bare metal
  • Version: TrueNAS-12.0-U8.1
  • Motherboard: Supermicro X10SDV-4C-TLN2F
  • CPU: Intel Xeon D-1520 4C/8T 2.2GHz
  • RAM: 84GB ECC @ 2133MHz (Proxmox host has 128GB total)
  • HBA: LSI SAS 9300-16i 12Gb/s SAS+SATA (PCI pass-through)
  • POOL: 2 vdevs of 6x 14TB SATA HDDs in RAIDZ2
  • HDD: 4 Seagate IronWolf Pro, 2 WD WD140EDGZ, 6 Seagate Exos – all CMR, all 14TB
  • NIC: onboard Intel 10G (PCI pass-through)

DD results
Code:
root@truenas[/mnt/Tank1/Media]# dd if=/dev/zero of=testfile bs=1M count=1k
1073741824 bytes transferred in 0.601700 secs (1784513840 bytes/sec)

root@truenas[/mnt/Tank1/Media]# dd if=/dev/zero of=testfile2 bs=1M count=4k
4294967296 bytes transferred in 3.761309 secs (1141881115 bytes/sec)

root@truenas[/mnt/Tank1/Media]# dd if=/dev/zero of=testfile bs=1M count=10k
10737418240 bytes transferred in 19.120238 secs (561573461 bytes/sec)

root@truenas[/mnt/Tank1/Media]# dd if=/dev/zero of=testfile bs=1M count=100k
107374182400 bytes transferred in 223.981461 secs (479388705 bytes/sec)



Pool
Code:
root@truenas[~]# zpool status Tank1
  pool: Tank1
 state: ONLINE
  scan: scrub repaired 0B in 05:47:35 with 0 errors on Sat Jun  4 02:24:38 2022
config:

        NAME                                            STATE     READ WRITE CKSUM
        Tank1                                           ONLINE       0     0 0
          raidz2-0                                      ONLINE       0     0 0
            gptid/33050965-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
            gptid/3152ba21-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
            gptid/3270afe1-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
            gptid/32a01d03-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
            gptid/336a5d2f-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
            gptid/3660e7b0-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
          raidz2-1                                      ONLINE       0     0 0
            gptid/35024d99-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
            gptid/349cc1c0-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
            gptid/3132514a-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
            gptid/35beb2a8-df6d-11ec-af81-d7fd7bbbde8b  ONLINE       0     0 0
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
How are you testing SMB speeds?

One thing to keep in mind is that data is not automatically spanned when a new vdev is added. In other words, to get your data to actually take advantage of the new vdev, you need to re-write it to the pool.
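If you want to verify how the data is actually laid out, ZFS can report per-vdev allocation and activity. Something along these lines (using your pool name Tank1 from the output above) should make any imbalance obvious:
Code:
# Per-vdev capacity: a recently added vdev will show far less ALLOC than the old one
zpool list -v Tank1

# Per-vdev activity, refreshed every 2 seconds; run this during a transfer
# and watch whether both raidz2-0 and raidz2-1 are actually seeing I/O
zpool iostat -v Tank1 2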
 

FalconX

Cadet
Joined
Jun 15, 2022
Messages
3
How are you testing SMB speeds?

One thing to keep in mind is that data is not automatically spanned when a new vdev is added. In other words, to get your data to actually take advantage of the new vdev, you need to re-write it to the pool.
Observing SMB transfer speed via Nautilus in Linux (Ubuntu w/ GNOME): Nautilus > copy a file from the internal SSD over to the TrueNAS SMB mount.
I've also run rsync and see the same exact speeds. There's no switch; this is a direct RJ45 connection spanning about 20' of Cat6a.

Sorry, "added VDEV" needs clarification... I did rebuild the pool from scratch and copied the data back over to the new pool.

I should also add that iperf indicates roughly full 10G throughput.
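For reference, roughly the kind of check I'm describing (iperf3 shown; the address is just a placeholder for the NAS):
Code:
# On the TrueNAS side:
iperf3 -s

# On the Linux client; 10.0.0.10 stands in for the NAS address,
# and -P 4 runs four parallel streams to help fill a 10GbE link
iperf3 -c 10.0.0.10 -P 4 -t 30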
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
It's likely that you have a bottleneck somewhere in the system. It's not always the easiest thing to figure out, and 10GbE definitely pushes hardware to the limit.

I'd suggest some tests to rule out:
  1. Issues with the testing system. Do you get the same performance with a different system? Have you tried booting into a live CD version of Linux to rule out any issues with your setup? Have you ruled out a bottleneck with that system's SSD? You could try mounting the share via a shell and reading into /dev/null or writing from /dev/random.
  2. Issues with the array. Test writes from /dev/zero aren't a great indication of real-world performance because you may just be testing the cache. What speeds do you get from /dev/random? What about read speeds? (Use of=/dev/null.) A rough sketch of both follows this list.
  3. Issues with SMB performance. Samba is notoriously single-threaded, and while most systems can easily handle 1Gbps, going to 10Gbps can stress even competent processors. What is the CPU load during a file transfer? Have you done any tuning of Samba for performance?
  4. Issues with Proxmox. You mentioned iperf, so that likely rules out any networking issues for Proxmox, but I'd want to be sure.
  5. Disk load. When you're copying files, what's the load on the drives? It's possible that you're just saturating the drives, and you just can't get any more performance out of them. If that were the case, then I'd expect another vdev to increase performance, so I'd put this in the "unlikely" camp.
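Roughly what I mean for items 1 and 2 (the paths, share name, user, and address are placeholders; note that /dev/random can itself be slow enough to be the bottleneck, so treat its numbers as a floor rather than a ceiling):
Code:
# Item 2, on the TrueNAS shell: take SMB and the network out of the picture.
# Incompressible writes (unlike /dev/zero, these can't be compressed away):
dd if=/dev/random of=/mnt/Tank1/Media/randfile bs=1M count=10k
# Read test; use a file larger than RAM (or a fresh boot) so it isn't served from ARC:
dd if=/mnt/Tank1/Media/randfile of=/dev/null bs=1M

# Item 1, on the Linux client: mount the share from a shell and bypass Nautilus.
sudo mount -t cifs //10.0.0.10/SHARE /mnt/test -o username=USER
dd if=/mnt/test/somebigfile of=/dev/null bs=1M            # read over SMB
dd if=/dev/urandom of=/mnt/test/randfile bs=1M count=10k  # write over SMB

# Item 3: run top -H on the NAS during a copy and check whether a single
# smbd thread is pinned at ~100% of one core.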
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
Just taking the subject...

Throughput isn't what is impacted by adding VDEVs (IOPS is).

And in any case, if you add a vdev to an existing pool that already has data on the older vdevs, there are reasons the writes won't spread evenly across all vdevs; ZFS may even prefer the new (emptier) vdev, at least a bit. So it could actually make things worse, at least temporarily, unless you do something to rebalance (copy the data in place, use a rebalancing script, or back up, wipe, and restore).
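The "copy" option is nothing magic, it's just rewriting the data so the new allocations spread across all vdevs. A crude sketch (paths are examples, you need enough free space for a second copy, and existing snapshots will keep the old blocks pinned until those snapshots are destroyed):
Code:
# Rewrite a directory in place so its blocks are reallocated across both vdevs
cp -Rp /mnt/Tank1/Media/somefolder /mnt/Tank1/Media/somefolder.rebal
# ...verify the copy (sizes, spot-check a few checksums)...
rm -r /mnt/Tank1/Media/somefolder
mv /mnt/Tank1/Media/somefolder.rebal /mnt/Tank1/Media/somefolder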
 

Dice

Wizard
Joined
Dec 11, 2015
Messages
1,410
Issues with Proxmox.
Note: Currently running as a Proxmox VM, but tested same results on bare metal
I'm happy you've gone through the mess to test this thoroughly without Proxmox in the mix.
I've been an avid Proxmox user for 7 years, and I'm generally a big fan of virtualization. Here's a little user experience.

I found myself having loads of weird performance issues as soon as I aimed for anything higher than saturating 1GbE.
Anything from increased CPU usage on certain tasks, to benchmarks that seemed to cap out at ~250MB/s on an all-SSD array that should 'theoretically' do at least 4x that. What seemed to be a controller card limitation (same performance from 1 drive as from ~4) turned out not to be, and the list goes on. Gremlins everywhere.
My conclusion from that was: it works absolutely fine virtualized, but once you start striving for more performance, things get unnecessarily weird.
When I got rid of Proxmox on that box and put TrueNAS on bare metal, performance shone like never before.

Looking forward to reading about your findings and solution :)
 

FalconX

Cadet
Joined
Jun 15, 2022
Messages
3
It's likely that you have a bottleneck somewhere in the system. It's not always the easiest thing to figure out, and 10GbE definitely pushes hardware to the limit.

I'd suggest some tests to rule out:
  1. Issues with the testing system. Do you get the same performance with a different system? Have you tried booting into a live CD version of Linux to rule out any issues with your setup? Have you ruled out a bottleneck with that system's SSD? You could try mounting the share via a shell and reading into /dev/null or writing from /dev/random.
  2. Issues with the array. Test writes from /dev/zero aren't a great indication of real-world performance because you may just be testing the cache. What speeds do you get from /dev/random? What about read speeds? (Use of=/dev/null.)
  3. Issues with SMB performance. Samba is notoriously single-threaded, and while most systems can easily handle 1Gbps, going to 10Gbps can stress even competent processors. What is the CPU load during a file transfer? Have you done any tuning of Samba for performance?
  4. Issues with Proxmox. You mentioned iperf, so that likely rules out any networking issues for Proxmox, but I'd want to be sure.
  5. Disk load. When you're copying files, what's the load on the drives? It's possible that you're just saturating the drives, and you just can't get any more performance out of them. If that were the case, then I'd expect another vdev to increase performance, so I'd put this in the "unlikely" camp.

#1 - "Live CD" inspired me to also check Windows 10. On the same client PC, Windows 10 speeds along at 500 - 700MB/s, Ubuntu 22.04 live does about 145MB/s, and Debian 11 live does 220MB/s. This is all with the same network adapter, an AOC-STG-i2T PCIe card with an Intel X540 chipset.

Seems pretty clear this is a Linux issue on the client side. I'll have to explore driver tweaks or a more compatible network adapter.
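In case it helps anyone hitting the same wall, these are the sort of checks I'll start with on the Linux side (enp1s0 is a placeholder interface name; the X540 is normally handled by the ixgbe driver):
Code:
# Which driver and firmware the card is using
ethtool -i enp1s0

# Negotiated link speed (should report 10000Mb/s)
ethtool enp1s0 | grep -i speed

# Offload settings; segmentation/receive offloads matter a lot at 10GbE
ethtool -k enp1s0 | grep -Ei 'segmentation|generic-receive|large-receive'

# Ring buffer sizes; small defaults can throttle sustained transfers
ethtool -g enp1s0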

Thanks for the pointers.
 

Nick2253

Wizard
Joined
Apr 21, 2014
Messages
1,633
Throughput isn't what is impacted by adding VDEVs (IOPS is).
Indeed, this is correct. However, throughput comes from drives, and so in this case, adding another vdev by way of adding more drives should increase throughput.

All other things being equal, if you hold the number of drives constant, then reconfiguring a pool into more vdevs will increase IOPS compared with the same drives arranged in fewer vdevs.
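As a very rough rule of thumb (ballpark figures only: say ~150MB/s streaming and ~100 random IOPS per 7200rpm drive), a RAIDZ vdev delivers about the random IOPS of a single member disk, while best-case streaming throughput scales more with the number of data disks. For the same 12 drives:
Code:
2 x 6-wide RAIDZ2 : random IOPS ~ 2 x 100 = 200    streaming (best case) ~ 2 x 4 x 150 = 1200 MB/s
6 x 2-way mirrors : random IOPS ~ 6 x 100 = 600    streaming writes      ~ 6 x 150     =  900 MB/s

Which is why extra vdevs mostly buy you IOPS, while big sequential transfers like the OP's are usually limited by the data-disk count or by something else in the path (SMB, client, network) long before the vdev layout.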
 