Virtualized FreeNAS and SMB shares: slow when accessing disks. What the dickens?

phil1c

Dabbler
Joined
Dec 28, 2019
Messages
21
I seem to be a glutton for punishment, so here I am, returned once again, with questions about pool performance.

My Setup
I decided I wanted to build a single Proxmox box that houses a virtualized FN instance along with various VMs for media sharing and testing/fun stuff. I'm building this box to replace a long-running Unraid box, with future mGig/10GbE support in mind. Specs for the PM host are:
  • Motherboard: SuperMicro X9SRA
  • CPU: Xeon E5-2660
  • RAM: 128GB 1600MHz DDR3 ECC
  • Boot disk: Samsung 860 EVO 500GB
  • Pool disk connections: LSI 9211-8i in IT mode (8 drives) & onboard (mobo) Intel SCU (4 drives)
    • ** Note: the 9211-8i is plugged into a PCI-E x4 slot, so I know this will handicap me a bit. I'm not looking for theoretical max performance (based on drive speeds and ZFS calculations), only for performance between the FN and Ubuntu VMs that matches what I measure inside the FN VM itself.
Bullet points of my set-up progress:
  1. Proxmox installed.
  2. PCI-E passthrough setup.
  3. FreeNAS installed as VM
    1. passed 4 cores, 32GB memory, VirtIO NIC
    2. Both LSI HBA (20.0.0.7-IT firmware and matching BIOS) and mobo SCU passed through to FN VM
    3. pool and datasets built (2x 6-disk RAIDZ2 vdevs of 12TB shucked Easystore drives [~225MB/s read / ~200MB/s write per drive based on previous testing]).
    4. Datasets shared via SMB
  4. Ubuntu VM installed and SMB shares mounted via fstab.
  5. Both VMs share one network bridge (vmbr0) that connects them to my home network (LAN is 1GbE all around). vmbr0 is set to MTU 9000, because in a previous setup I learned this is crucial for max throughput (bridge config sketched below).
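For reference, here's a minimal sketch of the bridge stanza in /etc/network/interfaces on the Proxmox host; the address, gateway, and physical port name (eno1) are placeholders for whatever is actually enslaved to vmbr0:

Code:
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        mtu 9000

The guest NICs attached to vmbr0 also need MTU 9000 set inside each VM for jumbo frames to actually be used end to end.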
Testing
"Ok, let's see what performance I'm getting!"
  • DD in FreeNAS host (commands sketched after this list)
    • write to dataset (no compression, 80GB testfile [bs=2048k count=40000]): ~750MB/s
    • read: ~550MB/s.
      • hmmmm, odd. Slower reads?
    • FOR REFERENCE: These are the speeds that I'm looking for between VMs, but I would like to understand why I'm getting slower reads and how to fix it.
  • Iperf3 Between FN VM and Ubuntu VM
    • FN as server: 12.7Gbps
    • Reversed: 10.3Gbps
    • Great, network performance is good, if a smidge asymmetrical, and the VM-to-VM traffic clearly isn't leaving the server for the LAN (again, the LAN is all 1GbE).
**** This is where the trouble is:
  • File transfer in Ubuntu to/from SMB share of "Movies" dataset, compression=off.
    • Read from FN to Ubuntu: 190MB/s
      • Wait, what?
    • Delete the new local copy, then copy the same file again from the SMB share to the Ubuntu local disk (the file is now cached in ZFS ARC): ~420MB/s read.
      • Ok, so SMB isn't directly bottlenecking. Once the file is in ARC, I can transfer at near the theoretical write speed of the SSD.
    • Rename the now-local file copied previously. Write back to same folder on SMB share: 600MB/s write
      • Theoretical limit of SSD read, so that makes sense.
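For clarity, the dd and iperf3 numbers above were gathered roughly like this; the pool/dataset path and the FreeNAS VM's IP are placeholders:

Code:
# write test run in the FreeNAS shell against the uncompressed dataset
# (80GB is larger than the VM's 32GB of RAM, so the read-back below
# can't be served entirely from ARC)
dd if=/dev/zero of=/mnt/tank/Movies/ddtest.bin bs=2048k count=40000
# read the file back, discarding the output
dd if=/mnt/tank/Movies/ddtest.bin of=/dev/null bs=2048k
# iperf3: server on the FreeNAS VM...
iperf3 -s
# ...client on the Ubuntu VM (add -R to test the reverse direction)
iperf3 -c <freenas-vm-ip> -t 30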
Maybe it's a "Files" issue?
  • Run the same type of copy with a new large file from the same dataset (to take ARC out of the chain) using "pv" from a terminal: same results as with "Files" (commands sketched after this list)
  • DD in Ubuntu onto mounted SMB share of "Movies" dataset (compression still off):
    • write=505MB/s
      • smbd on the FN VM hovered around 75% single-core usage during the whole test, so not pegged at 100%, but maybe CPU limited?
    • read back=228MB/s
      • smbd: around 68% +/-5% during the whole test
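The pv and dd variants were along these lines; the SMB mount point and file names are placeholders:

Code:
# copy a large, not-yet-cached file off the SMB mount, watching throughput
pv /mnt/movies/big_file.mkv > ~/big_file.mkv
# dd write onto the SMB mount (compression still off on the dataset)
dd if=/dev/zero of=/mnt/movies/ddtest.bin bs=2048k count=40000
# dd read back from the SMB mount
dd if=/mnt/movies/ddtest.bin of=/dev/null bs=2048k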
I tried tossing more cores at the FreeNAS VM (Ubuntu already had 6 cores), but saw no change in performance.
* I recognize that, if it is somehow smbd limiting performance, more cores will do nothing, as smbd is effectively single-threaded per client connection. My thought was that something else the FreeNAS VM was doing might be hogging resources and causing a bottleneck. One of my next troubleshooting tests is swapping in a CPU with a slightly higher base clock to see if anything changes.

EDIT: ZFS on Proxmox
Out of curiosity, I decided to remove the passthrough of the LSI and onboard SCU, import the pool directly into ZFS on the Proxmox host, set up a simple share of the same Movies dataset used above, and test performance. In doing so, I saw ~850MB/s read / ~800MB/s write directly to the pool from Proxmox, and transferring "fresh" files (i.e., files not previously accessed and thus not cached in ARC) in the same Ubuntu VM maxed out my SSD speeds. This at least shows it's not a hardware problem.
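For anyone curious, the host-side import was nothing fancy ("tank" below is a placeholder for the actual pool name):

Code:
# on the Proxmox host, after removing the passthrough
zpool import              # list importable pools
zpool import -f tank      # import the pool built inside the FreeNAS VM
zpool status tank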

EDIT: Additional Testing, abridged
I have also tried various combinations of the following, in no particular order:
  • SeaBIOS vs OVMF UEFI
  • i440fx vs q35
  • changed VM "cpu" type
  • Changed some mount parameters in fstab (example entry below)
Not one single test in the lot made an iota of difference in read-from-disk speeds from FN to Ubuntu VM.
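For reference, the SMB mounts in the Ubuntu VM's fstab look something like the entry below; the server name, mount point, credentials file, and the tuning options (vers, rsize/wsize) are the kinds of parameters I was varying, and the exact values here are placeholders:

Code:
//freenas/Movies  /mnt/movies  cifs  credentials=/etc/smb-cred,uid=1000,gid=1000,vers=3.0,rsize=1048576,wsize=1048576,_netdev  0  0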

Problems I want to fix
Out of all of that, the big issue I want to fix is:
  • Why is reading from the pool (disks, not ARC) so damn slow, compared to speeds I get from within FN itself?
    • It's not the "network" as iperf is >=10Gbps.
    • I don't think it's smbd, as reads from ARC to the VM max out my SSD.
    • It's not disk access, as raw reads from disk within FN are far faster (190MB/s over SMB vs ~550MB/s locally).
    • It's not the SSD as, again, ARC to SSD maxes out the SSD write speed.
    • EDIT: It's not hardware, as importing the pool into ZFS on Proxmox and sharing it reached max theoretical speeds.
A secondary issue I'd love to solve is why, even within FreeNAS itself, my reads from the pool are slower than my writes. Based on testing prior to shucking and research on the internet prior to building, the 12TB Easystores are not SMR.




Any suggestions or troubleshooting recommendations are welcome with outstretched arms.

Cheers
 
Last edited:

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Did you check if your shucked drives had been reformatted to 520-byte sectors, instead of 512-byte sectors?
 

phil1c

Dabbler
Joined
Dec 28, 2019
Messages
21
I checked every disk via smartctl, and they all report 512-byte logical sectors and 4096-byte physical sectors. I also confirmed that the pool's VDEVs (2 of them) have an ashift value of 12, so that lines up. The checks I ran are sketched below.
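The checks were basically these (da0 is a placeholder; I repeated smartctl for each pool disk, and on FreeNAS zdb has to be pointed at the system's zpool cache file):

Code:
# logical/physical sector sizes per disk
smartctl -i /dev/da0 | grep -i 'sector size'
# ashift of each vdev in the pool
zdb -U /data/zfs/zpool.cache | grep ashift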
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399

phil1c

Dabbler
Joined
Dec 28, 2019
Messages
21
lspci -v -s 06:00.0 (address of the LSI) gives me:

Code:
root@alexandria[~]# lspci -v -s 06:00.0
06:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
        Subsystem: Broadcom / LSI Device 3020
        Flags: bus master, fast devsel, latency 0, IRQ 16
        I/O ports at b000
        Memory at fb700000 (64-bit, non-prefetchable)
        Memory at fb680000 (64-bit, non-prefetchable)
        Expansion ROM at fb200000 [disabled]
        Capabilities: [50] Power Management version 3
        Capabilities: [68] Express Endpoint, MSI 00
        Capabilities: [d0] Vital Product Data
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [138] Power Budgeting <?>
        Capabilities: [150] Single Root I/O Virtualization (SR-IOV)
        Capabilities: [190] Alternative Routing-ID Interpretation (ARI)



Can you be more specific about MSI/MSI-X and its potential impact on performance? Or at least what you may be pointing at regarding "tuners"?
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
The simplest analogy I can think of for MSI/MSI-X is that it's a technology to multiplex interrupts over a single PCI signal line. MSI-X allows for more multiplexing. However, if the multiplex is too wide, the adapter and CPU may waste too much time separating the multiplexed interrupts and routing them to the appropriate interrupt handler per device. In this case, the HBA is running in MSI-X mode, which is the widest mode. Check the man page for mps for the loader tunables. These are set in FreeNAS under System->Tunables, and only take effect on reboot. You should try disabling MSI-X to force the HBA into MSI mode, and see if this works better for you.
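For example, the tunable should be something along these lines (added under System->Tunables, with type "Loader"; it takes effect after a reboot). Per mps(4), a non-zero hw.mps.disable_msix keeps the driver from using MSI-X, so it falls back to MSI:

Code:
Variable: hw.mps.disable_msix
Value:    1
Type:     Loader
Comment:  force the SAS2008 HBA out of MSI-X mode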
 