Poor Performance with FreeNAS as a Proxmox Guest

cwingert

Dabbler
Joined
Feb 21, 2021
Messages
10
FreeNAS installed as a Proxmox guest. Dual E5-2620v3 (16 threads passed through), 32 GB RAM, SAS3 controller passed through.

81:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)

truenas% dd if=/dev/zero of=test status=progress
1284685824 bytes (1285 MB, 1225 MiB) transferred 74.041s, 17 MB/s

Any idea why the performance is so slow?

Thanks in advance!
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Well, Proxmox isn't known to work well with FreeNAS; ESXi is the hypervisor that's expected to work.

First, a few suggestions for a better chance of success:

Make sure you are preallocating memory, and set CPU affinity and a CPU reservation. I agree with @HoneyBadger that four or six vCPUs make more sense; a dual E5-2620v3 only has 12 cores/24 threads, and FreeNAS doesn't need a lot for basic operations.
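
On the Proxmox side, those knobs map roughly onto the VM configuration. Here is a minimal sketch using the qm CLI, assuming a hypothetical VM ID of 100: the first line fixes memory at 32 GB and disables ballooning so it is effectively preallocated, and the second trims the vCPU count, passes through the host CPU type, and enables NUMA. Exact option availability varies by PVE release, and CPU pinning in particular may need a newer release or manual configuration.

Code:
# qm set 100 --memory 32768 --balloon 0
# qm set 100 --cores 6 --cpu host --numa 1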

Your choice of dd test is a bit confusing.

/dev/zero is highly compressible and should be able to feed in at hundreds of MBytes/sec, but if you don't specify a block size, dd falls back to a tiny default (512-byte blocks) and incurs a huge number of kernel/userland boundary crossings. Speed will also depend in part on the compression setting chosen for the dataset.
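
If you're not sure what compression a dataset is actually using, it's quick to check; the dataset name here is just a placeholder:

Code:
# zfs get compression,compressratio tank/dataset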

For example, on a pool with a gzip-9 setting, ESXi on an E3-1230, two cores, if you do what you did:

# dd if=/dev/zero of=testxxx status=progress
7095446528 bytes (7095 MB, 6767 MiB) transferred 127.005s, 56 MB/s

But you're burning almost a quarter of a MILLION syscalls per second:

Code:
# vmstat 1
procs  memory       page                    disks     faults         cpu
r b w  avm   fre   flt  re  pi  po    fr   sr ad0 ad1   in    sy    cs us sy id
1 1 0 2.0T  763M    90   0   0  10   159  112   0   0  182   121    15  4  5 92
1 1 0 2.0T  763M     2   0   0   0     0  124 158 156  651 230641  4229  1 52 47
1 1 0 2.0T  763M     0   0   0   0     0  124 131 139  655 230808  4229  1 54 45
2 1 0 2.0T  763M     0   0   0   0     0  124 167 151  708 236336  4418  2 50 47
1 1 0 2.0T  763M    30   0   0   0     1  124 229 220  997 231643  6767  2 53 44
1 1 0 2.0T  763M     0   0   0   0     0  128 167 165  668 238116  4804  2 51 47
1 1 0 2.0T  763M     0   0   0   0     0  124 129 143  598 236131  4208  1 52 47
1 1 0 2.0T  763M     0   0   0   0     0  127 141 131  662 228287  4289  2 52 47
1 1 0 2.0T  763M   386   0   0  71     0  124 163 149  666 237927  6843  3 52 45
3 1 0 2.0T  763M 13652   0   0   0 12337  128  77  74  322 228304  2813 11 71 18
3 1 0 2.0T  758M  9192   0   0   0 24963  124 177 194  782 196150  6866 10 90  0
1 1 0 2.0T  763M  7376   0   0   0 18690  126 155 146  619 201256  4495  7 81 13
1 1 0 2.0T  763M    60   0   0   0     0  128 159 164  727 200882  4367  1 52 47
1 1 0 2.0T  763M     6   0   0   0     0  124 138 121  534 208638  3989  1 52 47
1 1 0 2.0T  763M     0   0   0   0     0  125 117 127  526 236171  3944  1 52 47


While if you set a larger block size,
# dd if=/dev/zero of=testxxx bs=1048576 status=progress
45634027520 bytes (46 GB, 43 GiB) transferred 44.709s, 1021 MB/s

you burn far fewer syscalls, and you will also see the context-switch (cs) column get more involved because more of the time is spent running compression in kernel threads:

Code:
procs  memory       page                    disks     faults         cpu
r b w  avm   fre   flt  re  pi  po    fr   sr ad0 ad1   in    sy    cs us sy id
0 1 0 2.0T  698M     0   0   0   0     0  146 145 144  626   577  6747  0 18 82
0 1 0 2.0T  698M     2   0   0   0     0  146 131 137  592  2527  6210  0 23 77
0 1 0 2.0T  697M  3382   0   0   0  2082  146 324 328 1420  5801 13848  5 38 56
0 1 0 2.0T  697M     2   0   0   0     1  146 171 158  719  2039 10499  0 26 74
0 1 0 2.0T  697M   501   0   0 150     0  146 181 183  749 14399 12806  0 30 69
1 1 0 2.0T  697M     2   0   0   0     0  146 184 168  753   986  5238  0 17 83
0 1 0 2.0T  696M  2504   0   0   0  1317  146 164 172  801  4392  8627  3 30 66
0 1 0 2.0T  698M   861   0   0   0   965  146 404 394 1760  1015 13138  0  7 93


Meanwhile, if you do this on a pool with lz4, even a slower system (an E5-2697v2 with two cores allocated) blows through zeroes at 2 GBytes/sec, because lz4 handles zeroes extremely well.

# dd if=/dev/zero of=testxxx bs=10485760 count=5000
5000+0 records in
5000+0 records out
52428800000 bytes transferred in 27.312389 secs (1919597753 bytes/sec)
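
If a dataset is sitting on something heavy like gzip-9 and you just want the sane default, switching it to lz4 is a one-liner (dataset name is illustrative; the change only affects data written afterwards):

Code:
# zfs set compression=lz4 tank/dataset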

If you are actually intending to test your DISK I/O, first create a test file by pulling a bunch of crap from /dev/random. THE SPEED OF THIS STEP IS NOT OF INTEREST for the purposes of testing disk I/O.

# dd if=/dev/random of=testxxx bs=1048576 count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 6.794553 secs (154325966 bytes/sec)

The point is to gather some incompressible data that is sufficiently large but will fit in ARC.
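
To get a rough sense of how large "fits in ARC" is on your box, the current and maximum ARC sizes are visible via sysctl on FreeBSD/FreeNAS (values are in bytes):

Code:
# sysctl kstat.zfs.misc.arcstats.size vfs.zfs.arc_max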

Now you do your write test:

# cat testxxx{,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,} > testyyy &
# zpool iostat freenas1 1
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
freenas1    63.1T  56.9T    194    642  23.6M  3.72M
freenas1    63.1T  56.9T    156  2.38K  19.1M   301M
freenas1    63.1T  56.9T     88  2.50K  10.4M   318M
freenas1    63.1T  56.9T    159  2.64K  19.2M   335M
freenas1    63.1T  56.9T     77  2.32K  9.43M   295M
freenas1    63.1T  56.9T     77  2.40K  8.58M   305M
freenas1    63.1T  56.9T    139  2.43K  17.2M   309M
freenas1    63.1T  56.9T     93  2.37K  11.1M   301M
freenas1    63.1T  56.9T    178  2.09K  20.3M   199M
freenas1    63.1T  56.9T     29    386  2.77M  12.8M
freenas1    63.1T  56.9T     59    573  7.37M  71.7M
freenas1    63.1T  56.9T    138  2.68K  16.5M   339M
freenas1    63.1T  56.9T    144  2.53K  17.1M   321M
freenas1    63.1T  56.9T    149  2.46K  17.4M   313M
freenas1    63.1T  56.9T    209  2.89K  25.5M   368M
freenas1    63.1T  56.9T    119  2.24K  14.1M   284M
freenas1    63.1T  56.9T    125  2.34K  14.7M   298M

That's OK for lz4 and a single RAIDZ3 vdev, IMO.
 

cwingert

Dabbler
Joined
Feb 21, 2021
Messages
10
Thanks for the detailed response!

Forgot to mention that I had turned off compression. I also set up an empty pool with eight 8 TB drives for testing.
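
For reference, the compression setting and pool layout can be double-checked with something like this (pool name volume2 as in the output below):

Code:
# zfs get compression volume2
# zpool status volume2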

As for the test results from
# dd if=/dev/random of=testxxx bs=1048576 count=1000
# cat testxxx{,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,} > testyyy &


              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
volume2     16.3G  58.2T      0  1.19K      0  77.9M
volume2     16.3G  58.2T      0  1.23K      0  78.8M
volume2     16.3G  58.2T      0  1.33K      0  89.8M
volume2     16.3G  58.2T      0  1.47K      0  92.8M
volume2     16.3G  58.2T      0  1.24K      0  82.6M
volume2     16.3G  58.2T      0  1.26K      0  87.9M
volume2     16.3G  58.2T      0  1.47K      0  92.2M
volume2     16.3G  58.2T      0  1.38K      0  91.0M
volume2     16.3G  58.2T      0  1.09K      0  65.0M
volume2     16.3G  58.2T      0  1.32K      0  86.7M
volume2     16.3G  58.2T      0  1.57K      0  98.4M
volume2     16.3G  58.2T      0  1.54K      0  98.6M
volume2     16.3G  58.2T      0  1.52K      0  95.3M
volume2     16.3G  58.2T      0  1.28K      0  84.3M
volume2     16.3G  58.2T      0    128      0  5.27M
volume2     16.3G  58.2T      0  1.29K      0  84.4M
volume2     16.3G  58.2T      0  1.31K      0  83.7M
volume2     16.3G  58.2T      0  1.35K      0  88.3M
volume2     16.3G  58.2T      0  1.66K      0   104M
volume2     16.3G  58.2T      0  1.41K      0  93.3M
volume2     16.3G  58.2T      0  1.39K      0  87.9M
volume2     16.3G  58.2T      0  1.50K      0  93.4M
volume2     16.3G  58.2T      0  1.34K      0  86.4M
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
volume2     16.3G  58.2T      0  1.38K      0  88.6M
volume2     16.3G  58.2T      0   1016      0  68.1M
volume2     16.3G  58.2T      0  1.06K      0  68.3M
volume2     16.3G  58.2T      0  1.07K      0  66.0M
volume2     16.3G  58.2T      0  1.10K      0  69.3M
volume2     16.3G  58.2T      0    910      0  56.6M
volume2     16.3G  58.2T      0    997      0  65.2M
volume2     16.3G  58.2T      0    920      0  58.0M
volume2     16.3G  58.2T      0  1.03K      0  61.6M
volume2     16.3G  58.2T      0  1.09K      0  64.5M
volume2     16.3G  58.2T      0  1.12K      0  65.5M
volume2     16.3G  58.2T      0    894      0  54.6M
volume2     16.3G  58.2T      0  1.27K      0  78.5M
volume2     16.3G  58.2T      0  1.26K      0  76.4M
volume2     16.3G  58.2T      0  1.04K      0  67.9M
volume2     16.3G  58.2T      0  1.33K      0  79.3M
volume2     16.3G  58.2T      0    613      0  41.4M
volume2     16.3G  58.2T      0    121      0  8.14M
volume2     16.3G  58.2T      0    156      0  9.10M
volume2     16.3G  58.2T      0    818      0  52.6M
volume2     16.3G  58.2T      0  1.11K      0  71.3M
volume2     16.3G  58.2T      0    978      0  62.8M
volume2     16.3G  58.2T      0  1.21K      0  76.1M
 

cwingert

Dabbler
Joined
Feb 21, 2021
Messages
10
Also I measured the performance of a single 8 TB drive under Linux.

sudo dd if=/dev/zero of=test status=progress
1746500966400 bytes (1.7 TB, 1.6 TiB) copied, 7410 s, 236 MB/s^C
3411438948+0 records in
3411438948+0 records out

I also matched this performance using the same cat copy test as in the previous two posts.

I don't quite understand why so much performance is being left on the table.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
It's hard to say. The first place to start would probably be to de-virtualize the FreeNAS and see how it performs. This is also a key concept in disaster recovery. See:

https://www.truenas.com/community/t...ide-to-not-completely-losing-your-data.12714/

Now I'm going to *add* that FreeNAS is not known to work {well, at all} on hypervisors other than ESXi. Before anyone tries to bite Grinchy's head off, please note that I'm entirely pragmatic about these things, and my primary interest is guiding people towards what IS known to work, mostly because we believe people care about their precious data. If you want to virtualize on another platform, and you're willing to do the work and own any problems, I am 100% fine with that, and it's interesting to explore.

It's probably most important to realize that raw sequential dd to a single drive is going to look faster than writing to a ZFS pool; ZFS simply carries more overhead (checksums, metadata, redundancy), and you would notice this on a bare-metal ZFS install too. The single-disk test you did is therefore not that useful, and having done it under Linux is even less useful, because the interesting question it COULD have answered is whether something inherent in the Proxmox platform is interfering with virtualized FreeBSD. So you need to repeat that single-disk sequential test, but do it on FreeBSD (or FreeNAS), as sketched below.
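
A minimal sketch of such a single-disk test under FreeNAS/FreeBSD, assuming the disk shows up as da0 (adjust the device name to match your system): diskinfo -t runs FreeBSD's built-in naive seek/transfer benchmark, and the dd reads roughly 10 GB sequentially straight off the raw device. This variant only reads, so it's non-destructive even on a pool member.

Code:
# diskinfo -t /dev/da0
# dd if=/dev/da0 of=/dev/null bs=1048576 count=10000 status=progress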

What you'd be looking for is if there was a significant difference that could indicate a bottleneck of some sort in the virtualization layers.

Virtualization platforms are strange things. Just as an example for purposes of illustration, the length of the window in which a VM gets to run can be an issue: a virtualization host where the scheduler runs less often, or where the host concedes a guest's time slot because the guest looks idle, can turn in poor performance. These things are somewhat esoteric and beyond most beginners, which is why I suggest assigning CPU affinity and a CPU reservation; those are the conventional knobs that bludgeon such effects into minimization on most hypervisors.
 

cwingert

Dabbler
Joined
Feb 21, 2021
Messages
10
So I ran a few more tests of the random data copy under various configurations.

VM:    ZFS on Linux             68 MB/s (same as FreeNAS on VM)
VM:    Linux mdadm RAID6/ext4  198 MB/s
VM:    Linux mdadm RAID0/ext4  303 MB/s

No VM: FreeNAS ZFS             111 MB/s
 