Well, Proxmox isn't known to work well with FreeNAS; ESXi is the hypervisor that's expected to work.
First, a few suggestions for a better chance of success:
Make sure you are preallocating memory and setting both CPU affinity and a CPU reservation. I agree with @HoneyBadger that four or six vCPUs make more sense; a dual E5-2620v3 only has 12 cores/24 threads total, and FreeNAS doesn't need a lot for basic operations.
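If you do end up on ESXi, here's a rough sketch of the .vmx entries behind those settings; the affinity mask and reservation numbers are placeholders you'd adapt to your own host, and the same knobs are reachable in the vSphere UI (the CPU reservation/affinity fields and the "Reserve all guest memory" checkbox):
Code:
sched.cpu.affinity = "0,1,2,3"
sched.cpu.min = "8000"
sched.mem.min = "16384"
sched.mem.pin = "TRUE"
The first line pins the vCPUs to host cores 0 through 3, sched.cpu.min is the CPU reservation in MHz, sched.mem.min is the memory reservation in MB, and sched.mem.pin preallocates and locks the guest's memory.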
Your choice of dd test is a bit confusing.
/dev/zero is highly compressible and can feed in data at hundreds of MBytes/sec, but if you don't specify a block size, dd falls back to its crappy 512-byte default and incurs a huge number of kernel/userland boundary crossings. Speed will also depend in part on the compression algorithm chosen.
For example, on a pool set to gzip-9, under ESXi on an E3-1230 with two cores allocated, if you do what you did:
Code:
# dd if=/dev/zero of=testxxx status=progress
7095446528 bytes (7095 MB, 6767 MiB) transferred 127.005s, 56 MB/s
But you're burning almost a quarter of a MILLION syscalls per second:
Code:
# vmstat 1
procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr ad0 ad1 in sy cs us sy id
1 1 0 2.0T 763M 90 0 0 10 159 112 0 0 182 121 15 4 5 92
1 1 0 2.0T 763M 2 0 0 0 0 124 158 156 651 230641 4229 1 52 47
1 1 0 2.0T 763M 0 0 0 0 0 124 131 139 655 230808 4229 1 54 45
2 1 0 2.0T 763M 0 0 0 0 0 124 167 151 708 236336 4418 2 50 47
1 1 0 2.0T 763M 30 0 0 0 1 124 229 220 997 231643 6767 2 53 44
1 1 0 2.0T 763M 0 0 0 0 0 128 167 165 668 238116 4804 2 51 47
1 1 0 2.0T 763M 0 0 0 0 0 124 129 143 598 236131 4208 1 52 47
1 1 0 2.0T 763M 0 0 0 0 0 127 141 131 662 228287 4289 2 52 47
1 1 0 2.0T 763M 386 0 0 71 0 124 163 149 666 237927 6843 3 52 45
3 1 0 2.0T 763M 13652 0 0 0 12337 128 77 74 322 228304 2813 11 71 18
3 1 0 2.0T 758M 9192 0 0 0 24963 124 177 194 782 196150 6866 10 90 0
1 1 0 2.0T 763M 7376 0 0 0 18690 126 155 146 619 201256 4495 7 81 13
1 1 0 2.0T 763M 60 0 0 0 0 128 159 164 727 200882 4367 1 52 47
1 1 0 2.0T 763M 6 0 0 0 0 124 138 121 534 208638 3989 1 52 47
1 1 0 2.0T 763M 0 0 0 0 0 125 117 127 526 236171 3944 1 52 47
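The arithmetic lines up, too: 56 MB/s at dd's 512-byte default is roughly 110,000 blocks per second, and each block costs a read() plus a write(), so you'd expect on the order of 220,000 syscalls per second, right about what the sy column shows.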
Whereas if you set a larger block size,
Code:
# dd if=/dev/zero of=testxxx bs=1048576 status=progress
45634027520 bytes (46 GB, 43 GiB) transferred 44.709s, 1021 MB/s
you burn far fewer syscalls, and you will also see the cs column (context switches) get more involved because there's more time spent running compression in the kernel:
Code:
procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr ad0 ad1 in sy cs us sy id
0 1 0 2.0T 698M 0 0 0 0 0 146 145 144 626 577 6747 0 18 82
0 1 0 2.0T 698M 2 0 0 0 0 146 131 137 592 2527 6210 0 23 77
0 1 0 2.0T 697M 3382 0 0 0 2082 146 324 328 1420 5801 13848 5 38 56
0 1 0 2.0T 697M 2 0 0 0 1 146 171 158 719 2039 10499 0 26 74
0 1 0 2.0T 697M 501 0 0 150 0 146 181 183 749 14399 12806 0 30 69
1 1 0 2.0T 697M 2 0 0 0 0 146 184 168 753 986 5238 0 17 83
0 1 0 2.0T 696M 2504 0 0 0 1317 146 164 172 801 4392 8627 3 30 66
0 1 0 2.0T 698M 861 0 0 0 965 146 404 394 1760 1015 13138 0 7 93
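Run the same arithmetic at bs=1048576: about 1,000 blocks per second, times two syscalls each, is only around 2,000 syscalls per second, which is why the sy column collapses from ~230,000 into the low thousands and the CPU time shifts to compression instead of syscall overhead.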
Meanwhile, if you do this on a pool with lz4, even on a slower system (an E5-2697v2, two cores allocated), it blows zeroes out at nearly 2 GBytes/sec because lz4 does great with zeroes.
Code:
# dd if=/dev/zero of=testxxx bs=10485760 count=5000
5000+0 records in
5000+0 records out
52428800000 bytes transferred in 27.312389 secs (1919597753 bytes/sec)
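If you want to see how little of that 50 GB of zeroes actually landed on disk, compare the file's apparent size against its actual allocation, or ask ZFS for the dataset's overall ratio (the dataset name here is just a placeholder):
Code:
# du -Ah testxxx
# du -h testxxx
# zfs get compressratio freenas1/mydataset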
If you are actually intending to test your DISK I/O, first create a random file by pulling a bunch of crap from /dev/random. THE SPEED OF THIS IS NOT OF INTEREST for the purposes of testing disk I/O.
Code:
# dd if=/dev/random of=testxxx bs=1048576 count=1000
1000+0 records in
1000+0 records out
1048576000 bytes transferred in 6.794553 secs (154325966 bytes/sec)
The point is to gather some incompressible data that is sufficiently large but will still fit in ARC; that way, the repeated reads during the write test are served from RAM and you measure the pool's write path rather than its read speed.
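On FreeBSD you can sanity-check that the file fits by comparing the ARC's current and maximum sizes:
Code:
# sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_max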
Now you do your write test; the brace expansion below hands cat dozens of copies of the filename (each comma adds one more), so it streams the random file out over and over:
# cat testxxx{,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,} > testyyy &
# zpool iostat freenas1 1
capacity operations bandwidth
pool alloc free read write read write
---------- ----- ----- ----- ----- ----- -----
freenas1 63.1T 56.9T 194 642 23.6M 3.72M
freenas1 63.1T 56.9T 156 2.38K 19.1M 301M
freenas1 63.1T 56.9T 88 2.50K 10.4M 318M
freenas1 63.1T 56.9T 159 2.64K 19.2M 335M
freenas1 63.1T 56.9T 77 2.32K 9.43M 295M
freenas1 63.1T 56.9T 77 2.40K 8.58M 305M
freenas1 63.1T 56.9T 139 2.43K 17.2M 309M
freenas1 63.1T 56.9T 93 2.37K 11.1M 301M
freenas1 63.1T 56.9T 178 2.09K 20.3M 199M
freenas1 63.1T 56.9T 29 386 2.77M 12.8M
freenas1 63.1T 56.9T 59 573 7.37M 71.7M
freenas1 63.1T 56.9T 138 2.68K 16.5M 339M
freenas1 63.1T 56.9T 144 2.53K 17.1M 321M
freenas1 63.1T 56.9T 149 2.46K 17.4M 313M
freenas1 63.1T 56.9T 209 2.89K 25.5M 368M
freenas1 63.1T 56.9T 119 2.24K 14.1M 284M
freenas1 63.1T 56.9T 125 2.34K 14.7M 298M
That's OK for lz4 and a single RAIDZ3 vdev, IMO.
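If a number in there ever looks off, run zpool iostat -v freenas1 1 instead to break the figures down per vdev member; that makes it easy to spot a single slow disk dragging the whole RAIDZ3 vdev down.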