I'm running into a performance problem and have reduced it to a simpler case than the one where I first hit it. The core issue is that if a zvol on one pool is maxed out on writes, a zvol on a different pool is affected.
I have two pools, bulk and nvme: bulk is an 8x6TB raidz2 of SAS spinning drives, and nvme is a mirror of two 2TB NVMe drives.
The only thing on the nvme pool is a zvol that a VM is using.
I have a zvol on bulk that I'm importing into (dd over ssh) from a different server. The import maxes out the bulk drives (~200MB/s), and after a little while (less than a minute) any disk access inside the VM becomes very slow. One example:
Code:
root@testing:~# time (touch asdfasdf ; rm asdfasdf)

real    0m17.363s
user    0m0.002s
sys     0m0.003s
root@testing:~#
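For reference, the import is roughly the following (host name, source device, and zvol name here are placeholders):
Code:
# Stream a raw disk from the old server straight into the zvol on bulk
ssh root@oldserver 'dd if=/dev/sdX bs=1M' | dd of=/dev/zvol/bulk/imported bs=1M status=progress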
It's bad enough that doing much of anything in the VM will cause the SATA controller (in the VM) to reset and the filesystem to be remounted read-only.
Doing things in /mnt/nvme directly on the host performs as expected, which is what led me to believe it's a zvol bottleneck rather than a pool bottleneck.
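If it's useful, something like this should quantify the dataset-vs-zvol difference on the nvme pool (the zvol path is a placeholder for the VM's disk; the second job is read-only so it won't touch live VM data):
Code:
# 4k write latency on the nvme pool's dataset (writes a scratch file)
fio --name=dataset-test --directory=/mnt/nvme --rw=randwrite --bs=4k --size=256m \
    --ioengine=libaio --direct=1 --runtime=30 --time_based
# 4k read latency on the VM's zvol; --readonly guards against accidental writes
fio --name=zvol-test --filename=/dev/zvol/nvme/vm-disk --readonly --rw=randread \
    --bs=4k --ioengine=libaio --direct=1 --runtime=30 --time_based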
This is on TrueNAS-SCALE-22.02-RC.1, but it also occurred on BETA.2 (and on Proxmox, FWIW).
The server is running everything at defaults and has 2x 8-core/16-thread CPUs and 96GB of memory.
Please let me know if there are any more details I can provide or tests I can run.
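For example, something I can run to watch per-vdev throughput and latency on both pools while the import runs:
Code:
# -v: per-vdev breakdown, -l: latency columns, -y: skip the since-boot summary
zpool iostat -vly bulk nvme 5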
Edit: after rebooting that VM, it turns out the zvol got corrupted and the VM can't even boot, which explains why a few of my VMs ended up in this state over the past week.
Thank you!