Andres Biront
Hi,
I'm seeing strange behavior on my SSD pool, and I'm not sure where the problem lies. I noticed latency spikes that occur only on the SSD pool, which serves as an iSCSI target for my VMware cluster.
First of all, my hardware configuration.
DISCLAIMER: This is a home lab. Before anyone goes crazy with "You fool, you will lose everything", please take that into consideration. And I have backups =)
FreeNAS box:
Intel Xeon X3430
Intel ServerBoard S3420GPX
LSI 9210-8i IT mode
32GB DDR3-1333 ECC RAM
3 x 1Gbps Intel Ethernet
Silverstone 400W PSU
6 x 1TB WD Blue
2 x 500GB SK Hynix SSD
Pools:
2 x 3x1TB WD RAIDZ (compression, no deduplication)
2 x 500GB SSD Mirror (compression, deduplication)
Why dedup? SSD is not cheap. 500GB is not much. RAM should be enough. VMware Datastore Usage = 150GB. Pool actual usage = 50GB. I'm loving it.
OK, so what's the problem? Latency spikes. I don't know why, but I think it's related to the SSDs TRIMing constantly.
For example:
The RAID is effectively "idle" (only idle VMs are online), but it's constantly deleting something. It's been like that since day 1. So I thought I should check whether it was TRIMing, and found:
kstat.zfs.misc.zio_trim.failed: 0
kstat.zfs.misc.zio_trim.unsupported: 1767
kstat.zfs.misc.zio_trim.success: 96464032
kstat.zfs.misc.zio_trim.bytes: 1062348283904
If I'm not mistaken, it has trimmed 989GB. The pool was created 1 week ago. Roughly 1TB in 1 week... It's going to kill those SSDs.
Can anyone help me pin down what is happening? Maybe dedup has problems with my config. Maybe TRIM is not functioning properly. Any ideas?
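The arithmetic behind that estimate can be checked with a short shell sketch. The two counter values are the ones pasted above; the variable names are made up for illustration, reading "bytes / success" as an average TRIM request size is an assumption about what the counters mean, and the daily rate assumes the pool is exactly 7 days old.

```shell
#!/bin/sh
# Values copied from the kstat output above.
trim_bytes=1062348283904
trim_success=96464032

# Whole GiB trimmed so far (1 GiB = 1073741824 bytes), integer math only.
trim_gib=$((trim_bytes / 1073741824))

# Average bytes per successful TRIM request (assumed interpretation).
avg_bytes=$((trim_bytes / trim_success))

# Rough daily rate, assuming the pool is exactly 7 days old.
gib_per_day=$((trim_gib / 7))

echo "trimmed: ${trim_gib} GiB"
echo "average request: ${avg_bytes} bytes"
echo "rate: ${gib_per_day} GiB/day"
```

The small average request size would be consistent with many small frees rather than a few large ones, but that is only a reading of the counters, not a diagnosis.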
Thanks in advance.
EDIT:
Can I disable TRIM entirely? I need to protect the SSDs from certain death!
Forgot a crucial part... FreeNAS 9.10u4
EDIT 2:
For troubleshooting I tried to disable TRIM by setting:
sysctl -w vfs.zfs.trim.enabled=0
But it said "Tunable values are set in /boot/loader.conf"
So I added that line to loader.conf, but it did not take effect. I also created a Tunable from the GUI, but that didn't apply either. When I query that sysctl I always get:
# sysctl vfs.zfs.trim.enabled
vfs.zfs.trim.enabled: 1
But now that the system has booted cleanly without iSCSI traffic (my servers are waiting for a manual rescan), I checked the status of both disks: they weren't issuing any delete commands, and "trim.bytes" wasn't growing.
I have ESXi 6.5 with VMFS 6, so I thought the automatic UNMAP operations might be causing the issue, and I disabled UNMAP (automatic space reclamation). But the moment I power up a VM, delete operations start rising again. It has been online for less than an hour with only 1 VM powered up, and the counter has already grown to 7GB (!)
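The growth of the counter can be turned into a rate by sampling it twice, which makes it easier to compare "no iSCSI traffic" against "one VM powered up". This is a minimal sketch: `trim_rate_mib_per_hour` is a hypothetical helper, the sample values in the example are made up, and the commented-out lines show how the samples might be taken from the same kstat counter on the FreeNAS box.

```shell
#!/bin/sh
# Hypothetical helper: growth of a byte counter between two samples,
# reported in MiB per hour.
# Arguments: first sample, second sample, seconds elapsed between them.
trim_rate_mib_per_hour() {
    delta=$(( $2 - $1 ))
    echo $(( delta * 3600 / $3 / 1048576 ))
}

# On the FreeNAS box the samples would come from the TRIM byte counter:
#   a=$(sysctl -n kstat.zfs.misc.zio_trim.bytes); sleep 60
#   b=$(sysctl -n kstat.zfs.misc.zio_trim.bytes)
#   trim_rate_mib_per_hour "$a" "$b" 60

# Example with made-up samples: 128 MiB of growth over 60 seconds.
rate=$(trim_rate_mib_per_hour 0 134217728 60)
echo "TRIM rate: ${rate} MiB/hour"
```

Running the sampling once with all VMs off and once with a single VM on would show whether the deletes track guest activity.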