Yep. I could have sworn you and I discussed this in PM.
It was just like the drop in speed I saw back in February, but that was with my entire pool (even dd tests showed crappy performance).
iSCSI is kind of a finicky thing due to a variety of issues, including block size issues, fragmentation, etc. As a pool ages, an iSCSI extent tends to become highly fragmented due to the need to rewrite inconveniently sized chunks. In the ZFS model, ARC and L2ARC are the answer to fragmentation (put the entire working set in ARC/L2ARC!), and/or "stop everything, make a fresh copy, then restart it" to forcibly reorganize the blocks on disk.
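If you want a sense of how much of your working set the ARC is actually absorbing, here is a rough check on a FreeBSD-based FreeNAS box (a sketch; the kstat names below are as found on FreeBSD and can vary between versions):

    # Rough ARC hit-ratio check; a working set that fits in ARC shows a high ratio.
    hits=$(sysctl -n kstat.zfs.misc.arcstats.hits)
    misses=$(sysctl -n kstat.zfs.misc.arcstats.misses)
    echo "ARC hit ratio: $(( hits * 100 / (hits + misses) ))%"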
I have long said that the normal "80%" pool fill limit for ZFS needs to be SUBSTANTIALLY less for iSCSI, 60% or less, in a potentially hopeless bid to allow the system to allocate adjacent chunks more reasonably. Put simply, if you write two things together, they might be related, and so you might luck out when reading them back too. This could at least slow the inevitable growth of fragmentation.
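Keeping an eye on the fill level is easy (hypothetical pool name "tank"):

    # The CAP column is the percentage of pool space allocated; for iSCSI
    # extents, start planning well before it crosses 60%.
    zpool list tank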
You can see the impressive damage done to throughput as a pool fills in Fig. 4 of this presentation. One might reasonably conclude from this that my 60% number is insanely optimistic, but I will point out that the use pattern being tested is rather closer to pathological than it is to average use. But it is very telling that the chart plunges from 6000 at 10% full to ~3000 at ~20% full.
I guess that I wouldn't expect to see a sudden and catastrophic performance falloff on an established pool without some significant trigger event, but it is possible that one happened without your being aware of it. For example, writing large portions of the disk, such as "make world" on a FreeBSD system, would be hell on iSCSI + ZFS. But your initial description of the problem still sounds like something else may be amiss.
You mentioned "...put the entire working set in ARC/L2ARC...", and SSDs are what typically get used for L2ARC. Can I just use SSDs directly as the data storage/zpool to solve this performance issue?
Based on the above comment:
- Should SSDs as the data storage/zpool help write performance (SSDs write faster than conventional HDDs)?
- Would I gain any performance if FreeNAS presents the data storage/zpool to ESXi using NFS (avoiding iSCSI)?
I'm having trouble coming up with a reason that a mirrored (not RAIDZ) SSD pool would not be equivalent to, or better than, L2ARC in practice.
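For reference, creating such a pool is a one-liner (hypothetical device names; substitute your own SSDs):

    # Create a mirrored (not RAIDZ) pool from two SSDs.
    zpool create ssdpool mirror ada1 ada2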
SSDs can potentially write faster than conventional HDDs. However, the ZFS transaction group write mechanism is still likely to be a bottleneck; choosing a good SLOG device is probably still required.
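For what it's worth, attaching a SLOG once you have a suitable device is simple (hypothetical device names below):

    # Add a dedicated log device to pool 'tank'...
    zpool add tank log da1
    # ...or, better, a mirrored pair so a SLOG failure can't bite you.
    zpool add tank log mirror da1 da2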
NFS and iSCSI are roughly equivalent when you look at the big picture as far as reading and writing blocks, which is fundamentally what ESXi is doing. They differ in implementation details.
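One of those implementation details is worth knowing: ESXi issues synchronous writes over NFS, while an iSCSI zvol left at the default sync=standard may not see the same sync pressure. To put the two on an equal footing safety-wise (and make both lean on the SLOG), you can force it (hypothetical zvol name):

    # Force synchronous semantics on the zvol backing the iSCSI extent.
    zfs set sync=always tank/vm-zvol
    zfs get sync tank/vm-zvol   # verify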
The remainder of your questions are answered elsewhere, including the two virtualization stickies.
Sounds like you are suggesting the following:
1) Assume a 500 GB SSD (good for read/write) for the zpool (non-RAID)
2) Assign just enough RAM for FreeNAS: 7 GB (6 GB ZFS baseline + 1 GB per TB of storage, rounded up for 0.5 TB)
3) Buy a SLOG device to improve sync-write performance
Please advise on what kind of SLOG device I should get:
- a simple SATA 6 Gb/s SSD (e.g. Samsung 840 Pro), or
- a dedicated PCIe SSD (e.g. OCZ RevoDrive 3 - PCI Express), or
- MUST it be one with a SUPERCAPACITOR for power-loss protection (e.g. the Samsung SM1625, as stated in the link below)?
http://www.samsung.com/global/business/semiconductor/Downloads/CeBIT_DRAM_SSD_Synopsis_Customers.pdf
It took me more than 45 minutes to move a 15 GB VM to the zpool (built on conventional HDDs) via NFS; that works out to under 6 MB/s.
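A quick way to confirm whether sync-write latency over NFS is the culprit (diagnostic only; hypothetical dataset name):

    # UNSAFE for data you care about; testing only. If the copy gets
    # dramatically faster with sync disabled, a proper SLOG will help.
    zfs set sync=disabled tank/nfs-vm
    # ... rerun the VM copy ...
    zfs set sync=standard tank/nfs-vm   # restore the default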