Yeah this is what I suspected. Thanks for confirming it. Sounds like heavy iSCSI write loads for a datastore are not a good fit for ZFS?
They're fine, but they have to be resourced appropriately, which is something most people are unwilling to do.
Aren't writes cached pending commit to disk? I assumed all writes were stored in RAM and spooled to disk (i.e. "write back") as quickly as the disk could take it, with I/O performance falling off after the RAM is full.
Well, it's limited to two transaction groups' worth of dirty data. The other thing is that you are heavily dependent on pool occupancy and fragmentation. ZFS likes to allocate space contiguously, so if there is lots of free space, write speeds will be good; otherwise it has to hunt for runs of free blocks on disk, which can be an intensive process.
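The transaction-group limit behaves like a double buffer: new writes accumulate in the open group while the previous one syncs to disk, and once the dirty-data budget is exhausted, writers stall until a sync completes. Here's a toy model of that behavior; the class, sizes, and limit are invented for illustration and are not ZFS internals or tunables:

```python
# Toy model of a two-transaction-group write pipeline.
# Illustrative only: the limit and names are made up, not ZFS tunables.

DIRTY_LIMIT = 100  # max outstanding dirty data (arbitrary units)

class TxgPipeline:
    def __init__(self):
        self.open_txg = 0        # dirty data accumulating in the open txg
        self.syncing_txg = 0     # data currently being flushed to disk
        self.stalled_writes = 0  # writes delayed because the budget is full

    def write(self, size):
        """Accept a write into the open txg, or stall if dirty data is maxed."""
        if self.open_txg + self.syncing_txg + size > DIRTY_LIMIT:
            self.stalled_writes += 1  # a real system would throttle the writer
            return False
        self.open_txg += size
        return True

    def sync(self):
        """Rotate: the open txg becomes the syncing txg and flushes out."""
        self.syncing_txg = self.open_txg
        self.open_txg = 0
        # ... disk I/O happens here; slow disks keep syncing_txg full longer
        self.syncing_txg = 0  # flush complete

pipe = TxgPipeline()
for _ in range(20):
    pipe.write(10)          # the first 100 units fit; the rest stall
print(pipe.stalled_writes)  # -> 10 stalled until a sync rotates the groups
```

The point of the model: once the disk can't drain the syncing group as fast as writes arrive, RAM stops helping and write latency becomes disk-bound.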
I read the above article and it neatly lays out the problem. Now it all makes sense, and more or less cements the idea that using ZFS as an iSCSI SAN for a VMware datastore is probably not the best idea.
Well, it *can* be totally awesome. The thing is that you have to give it lots of resources. If you keep your occupancy rate low, say 10-25%, and have enough RAM and L2ARC to cache your working set, you will have a tough time distinguishing your hard-drive-based pool from an SSD datastore. You still end up with lots of fragmentation, an unavoidable thing on a copy-on-write (CoW) filesystem, but the low occupancy tends to keep writes zippy, and the large ARC/L2ARC keeps important reads zippy.
The problem here is that you need to burn resources to do this. For example, a NAS with 24 x 2TB HDDs, to stay redundant even after a disk failure, needs three-way mirrors, so you can have eight 2TB three-way mirrors, for a 16TB pool. However, if you follow the sizing I've suggested, you can really only use 1.6TB-4TB of it and get massively awesome write performance. Couple that with 256GB of RAM and a 1TB L2ARC and you will get great read performance too.
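The arithmetic above is easy to sanity-check. A quick sketch, using only the figures from this example (24 disks, three-way mirrors, 10-25% occupancy):

```python
# Sanity-check the sizing example: 24 x 2TB disks in three-way mirrors,
# kept at 10-25% occupancy for good CoW write performance.

disks = 24
disk_tb = 2
mirror_width = 3  # three-way mirrors stay redundant after one disk fails

vdevs = disks // mirror_width  # 8 mirror vdevs
pool_tb = vdevs * disk_tb      # each mirror contributes one disk's capacity

low, high = 0.10, 0.25         # target occupancy band
usable_low = pool_tb * low
usable_high = pool_tb * high

print(f"{pool_tb} TB pool, use {usable_low:.1f}-{usable_high:.1f} TB of it")
# -> 16 TB pool, use 1.6-4.0 TB of it
```

So 48TB of raw disk buys you 16TB of pool, of which you'd deliberately use only a tenth to a quarter. That's the resource burn in concrete terms.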
This is all just compsci trickery, trading one thing for another. As SSDs have gotten cheaper, it may not make quite as much sense to throw large amounts of hard disk at the problem just to gain enough free space to tackle the CoW space-allocation issue, but SSDs have their own quirks too.
Which raises another question: given that NFS is a file-based protocol, would you expect better performance there? It kind of sounds like it should, but I haven't tested it yet.
You still have the fundamental problem of sitting on a CoW filesystem. If you use NFS for VM virtual disks, there is still a lot of complexity and block rewriting going on within the files, but at least ZFS understands some of the moving bits a little better, which can be a plus. On the other hand, iSCSI supports UNMAP (think: TRIM), which gives the storage layer much the same ability to learn what data is no longer needed, and that is one of the important variables when storing random-access data like VMs or databases.
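Why UNMAP matters is easy to model: without it, the backend has no idea which guest blocks have been freed, so a CoW pool keeps treating them as live data it must preserve and copy around. A toy illustration in pure bookkeeping terms; the class and block counts are made up, not real SCSI or ZFS code:

```python
# Toy model: why UNMAP/TRIM matters for a CoW backend.
# The guest frees blocks, but without UNMAP the backend can't reclaim them.

class Backend:
    def __init__(self):
        self.allocated = set()  # blocks the backend still considers live

    def write(self, block):
        self.allocated.add(block)

    def unmap(self, blocks):
        """What iSCSI UNMAP (or TRIM) lets the initiator communicate."""
        self.allocated -= set(blocks)

guest_live = set()
backend = Backend()

for b in range(100):       # guest writes 100 blocks
    backend.write(b)
    guest_live.add(b)

for b in range(80):        # guest deletes 80 of them
    guest_live.discard(b)

# Without UNMAP the backend still holds all 100 blocks,
# even though only 20 are live in the guest:
print(len(backend.allocated))   # -> 100

backend.unmap(range(80))        # with UNMAP, the backend can free them
print(len(backend.allocated))   # -> 20, matching the guest's view
```

On a nearly-full CoW pool, those 80 phantom blocks are exactly the free space the allocator needed, which is why UNMAP support can matter more than the file-vs-block protocol question.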