Nice read. He seems to make a valiant effort to do a thorough job, but there are a few things I don't like about the post:
1. He uses bonnie++ but doesn't provide the actual parameters he used.
2. Putting the L2ARC and SLOG on the same device is just stupid. The L2ARC is designed to be used only when the disk isn't busy with other activity. Surprise! An SLOG, by its very nature and if used properly (which is the only time you should have an SLOG at all), keeps the device busy with sync writes, which means you're locking out the L2ARC. This is a noob mistake and pretty much proves he's not the expert ZFS user his write-up makes him sound like.
3. He's running RAIDZ2. Most people doing iSCSI are, by the nature of iSCSI, going to be IOPS bottlenecked, which means anyone who understands ZFS would go with mirrors and not RAIDZ(x). So he literally tested a configuration that is unlikely to ever reflect real-world values, except for people who don't really understand ZFS (or who think they are going to save money by going with RAIDZ2 over mirrors).
4. Things like block size and whether the target is file-based or zvol-based have a TREMENDOUS impact on throughput. zvol is a guaranteed win over file-based when doing benchmarks, yet he made no mention of which he used.
5. We know from experience that pretty much no SSD out there, except the most expensive Samsungs and the Intel S3700/330, makes a great SLOG. He used a Corsair. :(
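To put rough numbers on point 3, here is a back-of-the-envelope sketch in Python. It assumes the usual rule of thumb that a RAIDZ vdev delivers roughly the random IOPS of a single disk, while a mirror vdev delivers about one disk's worth of write IOPS and up to one disk's worth of read IOPS per member. The six-disk pool and the 100 IOPS per disk figure are hypothetical (the post doesn't give his disk count), and real pools will deviate from this with caching, record size and workload.

```python
from dataclasses import dataclass


@dataclass
class PoolLayout:
    name: str
    vdevs: int            # number of top-level vdevs in the pool
    disks_per_vdev: int   # disks in each vdev
    is_mirror: bool       # True for mirror vdevs, False for RAIDZ(x)


def estimate_random_iops(layout: PoolLayout, disk_iops: int) -> tuple[int, int]:
    """Return (read_iops, write_iops) under the rule of thumb above."""
    # Writes: every vdev, mirror or RAIDZ, behaves like roughly one disk.
    write_iops = layout.vdevs * disk_iops
    if layout.is_mirror:
        # Reads can be served by any member of a mirror.
        read_iops = layout.vdevs * layout.disks_per_vdev * disk_iops
    else:
        # A RAIDZ vdev touches all its disks per block, so roughly one disk's worth.
        read_iops = layout.vdevs * disk_iops
    return read_iops, write_iops


DISK_IOPS = 100  # assumed figure for a single spinning disk

raidz2 = PoolLayout("1 x 6-disk RAIDZ2", vdevs=1, disks_per_vdev=6, is_mirror=False)
mirrors = PoolLayout("3 x 2-way mirrors", vdevs=3, disks_per_vdev=2, is_mirror=True)

for layout in (raidz2, mirrors):
    reads, writes = estimate_random_iops(layout, DISK_IOPS)
    print(f"{layout.name}: ~{reads} read IOPS, ~{writes} write IOPS")
```

With the same six disks, the estimate comes out around three times the write IOPS and six times the read IOPS for mirrors, which is why the layout matters so much for an iSCSI workload.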
Overall, it seems like he tried really hard, but he's still missing so many fundamental aspects of ZFS that he really didn't prove a thing: his configuration would never be used in a production environment, except by someone who doesn't know any better. Unfortunately, this also invalidates all of his results, because a given percent improvement on a non-standard, improper configuration means absolutely nothing.
Very well written by him though and I give him props for trying.
Just for the record, I don't try to benchmark ZFS any more than I have to because it is a disaster if you aren't a ZFS "God" (yes, with a capital-G). I don't consider myself to be that level yet. I can find obvious flaws in benchmarking, but I bet if I did my own benchmarking I'd probably screw it up too.
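On points 1 and 4, the kind of thing I'd want published next to any numbers is a record like the one below. This is only a sketch of how one might capture it in Python: the `BonnieRun` class, the path, the sizes and the block size are all made up for illustration, and the bonnie++ flags (`-d` directory, `-s` file size in MiB, `-r` RAM size in MiB, `-u` user) are as I remember them from the man page, so check them against your version.

```python
import json
import shlex
from dataclasses import asdict, dataclass


@dataclass
class BonnieRun:
    """Hypothetical record of one benchmark run and the storage behind it."""
    directory: str      # where bonnie++ writes its test files
    file_size_mib: int  # should be well above RAM so the ARC can't hide the disks
    ram_mib: int
    user: str
    backing: str        # "zvol" or "file"
    block_size: str     # volblocksize or recordsize of the backing store

    def command(self) -> list[str]:
        # Build the exact argument list so it can be published with the results.
        return [
            "bonnie++",
            "-d", self.directory,
            "-s", str(self.file_size_mib),
            "-r", str(self.ram_mib),
            "-u", self.user,
        ]

    def record(self) -> str:
        # One JSON line per run: the command plus the settings that skew throughput.
        data = {"command": shlex.join(self.command()), **asdict(self)}
        return json.dumps(data, sort_keys=True)


run = BonnieRun(
    directory="/mnt/iscsi-test",  # hypothetical mount point
    file_size_mib=32768,
    ram_mib=16384,
    user="root",
    backing="zvol",
    block_size="16K",
)
print(run.record())
```

Anyone reading the results can then see the exact command line and whether a zvol or a file sat behind the target, instead of guessing.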