It seems to me you are hitting some of the issues involved in moving literally millions of small files. You need to set things up so your IOPS can cover all the overhead of creating and managing those masses of files. That means MANY vdevs and a FAST SLOG. Worry about better use of space and RAIDZ3 after you have performance minimums in hand.
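To put rough numbers on the "many vdevs" point, here is a back-of-envelope sketch in Python. It assumes the usual rule of thumb that a RAIDZ vdev delivers about the random IOPS of a single member disk. The 150 IOPS per vdev, 4 operations per file, and 5 million files are made-up illustrative figures, not measurements from your box.

```python
def estimate_transfer_hours(num_files, ops_per_file, iops_per_vdev, num_vdevs):
    """Rough hours to write num_files when the pool is purely IOPS-bound.

    All inputs are assumptions you supply; this ignores caching, the SLOG,
    network limits, and anything else that is not raw pool IOPS.
    """
    if min(num_files, ops_per_file, iops_per_vdev, num_vdevs) <= 0:
        raise ValueError("all inputs must be positive")
    total_ops = num_files * ops_per_file      # data + metadata operations
    pool_iops = iops_per_vdev * num_vdevs     # vdevs add up, disks in a vdev do not
    return total_ops / pool_iops / 3600.0     # seconds -> hours


# Illustrative only: 5 million files, 4 ops each, ~150 IOPS per vdev (assumed).
for vdevs in (1, 4, 8):
    hours = estimate_transfer_hours(5_000_000, 4, 150, vdevs)
    print(f"{vdevs} vdev(s): ~{hours:.1f} hours")
```

The absolute numbers will be wrong for any real system, but the scaling is the point: under this model, doubling the vdev count halves the time, while widening a single RAIDZ3 vdev changes nothing.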
A few things don't make sense at all. You are on a 10Gb network... but you haven't shown or verified network throughput. You haven't shown that the machine and network are even capable of moving data fast under best-case scenarios. How fast does a large file move to the FreeNAS server? If you had a small striped SSD pool, i.e. the lowest latency/seek possible, and a reasonable SLOG... could you saturate 10GbE during a test? How about 500 MB/s?
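If you want a quick number for the large-file question, something like the sketch below would do, run from the client against the mounted share. The function name and test file name are mine, and it only measures a single sequential write stream, so treat it as a sanity check rather than a benchmark. It writes random data so compression on the dataset can't flatter the result.

```python
import os
import time

# 10 Gbit/s line rate in MB/s (decimal), before any protocol overhead.
TEN_GBE_MB_PER_S = 10_000 / 8


def measure_write_mb_per_s(directory, size_mb=1024, chunk_mb=1):
    """Write one large file into `directory` and return the rate in MB/s.

    `directory` is assumed to be the mount point of the share under test.
    The test file is removed afterwards.
    """
    chunk = os.urandom(chunk_mb * 1_000_000)   # incompressible payload
    chunks = size_mb // chunk_mb
    path = os.path.join(directory, "throughput_test.bin")
    try:
        start = time.perf_counter()
        with open(path, "wb") as f:
            for _ in range(chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())               # don't stop the clock on cached data
        elapsed = time.perf_counter() - start
    finally:
        if os.path.exists(path):
            os.remove(path)
    if elapsed <= 0:
        raise RuntimeError("timer resolution too coarse for this file size")
    return chunks * chunk_mb / elapsed
```

Call it with the path where the share is mounted on your client and compare the result against `TEN_GBE_MB_PER_S` (1250 MB/s on paper). If a single big file can't get anywhere near that, the small-file numbers are not the first thing to chase.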
You have a real budget, and if high throughput on this brutal workload is viable, you may want to look at a Fusion-io or similar. Your 20-50k options are utilizing those kinds of advantages.
9.2.1.7 helped CIFS out somewhat with small files. Ignoring it doesn't make sense to me... but ymmv. Samba is single threaded, so your extra slow cores are not gonna help at all. It is going to top out early... but not this early. Even a Mini can do 350 MB/s (see cyberjock's review).
I think you are dead on wrt dedupe. Maybe it is viable... but on a sub-50k box there are gonna be issues. Offline versions a la Windows are interesting, but don't seem very elegant or powerful.
No interface to your RAID controller, no drive blink, no hot-swap... welcome to BSD. ;)
Everyone moving many small files faces the same challenges, on every platform. Though some systems lie their face off about what is actually written to disk and plough through anyway. ;) NFS with sync=disabled will let you be a reckless lying sob as well. On a tier 2 backup device that may be appropriate. Your call.
Add a couple of FAST devices to give it a fighting chance. Even a Crucial M550, Intel 3700, or Samsung 850 Pro will make a huge difference. You have the tools to make this thing scream; you just happened to throw the worst possible config for speed at it. RAIDZ3 and dedupe = performance fail without $$$$$.
Good luck. No real numbers, and a lack of known-good, very fast setups for a backup workload, are just a couple of the things that get me ranting :) Truth is, the guys with mega hardware just solve their problem or move on to an alternative. We never see what may or may not be excellent solutions. I can throw SSDs at it, or hack a BBU a la jgreco... but ZeusRAM or Fusion-io are out of reach for me and most enthusiast types.