But right now this server exists between our tier1 and Tape backup, so I can afford to have some fun and experiment. The goal for this is Tier2, quick individual file/version restore without having to special request from Iron Mountain.
I appreciate the suggestion, but it seems like there are way more things to try before spending a truckload of money.
Also, I already bought the hardware, so I'm not sure how a solution that's sold with hardware will help me at this point.
FreeNas is said to have a great community of knowledgeable enthusiasts, so I'm checking that out first.
I had looked at iXsystems before deciding to try a "roll-your-own" approach to the problem.
So far my takeaways are:
1) The De-dupe cake is a lie
2) Get SSDs for L2ARC and logs immediately
3) Ignore CIFS as a share protocol
4) Maybe do or don't make additional datasets within RaidZ to improve performance
5) Just tried that thin-provisioned Zvol shared out over iSCSI (around the same 30MB/s moving data), so clearly a non-starter.
6) No obvious way to blink a failed drive enclosure to find a failed drive... so RaidZ repair steps are somewhat unknown (power down server and hunt for serial number?)
It seems to me you are hitting some of the issues involved in moving literally millions of small files. You need to be thinking about setting up so your IOPS can cover all the overhead of creating and managing those masses of files. That means MANY vdevs and a FAST slog. Worry about better use of space and z3 after you have performance minimums in hand.
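To put rough numbers on the "MANY vdevs" point: a common rule of thumb (an approximation, not a measurement) is that each vdev delivers roughly the random-write IOPS of one member disk, so pool IOPS scale with vdev count rather than drive count. A minimal sketch, assuming 36 drives and a hypothetical 100 IOPS per spinning disk; the layouts are illustrative only:

```python
# Rule-of-thumb model: random-write IOPS of a pool ~= number of vdevs * IOPS of one disk.
# PER_DISK_IOPS, TOTAL_DRIVES and the layouts are illustrative assumptions, not measurements.
PER_DISK_IOPS = 100
TOTAL_DRIVES = 36


def pool_iops(total_drives, drives_per_vdev, per_disk_iops=PER_DISK_IOPS):
    """Estimate pool random-write IOPS from the number of vdevs."""
    vdevs = total_drives // drives_per_vdev
    return vdevs * per_disk_iops


layouts = {
    "3 x 12-disk RAIDZ3": 12,
    "6 x 6-disk RAIDZ2": 6,
    "18 x 2-disk mirror": 2,
}

for name, width in layouts.items():
    print(f"{name}: ~{pool_iops(TOTAL_DRIVES, width)} random-write IOPS")
```

Under those assumptions the wide z3 layout is the slowest of the three by a large margin, which is why it hurts so much on a small-file workload.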
There are a few things that don't make sense at all. You are on a 10Gb network... but you haven't shown or verified network throughput. You haven't shown that the machine and network are even capable of moving data fast under best case scenarios. How fast does a large file move to the FreeNAS server? If you had a small striped SSD pool, i.e. the lowest latency/seek possible, and a reasonable slog... could you saturate 10GbE during a test? How about 500 MBps?
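For a sense of scale, here is a quick back-of-the-envelope conversion. It is only a sketch: it uses decimal units and ignores protocol overhead, so real-world ceilings will be somewhat lower. The 30 MB/s figure is the iSCSI result reported earlier in the thread and 500 MB/s is the test target suggested above.

```python
LINK_GBIT = 10  # nominal 10GbE line rate


def gbit_to_mbyte_per_s(gbit_per_s):
    """Convert Gbit/s to MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8


def link_utilization(observed_mb_s, link_gbit=LINK_GBIT):
    """Fraction of the nominal line rate that an observed throughput uses."""
    return observed_mb_s / gbit_to_mbyte_per_s(link_gbit)


print(f"10GbE line rate: {gbit_to_mbyte_per_s(LINK_GBIT):.0f} MB/s")
for label, rate in [("observed iSCSI copy", 30), ("suggested test target", 500)]:
    print(f"{label}: {rate} MB/s = {link_utilization(rate):.1%} of line rate")
```

At 30 MB/s the link is nearly idle, so the network itself is unlikely to be the limit unless something is badly misconfigured, and that is exactly what a large-file test would confirm or rule out.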
You have a real budget, and if high throughput on this brutal workload is viable, you may want to look at a fusion-io or similar. Your 20-50k options are utilizing those kind of advantages.
220.127.116.11 helped CIFS out somewhat with small files. Ignoring it doesn't make sense to me... but ymmv. Samba is single threaded, so your extra slow cores are not gonna help at all. It is going to top out early... but not this early. Even a Mini can do 350MBps (see cyberjock's review).
I think you are dead on wrt dedupe. Maybe it is viable... but on a sub 50k box there are gonna be issues. Offline versions ala windows are interesting, but don't seem very elegant or powerful.
No interface to your raid controller/ or blink, no hot-swap... welcome to bsd. ;)
Everyone moving many small files faces the same challenges, as does every platform. Though some systems lie their face off about what is actually written on disk and plough through anyway. ;) NFS with sync=disabled will let you be a reckless lying sob as well. Which on a tier 2 backup device may be appropriate? Your call.
Add a couple FAST devices to give it a fighting chance. Even a crucial m550, intel 3700, or samsung 850 pro will make a huge difference. You have the tools to make this thing scream, you just happened to throw the worst possible config for speed at it. z3 and dedupe = performance fail without $$$$$.
Good luck. No real numbers, and a lack of known good very fast setups for a backup workload, are just some of the things that get me ranting :) Truth is the guys with mega hardware just solve their problem or move on to an alternative. We never see what may or may not be excellent solutions. I can throw SSDs at it, or hack a BBU ala jgreco... but ZeusRAM or fusion-io are out of reach for me and most enthusiast types.
ZFS is multithreaded.
Samba4 is single threaded on a per-user and per-connection basis. (I think ignoring CIFS is going horribly overboard)
iSCSI has its own benefits and limitations in comparison to file sharing protocols.
Dedup is completely irresponsible in your application. The money spent on having enough RAM in the system to handle dedup will be way... wayyyyyyy more than you are going to save unless you are expecting block-level dedup to give you a 10-fold decrease (or more). Dedup just isn't feasible for the vast majority of users because the cost per GB of RAM versus TB of disk space is skewed VERY VERY much in the hard drive's favor. Talk to me when 32GB DIMMs are $50 and I'll tell you the opposite. ;)
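A rough way to see why the RAM bill gets out of hand is to estimate the size of the dedup table (DDT), which wants to live in memory. This is a sketch under stated assumptions: roughly 320 bytes per unique block is a commonly quoted ballpark rather than an exact figure, and the pool size and average block sizes below are hypothetical inputs.

```python
BYTES_PER_DDT_ENTRY = 320  # assumed ballpark per unique block, not an exact figure
TIB = 1024 ** 4
GIB = 1024 ** 3


def ddt_ram_bytes(pool_bytes, avg_block_bytes, bytes_per_entry=BYTES_PER_DDT_ENTRY):
    """Estimate dedup-table size if every block in the pool is unique."""
    entries = pool_bytes // avg_block_bytes
    return entries * bytes_per_entry


# Hypothetical 100 TiB of data at different average block sizes.
# Millions of small files push the average block size down and the table size up.
for block_kib in (128, 64, 16):
    ram = ddt_ram_bytes(100 * TIB, block_kib * 1024)
    print(f"avg block {block_kib:>3} KiB -> ~{ram / GIB:.0f} GiB of DDT")
```

Even with large blocks the table runs to hundreds of GiB at this scale, and a small-file workload makes it several times worse.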
Every protocol has limitations for small files. iSCSI will too.
I think you need to slow down and, instead of trying every setting to see what works and what doesn't, understand the technology. It's specific combinations that work very well. Everything else works like crap, and you probably won't hit that combo that does work. For a system of your size, if you want it to do what you want, you should either get a consultant to handle this, be ready to spend some serious time trying to learn this crap (it is NOT going to come fast), or just pay some company like iX to do the hard thinking for you.
There is a reason why people will gladly drop $75k+ on a system like this. You can't just do this in your garage on a Saturday afternoon like your last 5 gaming rigs. It takes deciding on proper hardware along with proper software settings to make it all work. If you can't put that all together on paper before you spend the money you are likely going to end up buying all this hardware and then wondering why it just can't perform.
In the end, without weeks and weeks to play and read and hack away, I ended up solving this problem for $500.
Out went FreeNas and SAS HBA, in went win2k12 and Supermicro/LSI 2308. Made 3 Raid6 arrays with 12 drives each, spanned them in Windows Storage Spaces. Getting between 4.7-5.5Gb/s over the 10GbE from my Tier1 storage cluster, around 500-650MB/s using SMB3>SMB3 connection.
Honestly, the gains on SMB3 over SMB2 or AFP are so great that this will probably be my only option. I was able to move 17+ TB onto the rig just today, and I have the option to enable Dedupe later if I feel like it or run out of space.
I'm still really excited about what next gen file systems have to offer, and I'm annoyed to be abandoning all that hashing and check-summing ZFS claims to offer, but I wasn't just fighting FreeNas+ZFS here. The LSI 9300-8i was giving me all kinds of problems with SAS/Slot mapping my enclosure. That P.O.S. had to go back.
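As a sanity check on those figures, the sketch below converts the reported link rates to MB/s and estimates how long 17 TB takes at the quoted throughput. It assumes decimal units (1 TB = 1,000,000 MB) and a sustained rate for the whole copy, which is an idealization.

```python
def gbit_to_mbyte_per_s(gbit_per_s):
    """Convert Gbit/s to MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8


def transfer_hours(terabytes, mb_per_s):
    """Hours needed to move `terabytes` at a sustained `mb_per_s`."""
    seconds = terabytes * 1_000_000 / mb_per_s
    return seconds / 3600


for gbit in (4.7, 5.5):
    print(f"{gbit} Gb/s on the wire = {gbit_to_mbyte_per_s(gbit):.1f} MB/s")

for rate in (500, 650):
    print(f"17 TB at {rate} MB/s = {transfer_hours(17, rate):.1f} hours")
```

Under those assumptions 17 TB fits comfortably inside a single working day, which lines up with the "just today" claim.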
Anyways, this is in no way to degrade FreeNas or the community, I just ran out of time. And I totally understand that real storage costs real money. My other car is a $250k Mediagrid :)
Hopefully by presenting them with a 120TB at $6000, they'll let me grab a Spectra Logic T200 next spring.
Hey, there's no shame in not using FreeNAS. Obviously we'd prefer everyone have all the time and money they need to do FreeNAS. But fantasy isn't reality.
We've also had plenty of people that just aren't smart enough to figure out FreeNAS. Again, no shame if/when that happens to people. I couldn't rebuild a car engine for my life, and I wouldn't want to teach my car mechanic how to be a pro at FreeNAS. :)
Heh. I still default there myself, Robert. It's no small feat to escape the mighty clutches of Microsoft. ;) I keep tryin', but rarely get any closer. Gotta admit I'm smitten with zfs, but still haven't bet hard and deep with client money or data. One day.
Haha, although I can appreciate your subtle dig at my intelligence and attempt to start some kind of flame exchange, @cyberjock, I can assure you that the notions of "think", "smart", or "R&D" don't really come into play when one's job is to herd the pack of cats known as photographers and video editors. It's pretty much all firefighting, 24/7.
I wasn't subtly digging at your intelligence. In fact, I wasn't being subtle or direct at all. I was simply stating that for some people that is the case. I will not, under any circumstances, attempt to determine what you are or aren't capable of because I don't have enough information to actually argue the "deep down" reason.
The only thing I actually do know is you tried FreeNAS and went with a MS product.
I didn't assume anything when I made my comment, and you shouldn't assume I said anything either.
In fact, someone could argue that you yourself are trying to start a flame exchange by assuming I said anything. If you've read this forum at all you'd know that I don't do anything subtle. If I think you're an idiot, I'd call you out on it and tell you that you are an idiot, and I'd do it directly to your face.
Maybe you should be a little less offended by comments that aren't even directed at you by name.