JR Arseneau
Cadet
Joined: Apr 8, 2014
Messages: 4
Hi all,
I am by no means new to the world of storage. Having worked professionally in this environment for a while and having run my own NAS at home, I feel pretty comfortable navigating these waters. I've used many enterprise-grade systems but I'm relatively new to ZFS (although I've pored over the literature). In fact, at the office a few years back, I led the initiative to implement an Isilon scale-out NAS solution, which has been purring along since 2010.
I know there are a bunch of threads about various configuration paradigms, but I wanted to throw my hat into the ring and get some honest-to-goodness feedback from the resident experts here. So here we go:
Goal
My goal is to create a NAS that will grow as my data needs grow. Much like an Isilon (for those familiar with it), where you can add an extra storage node (a node being, for example, a 2U server with 12TB of space that gets pooled into the rest of the cluster, which still presents one huge block of contiguous space), I will periodically need to increase the capacity of this NAS. With the kind of space I will be dealing with, it will be difficult (or extremely costly from a monetary point of view) to "copy the data, destroy the zpool, rebuild the zpool and copy the data back".
History
- 2006: First home NAS running under Gentoo with mdadm RAID5 (500GB disks). I expanded once or twice by adding a 500GB disk and growing the RAID5. Capacity: 2.5TB
- 2009: Decided I wanted to mix and match disks as I grew (to save costs), so I migrated the NAS to UnRAID. Capacity: 6TB
- 2012: Unhappy with UnRAID's flaky performance, slow updates, inadequate plugin architecture and various other unpleasantness, I decided to move to Linux + FlexRAID. Capacity: 10TB
- 2013 (Dec): Unhappy with FlexRAID (it crashed A LOT under Linux), after a few drive failures where recovery never fully completed once I replaced the drive, and given the author's decision to focus on Windows development first and treat Linux as a second-class citizen for his upcoming tRAID solution, I moved from FlexRAID to SnapRAID + mhddfs for pooling the various disks. Capacity: 18TB (it grew significantly in a year because I bought extra disks thinking I'd use ZFS, but I got a bit worried, so I settled on SnapRAID temporarily)
Requirements
My requirements aren't many, I don't think they're unique, but here they are in order of priority.
- A contiguous block of space without having to deal with various partitions/zpools/etc.
- Ability to grow the contiguous space as capacity is needed.
- Bitrot protection (this has bit me in the ass a few times, especially with UnRAID and FlexRAID)
- Ability to replace smaller drives with bigger ones. In other words, to meet #2, I don't want to end up with 30-40 drives. I had 4x1.5TB drives before, but they are no longer in use; I replaced them with 3TB drives last year (copied the data from the 1.5TB drives to the new 3TB drives and that was it). I've sketched my understanding of the relevant commands right after this list.
- Stability and maturity - it has to work. I've wasted so much time tinkering with mdadm, UnRAID, FlexRAID, SnapRAID, oh my!
- Speed isn't so much a factor; if I can saturate a GbE link, that's good enough for me.
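To check my understanding of #3 and #4: as far as I can tell from the docs, a scrub handles the bitrot detection/repair, and replacing every disk in a vdev (with autoexpand on) grows it in place. Something like this, where "tank" and the device names are just placeholders I made up:

zpool scrub tank                 # walk every block and verify/repair checksums
zpool set autoexpand=on tank     # let a vdev grow once ALL its members are bigger
zpool replace tank ada2 ada6     # swap a 1.5TB disk for a 3TB one, then resilver
zpool status tank                # watch the resilver progress

Please correct me if I've got any of that wrong.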
Questions on Configuration with ZFS
Sorry for being so long-winded, but I've now gotten to the part where people here can help. Hopefully some of you are in the same boat as I am. I feel my situation is getting harder to manage as my space grows (which is why SnapRAID + mhddfs is appealing, but flaky). If I create a ZFS zpool of 18-24TB, I am most likely stuck with that for a while. It will not be feasible for me to find 18-24TB of "extra" space to copy my data off and rebuild the zpool.
I use Crashplan (cloud version) with a custom 448-bit encryption key and I back up my entire NAS to the cloud. Yes, currently, my entire 13TB of used space is backed up there. So I think I'm good for backups, but obviously, my ISP would probably shit a brick if I downloaded (or attempted to) 13TB in a month, let alone the overage costs.
- Solution 1: Zpool composed of multiple mirrored vdevs. There is a lot of wasted space, and every time you want to expand it costs you two drives. However, you can (I believe, someone please confirm) "swap out" (expand) each mirrored vdev with bigger disks. So if one of the vdevs in the zpool is 2x3TB, I think ZFS allows me to replace one of those disks with a 4TB, wait for the resilver, then replace the other one with a 4TB, resilver again, and by the end I'll have gained 1TB of space (sketched below). This solution doesn't provide the most IOPS, but it should be able to max a GbE link. There is of course still a risk that both drives in a vdev could fail, taking the entire zpool with them. I think this would be rare (someone can correct me), and because it's a mirror, the resilver should be much quicker.
- Solution 2: Zpool composed of multiple RAIDZ2 vdevs. This means less "wasted" space and more redundancy. However, expanding means adding (if I want to be consistent, seeing as you can't convert a mirrored or RAIDZ1 vdev to RAIDZ2) a minimum of 4, or ideally 6, extra disks. So if I have a zpool of 6x3TB (~12TB usable), I'd have to buy 6 more drives to expand (at current pricing of approx. $140 per 3TB drive, that's $840 to expand - ouch). In a few years, as the 3TB drives get older, I could replace them with 5TB or 6TB drives, but again, the expansion isn't amortized; it comes in large chunks. (Also sketched below.)
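To make Solution 1 concrete, here's the expansion dance as I understand it (pool and device names are invented for illustration; please correct me if the mechanics are wrong):

# start with a pool of striped mirrors
zpool create tank mirror ada0 ada1 mirror ada2 ada3
# later, grow one mirror from 2x3TB to 2x4TB, one disk at a time
zpool set autoexpand=on tank
zpool replace tank ada2 ada4     # first 4TB disk, wait for resilver
zpool replace tank ada3 ada5     # second 4TB disk, resilver again
# once both members are 4TB, the vdev (and the pool) gains 1TB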
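And Solution 2's expansion, if I have it right, is just bolting another whole RAIDZ2 vdev onto the pool (again, names are placeholders):

# initial pool: one 6x3TB RAIDZ2 vdev (~12TB usable)
zpool create tank raidz2 ada0 ada1 ada2 ada3 ada4 ada5
# the $840 expansion: a second 6-disk RAIDZ2 vdev added in one chunk
zpool add tank raidz2 ada6 ada7 ada8 ada9 ada10 ada11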
Any experts care to chime in? Am I missing something obvious here? Is ZFS possibly the wrong solution for what I'm trying to do? I think were BTRFS more mature, it would be a better "fit" for me. But its RAID5/6 support is experimental, likely won't make it into a "stable" kernel this year, and even when it does go stable, it won't have the maturity ZFS has. I also like how ZFS doesn't really have the concept of partitions and lets me change the configuration (dedupe, compression) per dataset.
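For example, this kind of per-dataset tuning is exactly what appeals to me (dataset names made up for the sake of the example):

zfs create tank/media
zfs create tank/backups
zfs set compression=lz4 tank/media    # compress only where it helps
zfs set dedup=on tank/backups         # dedupe as a per-dataset property, not forced pool-wide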
Many thanks!
Cheers,
JR