Question / Sanity check on new build

madbuda

Cadet
Joined
Apr 10, 2019
Messages
4
Plan:
Slowly grow my primary storage by replacing drives with larger disks, then moving the old drives into the backup server.

My primary server will be an 11 disk stripe (have 12 bays, need an open bay for expansion or maybe a cold spare).
The backup server will have 24 disks in 4 x 6-disk raidz vdevs.

Rationale:
Growth of a stripe can be accomplished by replacing individual disks, no need to purchase en masse.
My backup storage, on the other hand, will only grow after all 6 disks in a vdev have been upgraded.

My thought is as I grow my primary storage I will continually have hand me downs for the backup array.

So my question is, of course: is this feasible? Am I missing something important here?

I spun up a VM and did some testing with replacing disks in a stripe, and it seems to work just fine. I've grown a vdev over time before by replacing individual disks without issues.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
My primary server will be an 11 disk stripe
Why? ZFS can't do error correction without redundancy.

The plan you have is only horrible in that you can end up with corrupt data and no way to recover from it. Also, a single disk failure can destroy the contents of the entire pool.
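To put a rough number on the single-disk-failure risk, here is a minimal sketch. The 2% annual failure rate is an assumed, illustrative figure (not a measured one), and the model assumes disks fail independently, which real drives from the same batch may not.

```python
def stripe_loss_probability(disks: int, annual_failure_rate: float) -> float:
    """Chance that at least one disk fails within a year.

    In a stripe with no redundancy, any single disk failure loses the pool,
    so this is also the chance of losing the whole pool in that year.
    Assumes independent failures (a simplification).
    """
    return 1.0 - (1.0 - annual_failure_rate) ** disks


# ASSUMPTION: 2% annual failure rate per disk, chosen only for illustration.
ASSUMED_AFR = 0.02

for disk_count in (1, 11):
    risk = stripe_loss_probability(disk_count, ASSUMED_AFR)
    print(f"{disk_count:2d}-disk stripe: {risk:.1%} chance of pool loss per year")
```

The point is only that the risk scales with the number of disks in the stripe; the real figure depends on the drives.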
 

madbuda

Cadet
Joined
Apr 10, 2019
Messages
4
Why? ZFS can't do error correction without redundancy.
The data is easily replaceable, either from the backup array or offsite, so the primary is built with expansion in mind, not redundancy.

Could you expand on "ZFS can't do error correction without redundancy"?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
could you expand on "ZFS can't do error correction without redundancy"
ZFS does a checksum for all data committed to the array. It uses that to determine if the data being read from the array matches what was written. If the checksum does not match, it can reconstruct the data, but only if you have redundancy. Without redundancy, you can't recover from that. There are many disk errors that can cause data errors, such as bad sectors, where the disk is just not giving back the data it was entrusted with. A catastrophic disk failure is not the only reason for redundancy. If your data becomes corrupt, and that gets copied to the backup, you then have no backup and no primary. The primary data store is more important than the backup. Your strategy is not a good one.
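The difference between detecting and repairing can be sketched in a few lines. This is a toy model, not how ZFS is implemented: SHA-256 stands in for whatever checksum the pool uses, and `read_block` and `UnrecoverableError` are hypothetical names made up for the illustration.

```python
import hashlib
from typing import Sequence


class UnrecoverableError(Exception):
    """Raised when corruption is detected but no good copy exists."""


def checksum(block: bytes) -> str:
    # ASSUMPTION: SHA-256 is a stand-in for the pool's checksum algorithm.
    return hashlib.sha256(block).hexdigest()


def read_block(copies: Sequence[bytes], expected: str) -> bytes:
    """Return the first copy whose checksum matches the one stored at write time."""
    for data in copies:
        if checksum(data) == expected:
            return data
    raise UnrecoverableError("checksum mismatch and no good copy to repair from")


original = b"family photos"
expected = checksum(original)   # recorded when the data was written
corrupt = b"family phot0s"      # what a bad sector might hand back

# With redundancy: the bad copy is detected and the good copy is returned.
print(read_block([corrupt, original], expected))

# Without redundancy: the corruption is still detected, but cannot be repaired.
try:
    read_block([corrupt], expected)
except UnrecoverableError as err:
    print("stripe:", err)
```

Either way the bad read is caught; only the layout with a second copy can hand back the right data.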
 

madbuda

Cadet
Joined
Apr 10, 2019
Messages
4
That was my missing bit of information; I didn't realize there were no checksums without redundancy. (Glad I came here and asked.)

Maybe I should flip my approach here.

In my current setup, I am using the 24-disk array as a backup for a 12-disk BTRFS stripe. It sounds like I would be better off making this my primary and running a 12-disk raidz2 as my backup. Slower to grow, but I don't lose out on the benefit of moving to ZFS in the first place.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I didn't realize there were no checksums without redundancy.
It still records the checksum data, but there is no data to use for making the correction unless there is some redundancy. The 'extra' copy of the data that is used for a rebuild to replace a failed disk and the 'extra' copy that is used to repair a data inconsistency are one and the same.
If you don't have redundancy, there is no 'extra' copy of data to do any recovery from.
In my current setup, I am using the 24-disk array as a backup for a 12-disk BTRFS stripe. It sounds like I would be better off making this my primary and running a 12-disk raidz2 as my backup. Slower to grow, but I don't lose out on the benefit of moving to ZFS in the first place.
This is confusing to me. FreeNAS doesn't do BTRFS, so I don't know what that is in reference to. I use a 12-disk pool in my FreeNAS server, made up of two vdevs of 6 drives each. You can grow the pool by changing six drives in one vdev instead of needing to replace all 12 drives. I have been through that process multiple times: I started with one vdev of 1TB drives, then added a second vdev, then replaced the drives in the first vdev with 2TB drives, then replaced the drives in the second vdev with 2TB drives... and so on... Now I have 4TB drives, and I am looking at a time in the next couple of years when I will switch to either 8TB or 10TB drives. It is a bit slower going and takes more planning, but ZFS is more dependable storage than anything else out there. Where I work, I manage several servers that collectively have more than 1PB of data on ZFS, and I wouldn't want to do it any other way.
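A rough sketch of why the growth comes in steps of six: a raidz vdev can only use as much of each drive as its smallest member offers, so the extra space shows up when the last drive in the vdev has been swapped. The parity level of 2 is an assumption for the example (the post does not say which raidz level is in use), and the arithmetic ignores metadata and padding overhead.

```python
from typing import List


def raidz_vdev_usable_tb(drive_sizes_tb: List[float], parity: int) -> float:
    """Approximate usable space of one raidz vdev.

    Every drive contributes only as much as the smallest drive in the vdev,
    and `parity` drives' worth of space goes to redundancy.
    """
    return (len(drive_sizes_tb) - parity) * min(drive_sizes_tb)


# ASSUMPTION: a 6-drive raidz2 vdev going from 1TB to 2TB drives.
PARITY = 2
vdev = [1.0] * 6

print(f"start: {vdev} -> {raidz_vdev_usable_tb(vdev, PARITY)} TB usable")
for slot in range(len(vdev)):
    vdev[slot] = 2.0  # replace one drive and let it resilver
    usable = raidz_vdev_usable_tb(vdev, PARITY)
    print(f"after replacing drive {slot + 1}: {vdev} -> {usable} TB usable")
```

Usable space stays flat through the first five replacements and only jumps on the sixth.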
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
This is confusing to me. FreeNAS doesn't do BTRFS, so I don't know what that is in reference to.
I believe he's using it as a reference for his current system (a 12-drive BTRFS stripe).

I would agree with the intention to make the 24-drive setup your primary storage (configured as 12x2-way mirrors) and then use the 12-drive Z2 as a backup target. Assuming all drives are the same size, you'd end up with 12 drives' worth of space in the primary and 10 drives' worth in the backup, although a 12-drive vdev is pretty much at the limit of my comfort zone. 2x6-drive Z2 vdevs might be better from a write speed and expansion perspective.
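The drive-count arithmetic for those layouts can be checked with a small sketch. It counts usable space in "drives' worth", assumes all drives are the same size, and ignores filesystem overhead; the function names are made up for the illustration.

```python
def mirror_usable_drives(total_drives: int, way: int) -> int:
    """Usable space of a pool of N-way mirrors, in drives' worth."""
    return total_drives // way


def raidz_usable_drives(vdevs: int, width: int, parity: int) -> int:
    """Usable space of a pool of identical raidz vdevs, in drives' worth."""
    return vdevs * (width - parity)


layouts = {
    "24 drives as 12 x 2-way mirrors": mirror_usable_drives(24, 2),
    "12 drives as 1 x 12-wide Z2": raidz_usable_drives(1, 12, 2),
    "12 drives as 2 x 6-wide Z2": raidz_usable_drives(2, 6, 2),
}

for name, usable in layouts.items():
    print(f"{name}: {usable} drives' worth of space")
```

The 2x6 Z2 layout gives up two more drives to parity than the single 12-wide vdev, which is the price of the narrower vdevs and the six-at-a-time upgrade path.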
 

madbuda

Cadet
Joined
Apr 10, 2019
Messages
4
If you don't have redundancy, there is no 'extra' copy of data to do any recovery from.

Thank you. Looks like 6 disks at a time is where I am headed.

I believe he's using it as a reference for his current system (a 12-drive BTRFS stripe)
Exactly this, not FreeNAS. My primary storage today is just BTRFS on Linux; I use FreeNAS as a backup target (hourly rsync). My goal is to be all ZFS.

I think 12x2-way mirrors is overkill for what I am doing.

Thanks for all the help, you saved me from making a huge mistake here.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
I think 12x2-way mirrors is overkill for what I am doing.
There's no kill like it. ;)

Thanks for all the help, you saved me from making a huge mistake here

Just remember what @Chris Moore said above with regard to having some manner of redundancy built into the primary storage. 12x2 might be overdoing it, but being able to restore from backup won't help if your backups are corrupt.
 