sgbotsford
Cadet
- Joined
- Jan 21, 2012
- Messages
- 7
From reading other posts, the functionality I want isn't available yet. So this is a request.
Background: right now I have a Drobo S that fails under FireWire. The combination of the Drobo's slowness and USB makes for a system that is barely tolerable for Time Machine on a single Apple workstation.
One of the features I like about Drobo is the ability to use any combination of disks, and to add more disks up to the number of bays. As far as I can tell this is not available in ZFS yet.
Is this correct?
Is there a robust open source solution that does support this feature?
I'm not a file system guru. As a complete idiot, if I were trying to implement such a system I would divide each disk into chunks. The size is arbitrary. With a single disk, each chunk is written twice on that disk. If the system is clever and can work through the obfuscation of the drive hardware, it will try to locate these copies so that an event that takes out any given cylinder, head, or sector won't take out both copies. Add a disk: one copy of each chunk moves to the second disk, and the space is freed up on the original disk.
Add a third disk: 1/3 of the chunks from each existing disk move to the new disk.
That's how it works with equal disks.
Now consider unequal disks.
New disk smaller than old: if there is room, half the data moves over. If not, the smaller disk is filled to a level compatible with reasonable performance, and the remaining data is written twice to the larger disk. The operator is notified that not all data is protected. If it's really clever, the FS tries to optimize by putting frequently accessed chunks on both disks.
At 3 unequal disks it gets more interesting. If the sum of the two smaller disks is smaller than the large disk, then each chunk is written to the large disk and ONE of the smaller disks. Extra space on the large disk is used in 'write twice' mode. If the sum is larger than the large disk, then all of the large disk can be used, with the smaller disks filled in such a way that the remaining space is equal on both. That remaining space can then be used for mirroring between the two smaller disks.
At 4 disks my mind starts to boggle at doing it off the top of my head.
So at any given time there are two blobs of storage: one in which data is mirrored across two disks, and one in which data is written twice on a single disk. The latter is used as 'overflow', as it is both more vulnerable and slower. With equal sized disks, or a combination of disks that can be partitioned into two equal sets, the overflow blob has size zero.
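The split between the two blobs falls out of a simple rule. Here is a Python sketch of that calculation (my own illustration of the scheme above, not anything that exists):

```python
def mirror_capacity(sizes):
    """Split a set of disk sizes into (mirrored, overflow) capacity.

    `mirrored`: data protected by a copy on a *different* disk.
    `overflow`: data only protectable by writing twice to the same
    (largest) disk -- more vulnerable and slower.
    """
    total = sum(sizes)
    largest = max(sizes)
    rest = total - largest
    if rest >= largest:
        # The disks can be balanced: every chunk lands on two disks,
        # so the overflow blob has size zero.
        return total // 2, 0
    # The largest disk dominates: mirror the smaller disks against it,
    # then use its leftover space in 'write twice' mode.
    return rest, (largest - rest) // 2
```

For example, a 4 TB disk paired with two 1 TB disks gives 2 TB mirrored plus 1 TB of write-twice overflow, while 2+2+1 TB balances out to 2.5 TB mirrored with no overflow.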
And this is only a 2-way mirror.
A 3-way mirror is also possible. I will assume that anyone concerned enough to mirror 3-way will not tolerate overflow blobs. So with 3 disks the maximum mirror size is the smallest disk. Add a disk: with 4 disks of size A >= B >= C >= D, assign chunks as follows.
Chunks are labeled C1, C2, C3... Considering mirrors: C1a, C1b, C1c, etc.
In the initial layout on 3 disks, C1a goes on the first disk, and C1b and C1c on the next two. C2a goes on the second disk, C3a on the third disk, and C4a is back on the first disk again.
When we add a disk, chunks get reshuffled. If the new disk is the new smallest (D), then it is used in the same manner as above, using the remnant space of A and B. If the new disk is the largest (A), then all the chunks from the smallest disk are migrated over to it. Additional chunks can now be mirrored up to the capacity of the third largest disk, and then D is treated as a new smallest disk.
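The rotating layout from the last few paragraphs is just a round-robin over the disks. A quick Python sketch (again my own illustration; it ignores capacity limits, which real code would have to respect):

```python
def place_chunks(n_chunks, n_disks=3, copies=3):
    """Rotate each chunk's replicas across the disks.

    C1a starts on disk 0, C2a on disk 1, C3a on disk 2, C4a back on
    disk 0, with each chunk's b and c copies on the following disks.
    Returns a dict: disk index -> list of (chunk number, replica) pairs.
    """
    layout = {d: [] for d in range(n_disks)}
    for chunk in range(n_chunks):
        start = chunk % n_disks              # where this chunk's 'a' copy goes
        for rep in range(copies):
            disk = (start + rep) % n_disks   # b and c wrap around
            layout[disk].append((chunk + 1, "abc"[rep]))
    return layout
```

With 3 disks and 3 copies every disk ends up holding one replica of every chunk, which is why the smallest disk caps the mirror size; with 4 or more disks the same rotation spreads replicas around.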
You could do RAID 5 this way too.
The chunk size is arbitrary. I suspect chunks should be large enough that few file operations span two chunks, but small enough that the leftovers after dividing up the disk aren't big enough to worry about. Small chunks mean more housekeeping to keep track of where each chunk's clone is.
In terms of drive interaction, I suspect that a chunk should be large enough to include a complete track, and possibly a complete cylinder. As long as the head is there, slurp up the entire track. So, at a guess, a track is the lower bound and some percentage of the disk is the upper bound. Doing this intelligently would likely require the system to do a large number of reads to build an LBA->cylinder map, or some cooperation from drive makers. There are probably optimizations relating to the disk's cache size too. But remember that chunks are NOT basic units of IO; they are ways of assigning space. At first blush, I'd make chunks around 10 GB.
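To get a feel for the housekeeping cost of 10 GB chunks, a quick back-of-envelope (the 3 TB disk is just a hypothetical example):

```python
# Back-of-envelope: how many chunks does one disk produce?
disk_gb = 3000                          # a hypothetical 3 TB disk
chunk_gb = 10                           # the 10 GB guess above
chunks_per_disk = disk_gb // chunk_gb   # number of chunks to track

# Even a generous 1 KB of metadata per chunk is only a few hundred KB
# of housekeeping per disk -- negligible at this chunk size.
metadata_kb = chunks_per_disk * 1
```

So the chunk map stays tiny; the real cost of small chunks would be rebalancing traffic, not bookkeeping.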