Please confirm my idiocy

Status
Not open for further replies.

nasnuc

Cadet
Joined
Jul 31, 2014
Messages
9
Did I really add a single disk as a stripe, when I thought I was adding a spare?

And, there's no way to undo this mistake short of destroying the volume eh?

Code:
[root@san-zfs-02] /# zpool status
  pool: tank
state: ONLINE
  scan: none requested
config:

   NAME                                            STATE     READ WRITE CKSUM
   tank                                            ONLINE       0     0     0
     gptid/0106431d-2eea-11e4-bbcc-a0369f3d2b7c    ONLINE       0     0     0
     raidz1-1                                      ONLINE       0     0     0
       gptid/0156ec1a-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/01a94ffa-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/01fab1af-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/02554ae2-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/02a81dde-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
     raidz1-2                                      ONLINE       0     0     0
       gptid/03023244-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/03533991-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/03a5f8fa-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/03f54119-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0460ef48-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
     raidz1-3                                      ONLINE       0     0     0
       gptid/04c1e894-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0514dcc2-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/056a7c02-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/05bfd7b0-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/06171ff2-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
     raidz1-4                                      ONLINE       0     0     0
       gptid/06882cd6-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/06de59ec-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/073350a0-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/078bd4ff-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/07e380a1-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
     raidz1-5                                      ONLINE       0     0     0
       gptid/084b556b-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/08a14182-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/090cd68d-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/096ce5ac-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/09bf4e94-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
     raidz1-6                                      ONLINE       0     0     0
       gptid/0a33c600-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0aa09dc9-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0b022ba9-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0b60e497-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0bbf2f03-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
     raidz1-7                                      ONLINE       0     0     0
       gptid/0c3c608a-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0c9e0942-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0cff02e0-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0d5aa4fe-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0db538a2-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
   logs
     mirror-8                                      ONLINE       0     0     0
       gptid/0e84e3ba-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
       gptid/0eba3b84-2eea-11e4-bbcc-a0369f3d2b7c  ONLINE       0     0     0
   cache
     gptid/0e160d40-2eea-11e4-bbcc-a0369f3d2b7c    ONLINE       0     0     0
     gptid/0e40543f-2eea-11e4-bbcc-a0369f3d2b7c    ONLINE       0     0     0
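
For the record, I think this is roughly what happened at the command line (presumably the GUI does the equivalent under the hood; the gptid below is a placeholder, not one of mine):

Code:
# What I meant to do: attach the disk as a hot spare
zpool add tank spare gptid/<new-disk>

# What actually happened: without the "spare" keyword the disk becomes
# a new single-disk, non-redundant top-level vdev striped into the pool.
# zpool normally warns about the mismatched replication level unless -f is used.
zpool add -f tank gptid/<new-disk>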
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
Ouch. Confirmed. No do-overs, unfortunately.

Nice-sized pool. RAIDZ1 doesn't concern you?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Yep, you really did. Gotta destroy the pool. :(

For a pool of that size I would never recommend RAIDZ1. As you are about to find out, rebuilding and restoring a very large pool costs real resources (time, etc.), and the "cost" of going RAIDZ2 is minimal compared to redoing a pool from scratch.

I'd recommend you change from RAIDZ1 to RAIDZ2 vdevs. Maybe instead of 5-disk RAIDZ1s you go to 10-disk RAIDZ2s?
 

nasnuc

Cadet
Joined
Jul 31, 2014
Messages
9
I didn't want to hear it. :(

Thanks. Glad I caught this before putting it into production. Oh well, the good news is it gives me the opportunity to test the deployment doc.

As for RAIDZ1, it's a stripe over multiple RAIDZ1s (RAID 50-ish), and with only 5 disks per vdev, plus 4 cold spares in addition to the one hot spare, I'll sleep soundly. Besides, it's just the local backup storage that will be replicated offsite.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Umm... I think I should point out a few things:

1. Hot spares do NOT automatically jump in and replace a failed disk. (The manual replace is sketched below.)
2. The problem with RAIDZ1 (in particular the risk of UREs during resilvering) isn't reduced just because a spare lets you start the resilver sooner rather than later; the statistical odds of hitting one are unchanged.
3. I've seen lots of people build big pools with RAIDZ1. I've never seen one last more than about 7 months before they had to start over from scratch. All it takes is corrupted ZFS metadata in one vdev and the entire pool becomes unmountable.

There are more reasons, but those are the big 3. In any case, I still disagree with your reasoning for RAIDZ1, and I do wish you the best of luck.
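
To be concrete, when a disk does die you bring the standby disk in by hand, something along these lines (the gptids are placeholders):

Code:
# see which disk the pool reports as FAULTED/UNAVAIL
zpool status tank

# resilver onto the standby disk manually
zpool replace tank gptid/<failed-disk> gptid/<spare-disk>

# then watch the resilver progress with zpool status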
 

nasnuc

Cadet
Joined
Jul 31, 2014
Messages
9
Umm... I think I should point out a few things:

1. Hot spares do NOT automatically jump in and replace a failed disk.
I am aware; we wanted a disk in the system so we can bring it up manually if we need to.
2. The problem with RAIDZ1 (in particular the risk of UREs during resilvering) isn't reduced just because a spare lets you start the resilver sooner rather than later; the statistical odds of hitting one are unchanged.
3. I've seen lots of people build big pools with RAIDZ1. I've never seen one last more than about 7 months before they had to start over from scratch. All it takes is corrupted ZFS metadata in one vdev and the entire pool becomes unmountable.
There are more reasons, but those are the big 3. In any case, I still disagree with your reasoning for RAIDZ1, and I do wish you the best of luck.
Ah. I was under the misimpression that a RAID5 rebuild and a RAIDZ1 rebuild were different, insofar as ZFS doesn't allow corruption because the data is checksummed.

Hm. Thanks cyberjock.

So, moving to RAIDZ2 with 36 disks and wanting 1) space and 2) write speed, how should I cut up this storage?

6x RAIDZ2 vdevs of 6 disks each?
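
In other words, something like this layout? I'd build it through the GUI, but the equivalent pool structure would look roughly like this (device names are placeholders):

Code:
# 6 RAIDZ2 vdevs of 6 disks each, striped into one pool (36 disks total)
zpool create tank \
  raidz2 da0  da1  da2  da3  da4  da5  \
  raidz2 da6  da7  da8  da9  da10 da11 \
  raidz2 da12 da13 da14 da15 da16 da17 \
  raidz2 da18 da19 da20 da21 da22 da23 \
  raidz2 da24 da25 da26 da27 da28 da29 \
  raidz2 da30 da31 da32 da33 da34 da35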
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,176
I am aware; we wanted a disk in the system so we can bring it up manually if we need to.

Ah. I was under the misimpression that a RAID5 rebuild and a RAIDZ1 rebuild were different, insofar as ZFS doesn't allow corruption because the data is checksummed.

There's the problem. During a RAIDZ1 resilver there's no parity left, so if one of the remaining "copies" of a block turns out to be bad, ZFS refuses to carry on with invalid data. RAIDZ2 mitigates this by keeping one set of parity in reserve, so a URE during a typical single-disk rebuild can still be repaired from the remaining drives.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I was under the misimpression that a RAID5 rebuild and a RAIDZ1 rebuild were different, insofar as ZFS doesn't allow corruption because the data is checksummed.

Checksums only identify the bad blocks. But if a disk fails and you have to resilver and a checksum finds a problem, where's your parity to fix it? Oops... you have no parity left, because RAIDZ1 only gives you one disk's worth of parity and you've already lost a disk. Major problems will ensue. This is why I have the "RAID5/RAIDZ1 is dead" link in my sig. It should NEVER be used for important data.
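
And if you ever do end up there, this is roughly where you'd see the damage, assuming the pool still imports at all:

Code:
# -v lists the permanent errors, including which files/datasets were affected
zpool status -v tank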

So, moving to RAIDZ2 with 36 disks and wanting 1) space and 2) write speed, how should I cut up this storage?

I'll hand this one off to you as only you can decide what ratio of speed versus space you want. It's a personal thing. ;)
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
6x6 Z2 is a pretty balanced setup. It should have lots of speed for a backup pool, but at the cost of 12 disks to parity. You can get more space if you go wider, say 4x8 or 3x10... It's a straight-up trade of space vs. speed (IOPS), all the way down to striped mirrors. Pure throughput shouldn't be an issue. I know cyberjock went super wide (~20 disks) and found things a little disappointing. I always end up going the other way ;).

Heh. He posts as I'm typing. Find the balance that suits you. For a backup workload, you might find you can get away with going wide and that the NICs, not the pool, are ALWAYS the bottleneck. But move 1,000,000 tiny files in one shot... and you'll wish for more vdevs. You're still in the lab, so why not test? Glad you decided to change the design.
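
Napkin math on the space side, counting 2 parity disks per RAIDZ2 vdev (the 3x12 row is just me assuming you put all 36 disks into the pool):

Code:
# layout           disks in pool   data disks   parity disks
# 6 x 6-disk  Z2        36             24            12
# 4 x 8-disk  Z2        32             24             8   (4 disks left for spares)
# 3 x 10-disk Z2        30             24             6   (6 disks left for spares)
# 3 x 12-disk Z2        36             30             6
# wider vdevs burn fewer disks on parity, but give you fewer vdevs' worth of IOPS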
 