There was a thread a while back asking what happens if something temporarily renders multiple HDDs in a pool unavailable (but undamaged). Examples might be a non-redundant HBA/backplane failure, a federated or linked enclosure/pod/backend getting its cable pulled, some drives sitting on a second PSU (non-shared rails) that dies, and so on.
According to the replies, it's not a problem: there's a difference between disk unavailability and disk failure. If disks become unavailable in a way that leaves the pool without one accessible copy of all data, but the data itself is intact, then the pool will recover when the disks become readable again (after HBA replacement/repair/reconnection). So far so good.
My question looks at that a bit more closely: is there, theoretically, a different failure mode that can follow from that scenario (HBA loss, for example), whereby a pool with mirrored vdevs continues in a degraded state, but the desyncing between the connected and disconnected drives causes loss of the entire pool?
Scenario:
- Suppose the pool is made of drives 1a/b/c, 2a/b/c and 3a/b/c (1/2/3 = vdevs, a/b/c = mirrored HDDs).
- An HBA, or a port on an HBA, becomes faulty and takes 1b/c and 3a offline (but unharmed).
(Alternatively, to show it doesn't have to be an HBA: suppose the PSU is an ordinary, good-quality, multi-rail design and one rail trips on a fault and stays tripped, taking some HDDs offline but not the motherboard or the other HDDs.)
- If the pool halts at this point, no harm is done. That's the scenario in the other thread, but it doesn't always happen: with enough redundancy each vdev survives, so the pool is merely degraded, not halted. The sysadmin isn't on site, or only sees the alert email later. As a result, normal file activity continues with no visible sign of trouble for the NAS's client users, until, under the stress, 1a dies a while later and only then does pool activity halt.
- When the HBA is fixed, the disks become accessible again, and the NAS reboots, the pool will be massively desynced. (It wouldn't take much activity between steps 2 and 3 above to do that.) vdev1 (1b/c) will contain only what it held at the moment of the HBA/PSU fault; vdev2 (2a/b/c) will contain the latest data, current up to the point 1a failed; and vdev3 will have one drive (3a) in the old state and two drives (3b/c) in the new state. The ZIL is no help in identifying what changed, because TXGs going back to the original HBA failure are no longer on it - they were discarded long ago as the ZIL rolled around.
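The degraded-but-still-running behaviour in the steps above can be sketched as a toy state model (hypothetical Python, not ZFS code; the vdev names and the simple "worst member" health rule are my assumptions, loosely mirroring the ONLINE/DEGRADED/UNAVAIL states zpool status reports):

```python
# Toy model of mirror-vdev health: a mirror is ONLINE if all members are up,
# DEGRADED if at least one (but not all) is up, UNAVAIL if none is.
# The pool keeps serving I/O only while every top-level vdev is usable.

def vdev_health(members_up: int, members_total: int) -> str:
    if members_up == members_total:
        return "ONLINE"
    if members_up > 0:
        return "DEGRADED"
    return "UNAVAIL"

def pool_health(vdevs: dict) -> str:
    states = [vdev_health(up, total) for up, total in vdevs.values()]
    if "UNAVAIL" in states:
        return "UNAVAIL"    # a whole top-level vdev is gone: pool halts
    if "DEGRADED" in states:
        return "DEGRADED"   # pool keeps running, clients notice nothing
    return "ONLINE"

# Step 1: all nine drives healthy
print(pool_health({"vdev1": (3, 3), "vdev2": (3, 3), "vdev3": (3, 3)}))  # ONLINE

# Step 2: HBA fault takes 1b, 1c and 3a offline (intact, just unreachable)
print(pool_health({"vdev1": (1, 3), "vdev2": (3, 3), "vdev3": (2, 3)}))  # DEGRADED

# Step 3: 1a dies later; vdev1 now has no reachable member
print(pool_health({"vdev1": (0, 3), "vdev2": (3, 3), "vdev3": (2, 3)}))  # UNAVAIL
```

The point of the toy is the middle step: after the HBA fault the pool is only DEGRADED, so writes keep landing on the connected half for as long as nobody intervenes.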
So it seems one couldn't be sure that a redundant copy of any specific piece of pool data or metadata still exists for a rebuild, or that any rollback/snapshots remain viable?
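To make the desync concrete, here's a hypothetical sketch (plain Python, not ZFS internals; the per-disk "last txg" bookkeeping and the txg numbers 100/500 are simplifications of the real uberblock/TXG machinery):

```python
# Each disk records the last transaction group (txg) it was written at.
# Toy rule: a mirror vdev can serve txg T only if some reachable member
# reached T, and importing the pool at T needs every vdev to serve T.

OLD, NEW = 100, 500   # txg at the HBA fault vs. txg when 1a died

disks = {
    # vdev1: 1b/1c stopped at the HBA fault; 1a carried on alone, then died
    "1a": {"vdev": 1, "txg": NEW, "reachable": False},  # failed outright
    "1b": {"vdev": 1, "txg": OLD, "reachable": True},
    "1c": {"vdev": 1, "txg": OLD, "reachable": True},
    # vdev2: never lost a member, fully current
    "2a": {"vdev": 2, "txg": NEW, "reachable": True},
    "2b": {"vdev": 2, "txg": NEW, "reachable": True},
    "2c": {"vdev": 2, "txg": NEW, "reachable": True},
    # vdev3: 3a stopped at the HBA fault, 3b/3c are current
    "3a": {"vdev": 3, "txg": OLD, "reachable": True},
    "3b": {"vdev": 3, "txg": NEW, "reachable": True},
    "3c": {"vdev": 3, "txg": NEW, "reachable": True},
}

def vdev_can_serve(vdev: int, txg: int) -> bool:
    return any(d["reachable"] and d["txg"] >= txg
               for d in disks.values() if d["vdev"] == vdev)

def importable_at(txg: int) -> bool:
    return all(vdev_can_serve(v, txg) for v in (1, 2, 3))

print(importable_at(NEW))  # False: vdev1's only current copy was on dead 1a
print(importable_at(OLD))  # True in this toy - but see the caveat below
```

The toy says the pool "could" come back at the old txg, but that's exactly where the simplification breaks down: real ZFS is copy-on-write, so blocks belonging to txg 100 on vdev2 may have been freed and overwritten during the time it kept running, and "last txg >= 100" does not mean "still holds a coherent txg-100 state". That is the crux of the question above.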
So it doesn't seem that one could be sure whether a redundant copy exists of any specific pool data or metadata for rebuild, or if any rollback/snaps are viable?
Discussion:
This isn't quite the same as a double failure killing redundancy. All the other drives in vdev1 are still in perfect condition, and the issue is not that we lost more drives than redundancy allows. What happened is that although only one drive actually failed (HDD redundancy was adequate), the HBA loss desynced the pool into two "halves" (connected and disconnected), at which point the loss of one drive from the "active" half meant the pool could no longer resync, resilver, or roll back to any consistent state once the missing (intact) HDDs came back online.
In practice, many installs won't have fully redundant hardware (most home users, and even many businesses, don't run dual-port SAS drives, dual HBAs, and so on). They will have redundant pools, because they were told to, but not redundant HDD power or redundant HDD data paths. There should be backups, of course, but one tends to assume ZFS simply doesn't have issues of this kind when bringing a pool back on good hardware, unless more drives actually fail in a single vdev than its redundancy can absorb, or some extreme electrical event damages multiple drives at once.
This scenario suggests that one could have very robust HDD redundancy, yet a single lost HBA/backplane plus a single lost HDD could still kill the pool, however many redundant HDDs are in each vdev, and even though all but one copy of the affected vdev's data is intact and all but one disk is in perfect condition.
Not at all "worried", but intrigued for sure, and I would like to ask for input on this scenario.