I've been thinking about UREs and RAID1 (or RAID10). Long ago, in a galaxy far, far away (the Spiceworks forums), I was reading about UREs. There was a trusted storage resource around those parts named Scott Alan Miller, who posited that RAID1 and RAID10 were "immune" to UREs because they use mirrored clones instead of "parity" disks. His argument was that a mirror is inherently not at risk from a URE, or rather, that when a URE is encountered during a rebuild it simply does not cause a failure, because the rebuild is not block-by-block but a simple file copy (here is the link to the pinned forum post I'm referring to). That person has since left that forum, flaming out in a very cyberjock-esque fashion, and now gives advice on becoming an ex-pat on his YouTube channel, so maybe he wasn't the right person to be taking advice from.
I made a comment here a few months ago about how I thought RAID1/10 was immune to URE encounters and was quickly corrected that a URE will still kill a RAID1/10 during a resilver. Indeed, I cannot find a single other source on the internet that thinks the same way as the gentleman from Spiceworks, so I am inclined to believe that the one source of opposing data is probably the wrong one, and yet...
I built a 360-disk TrueNAS of mirrors: 180 mirrors of 14TB disks with a URE rating of 1 in 10^14 bits read. It clocks in at 2PB usable and has been running for over 2 years now. Before upgrading to 14TB disks, the same NAS housed 8TB disks. The whole monster has been running without data loss since 2016. It holds old, cold WORM data and has lived most of its life at around 85% - 90% capacity. I've replaced at least 30 - 40 failed disks over the years and never sweated a resilver. I've never lost any data, but the math of a 10^14 URE says that I should encounter a URE while rebuilding a 12TB disk nearly 100% of the time, right? I know the math doesn't say exactly that, but what I'm getting at is this: if a URE should kill a 12TB disk during resilver pretty darn frequently, then am I the luckiest man on the face of the Earth? Should this NAS be admitted to the Smithsonian to be kept as a monument to the hubris of man not doing enough research? I'd love to hear some thoughts!
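For reference, here is a rough sketch of the naive math I have in mind. It assumes things that may well not hold in practice: that the 10^14 figure means one unrecoverable error per 10^14 bits read, that bit errors are independent, that a resilver reads a full 12TB (decimal), and that any URE counts as a failed resilver. The function name and the 30-resilver figure are just my own illustration. Under those assumptions a single 12TB read hits at least one URE roughly 62% of the time, not 100%, but 30 clean resilvers in a row would still be wildly improbable.

```python
import math


def p_at_least_one_ure(bytes_read: float, bits_per_ure: float = 1e14) -> float:
    """Naive chance of hitting at least one URE while reading bytes_read bytes.

    Assumes independent bit errors at exactly the spec-sheet rate
    (one error per bits_per_ure bits), which is a simplification.
    """
    bits = bytes_read * 8
    p_bit = 1.0 / bits_per_ure
    # 1 - (1 - p_bit)^bits, computed via log1p/expm1 to avoid rounding loss
    return -math.expm1(bits * math.log1p(-p_bit))


def p_all_clean(bytes_read: float, resilvers: int, bits_per_ure: float = 1e14) -> float:
    """Naive chance that every one of `resilvers` independent reads sees no URE."""
    return (1.0 - p_at_least_one_ure(bytes_read, bits_per_ure)) ** resilvers


if __name__ == "__main__":
    TB = 1e12  # decimal terabyte, as drive vendors count it
    for size_tb in (8, 12, 14):
        p = p_at_least_one_ure(size_tb * TB)
        print(f"{size_tb} TB read: P(at least one URE) = {p:.1%}")
    # 30 is the low end of the disk replacements mentioned above
    print(f"P(30 clean 12 TB resilvers) = {p_all_clean(12 * TB, 30):.2e}")
```

If that model were accurate, my track record would be close to impossible, which is why I suspect either the spec-sheet number or the "one URE kills the resilver" assumption is off.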
EDIT: @Arwen @danb35 I hope you don't mind me tagging a few of you whose opinion I've learned to respect over the years!