paul.warwicker.1
Dabbler
Joined: Apr 30, 2016
Messages: 12
I have some permanent errors on one of my volumes, but they are shown as hex codes rather than file names. For example:
Code:
...
  pool: oracle02
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 2.77M in 0 days 00:08:18 with 0 errors on Mon Apr 29 00:00:12 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        oracle02                                        ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/74ea4d6e-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/762f22f2-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/770f7817-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/77f3f862-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/78e9f9f4-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <0x722>:<0x0>
        <0x78c>:<0x0>
        <0x2b9>:<0x0>
        <0x2b9>:<0x572>
        <0x6c6>:<0x15>
        <0x6c6>:<0x32>
        <0x6cc>:<0x0>
        <0x6cc>:<0xe>
oracle#

Reading back through older forum posts suggests that the only way to resolve this is to restore from a backup and recreate the pool.
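For anyone trying to make sense of the hex entries: as I understand it, each <0x...>:<0x...> pair is a dataset object ID followed by a file object ID, which zpool status falls back to when it can't resolve a path (typically because the file has been deleted or only exists in a snapshot). A rough sketch of how you might map one of them with zdb; the dataset name below is hypothetical and the object ID has to be given in decimal:
Code:
# List the pool's datasets with their object IDs, to find which one is 0x6c6:
zdb -d oracle02
# Dump object 0x15 (21 decimal) in that dataset; if the file still exists,
# the output should include its path ("mydataset" is a made-up name):
zdb -dddd oracle02/mydataset 21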
The volume appears to be okay, and the errors I was seeing on 11.2 (https://www.ixsystems.com/community/threads/losing-zfs-pool-overnight.75994/) are no longer causing me an issue on 11.1. Maybe I just got unlucky on the last reboot, because these errors usually cleared on reboot or when I removed the temporary volumes used during testing.
I found a very useful post at http://unixetc.co.uk/2012/01/22/zfs-corruption-persists-in-unlinked-files/ which discusses the issue. Towards the end of the article, it suggests starting a scrub and then stopping that scrub immediately.
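In case it saves anyone the read, the scrub-and-cancel trick boils down to something like this (pool name is mine; -s stops a scrub in progress):
Code:
zpool scrub oracle02       # start a scrub
zpool scrub -s oracle02    # cancel it almost immediately
zpool status -v oracle02   # check whether the permanent error list has cleared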
The net result is that I now have a clean volume. The question is: has anyone else been in this position and doubted the pool's reliability afterwards? It seems almost too easy!
Code:
...
  pool: oracle02
 state: ONLINE
  scan: scrub canceled on Mon Apr 29 22:29:15 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        oracle02                                        ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/74ea4d6e-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/762f22f2-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/770f7817-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/77f3f862-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0
            gptid/78e9f9f4-ffe3-11e8-aec2-941882388da4  ONLINE       0     0     0

errors: No known data errors
oracle#

-paul