RAIDTester
We've been having checksum issues on iSCSI volumes (zvols) created on our 60-disk Storinator.
Here's what we saw:
We first had checksum errors on the pool, on random disks, and they kept spreading until all the disks were affected. There were no read or write errors, only CKSUM.
I destroyed the whole pool and recreated it. The errors continued. I left it running anyway and kept writing heavily to the disks, which caused watchdog restarts.
I then started to suspect a bad PSU, but I have no way to confirm which one, since there are 3 in the box.
I destroyed and re-created the pool again and still experienced issues. Since the pool had 28 of its 30 disks on one Rocket 750 card, it was suggested that I move the card to another slot.
Before I got a chance to do that, I created another zvol on the disks attached to the OTHER Rocket 750 card in the box. Again, zpool status -v showed permanent errors and checksum errors, as seen below.
Even after I destroyed the iSCSI extent, the error lingered (even after a restart). I believe that's the <0x3b>:<0x1> entry.
I created another zvol and another iSCSI extent called r11 and it also had the same errors.
The errors don't go away after a scrub either.
Can anyone help? Is something wrong?
P.S. We recently upgraded from FreeNAS 9.3 to FreeNAS 9.10.
Code:
  pool: r10
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 26h49m with 0 errors on Mon Mar 27 02:49:22 2017
config:

        NAME                                            STATE     READ WRITE CKSUM
        r10                                             ONLINE       0     0     1
          mirror-0                                      ONLINE       0     0     0
            gptid/f67708c5-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/f6fefadb-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/f7a4b416-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/f8361f95-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            gptid/f8d20cd2-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/f964d99e-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-3                                      ONLINE       0     0     0
            gptid/fa09005a-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/fa9c8222-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-4                                      ONLINE       0     0     0
            gptid/fb43b7ae-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/fbdb9175-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-5                                      ONLINE       0     0     0
            gptid/fc839c48-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/fd1841d8-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-6                                      ONLINE       0     0     0
            gptid/fdae814b-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/fe433f02-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-7                                      ONLINE       0     0     0
            gptid/fef93644-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/ff91f748-f2b8-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-8                                      ONLINE       0     0     2
            gptid/00430ca2-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     2
            gptid/00dbc3eb-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     2
          mirror-9                                      ONLINE       0     0     0
            gptid/018e1a90-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/022bb531-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-10                                     ONLINE       0     0     0
            gptid/02e30853-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/037e42d8-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-11                                     ONLINE       0     0     0
            gptid/0437d874-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/04d9b66f-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-12                                     ONLINE       0     0     0
            gptid/05945a15-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/063035f2-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-13                                     ONLINE       0     0     0
            gptid/06f56fa9-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/079b0b24-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
          mirror-14                                     ONLINE       0     0     0
            gptid/08600f7c-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0
            gptid/08ffd886-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <0x3b>:<0x1>
        r10/r11:<0x1>
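With 30 disks in the table, it is easy to miss which rows actually carry a non-zero CKSUM count. A minimal Python sketch for picking them out of saved zpool status output might look like the following. The function name devices_with_cksum_errors and the sample text are only illustrative, and the sketch assumes the counters are plain integers (it would skip abbreviated values such as 1.2K).

```python
import re

# One row of the config table: NAME STATE READ WRITE CKSUM.
# Assumption: counters are plain integers; abbreviated counters are not matched.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)


def devices_with_cksum_errors(status_text):
    """Return (name, cksum) pairs for every table row with CKSUM > 0."""
    hits = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match and int(match.group(5)) > 0:
            hits.append((match.group(1), int(match.group(5))))
    return hits


# Illustrative excerpt of the table; in practice, pass the saved output of
# `zpool status -v` as a string.
sample = """\
        NAME                                            STATE     READ WRITE CKSUM
        r10                                             ONLINE       0     0     1
          mirror-8                                      ONLINE       0     0     2
            gptid/00430ca2-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     2
            gptid/00dbc3eb-f2b9-11e6-9e1d-0cc47a7693ea  ONLINE       0     0     2
          mirror-9                                      ONLINE       0     0     0
"""

for name, count in devices_with_cksum_errors(sample):
    print(name, count)
```

On the excerpt above this reports the pool r10, mirror-8, and both disks in mirror-8, which matches what can be read off the full table by eye.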