Unrecoverable error in freenas-boot pool?


David E

Contributor
Joined
Nov 1, 2013
Messages
119
Hello-
I woke up today to an unpleasant email:

Code:
Checking status of zfs pools:
NAME           SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
tank          16.2T  3.40T  12.8T         -      -    20%  1.00x  ONLINE  /mnt
freenas-boot  19.9G   481M  19.4G         -      -     2%  1.00x  ONLINE  -

  pool: freenas-boot
state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 36K in 0h0m with 0 errors on Tue Sep 22 15:10:17 2015
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     1     0
          da0p2       ONLINE       0     1     0

errors: No known data errors


I know very little about the freenas-boot pool, but I'm pretty sure it used to contain da0p1 - which still exists if I 'ls /dev'. As I understand it, both of these partitions are on the same drive (da0), so I'm not even sure how this kind of failure can happen. What are the right next steps for me to take?
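
For reference, here's roughly how I've been poking at the layout (a sketch; assuming da0 is the right device, as the output above suggests):

Code:
# List the partition table on the boot disk (da0 here, per the zpool output).
# On a stock FreeNAS install, p1 is typically the freebsd-boot partition and
# p2 the freebsd-zfs partition holding the pool.
gpart show da0

# Show per-vdev error counters for the boot pool.
zpool status -v freenas-boot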

Thanks in advance!
 

DaveF81

Explorer
Joined
Jan 28, 2014
Messages
56
Anything in your syslog to indicate an issue? Last time this happened to me, it turned out my USB boot drive was failing and needed to be replaced.
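
This is roughly what I'd check first (a sketch; adjust the device name to whatever your boot device actually is):

Code:
# Look for controller/CAM errors mentioning the boot device in the system log.
grep da0 /var/log/messages

# On a physical drive, SMART data often shows reallocated or pending sectors
# when it is failing (this won't tell you much on a virtual disk).
smartctl -a /dev/da0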
 

David E

Contributor
Joined
Nov 1, 2013
Messages
119
I found some more log data (below). FreeNAS is running in a VM on an ESXi 5.5 host; the boot disk is a virtual disk (backed by mirrored SSDs), and the tank pool is attached to an LSI HBA passed through to the VM. A bit of Googling turned up a few reports of similar timeouts on virtual disks, but it's a first for us - this setup has been bulletproof for nearly two years. The only fix suggestion I could find was disabling MSI/MSI-X, and I want that enabled for the actual hardware LSI card. Is there an easy way to just add this back to the pool and keep an eye on it? (The commands I'm considering are sketched after the log.)

Code:
mpt0: request 0xffffff8000ad36c0:24994 timed out for ccb 0xfffffe0005c43000 (req->ccb 0xfffffe0005c43000)
mpt0: attempting to abort req 0xffffff8000ad36c0:24994 function 0
mpt0: completing timedout/aborted req 0xffffff8000ad36c0:24994
(da0:mpt0:0:0:0): WRITE(10). CDB: 2a 00 00 52 b4 c2 00 01 00 00
(da0:mpt0:0:0:0): CAM status: Command timeout
(da0:mpt0:0:0:0): Retrying command
mpt0: abort of req 0xffffff8000ad36c0:0 completed
mpt0: request 0xffffff8000ad5ac0:25060 timed out for ccb 0xfffffe0005c43000 (req->ccb 0xfffffe0005c43000)
mpt0: attempting to abort req 0xffffff8000ad5ac0:25060 function 0
mpt0: completing timedout/aborted req 0xffffff8000ad5ac0:25060
(da0:mpt0:0:0:0): WRITE(10). CDB: 2a 00 00 52 b4 c2 00 01 00 00
(da0:mpt0:0:0:0): CAM status: Command timeout
(da0:mpt0:0:0:0): Retrying command
mpt0: abort of req 0xffffff8000ad5ac0:0 completed
mpt0: request 0xffffff8000ad5b50:25062 timed out for ccb 0xfffffe0005c43000 (req->ccb 0xfffffe0005c43000)
mpt0: attempting to abort req 0xffffff8000ad5b50:25062 function 0
mpt0: completing timedout/aborted req 0xffffff8000ad5b50:25062
(da0:mpt0:0:0:0): WRITE(10). CDB: 2a 00 00 52 b4 c2 00 01 00 00
(da0:mpt0:0:0:0): CAM status: Command timeout
(da0:mpt0:0:0:0): Retrying command
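
If this really was a one-off timeout, the approach I'm leaning toward is something like this (a sketch; pool name taken from the output above):

Code:
# Reset the error counters on the boot pool.
zpool clear freenas-boot

# Re-read and verify every block now, so any real damage surfaces
# immediately instead of on the next scheduled scrub.
zpool scrub freenas-boot

# Check the result once the scrub finishes.
zpool status freenas-boot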
 

David E

Contributor
Joined
Nov 1, 2013
Messages
119
Did some more investigation. I had thought that there were two devices in my boot pool, but it looks like there's just one. Things seem to be working and nothing actually died during that window, so hopefully this doesn't happen again.
 

DaveF81

Explorer
Joined
Jan 28, 2014
Messages
56
Just as I thought: the boot drive is failing. You'll get the same scrub failure next time. I'd suggest backing up your config and replacing the drive soon.

Edit: I missed the part about ESXi. It looks like your boot image has become corrupted somehow. I still stand by backing up your config and performing a fresh install on a new VMDK.
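
Roughly, the backup step looks like this (a sketch; /data/freenas-v1.db is the standard config database location on FreeNAS 9.x, and the destination path is just an example):

Code:
# The FreeNAS configuration is a single SQLite database on the boot device.
# Copy it somewhere off the boot disk (destination path is an example);
# you can also export it from the GUI under System -> General -> Save Config.
cp /data/freenas-v1.db /mnt/tank/freenas-config-backup.db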
 