Boot pool status is DEGRADED

rmccullough

Patron
Joined
May 17, 2018
Messages
269
I received an alert today that my boot pool was degraded.

Here is the output of "zpool status":
Code:
  pool: freenas-boot
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
    repaired.
  scan: scrub repaired 0B in 00:19:29 with 0 errors on Mon Oct  3 04:04:29 2022
config:

    NAME          STATE     READ WRITE CKSUM
    freenas-boot  DEGRADED     0     0     0
      mirror-0    DEGRADED     0     0     0
        ada0p2    ONLINE       0     0     0
        da10p2    FAULTED     90   728 90.8K  too many errors

errors: No known data errors
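
For anyone who wants to check this kind of output programmatically, here is a minimal Python sketch that pulls the unhealthy rows out of the device table. The function name `unhealthy_rows`, the regular expression, and the list of vdev states are my own assumptions for illustration; they are not part of FreeNAS or ZFS tooling. Counters are kept as strings because `zpool` abbreviates large values (such as `90.8K`).

```python
import re

# Device table copied from the `zpool status` output above.
SAMPLE = """\
    NAME          STATE     READ WRITE CKSUM
    freenas-boot  DEGRADED     0     0     0
      mirror-0    DEGRADED     0     0     0
        ada0p2    ONLINE       0     0     0
        da10p2    FAULTED     90   728 90.8K  too many errors
"""

# Assumed set of vdev states; the header row ("STATE") deliberately does not match.
STATES = "ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED"
ROW = re.compile(rf"^\s*(\S+)\s+({STATES})\s+(\S+)\s+(\S+)\s+(\S+)\s*(.*)$")


def unhealthy_rows(status_text):
    """Return rows that are not ONLINE or that show non-zero error counters."""
    rows = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue
        name, state, read, write, cksum, note = match.groups()
        # Counters stay as strings: zpool abbreviates large values (e.g. "90.8K").
        if state != "ONLINE" or (read, write, cksum) != ("0", "0", "0"):
            rows.append({
                "name": name,
                "state": state,
                "read": read,
                "write": write,
                "cksum": cksum,
                "note": note.strip(),
            })
    return rows


if __name__ == "__main__":
    for row in unhealthy_rows(SAMPLE):
        print(row["name"], row["state"], row["note"])
```

On the table above, this reports the pool, the mirror vdev, and `da10p2`; `ada0p2` is skipped because it is ONLINE with zero errors.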


I received this alert at 4:03am:
Code:
* Boot pool status is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected..


The boot device is a Supermicro SATADOM. The time of this alert appears to coincide with the boot pool scrub:
Code:
Scrub of pool 'freenas-boot' started.
2022-10-03 03:45:00 (America/Denver)

Scrub of pool 'freenas-boot' finished.
2022-10-03 04:04:41 (America/Denver)

Device: /dev/ada0, Temperature 41 Celsius reached critical limit of 40 Celsius (Min/Max ??/41).
2022-10-03 04:04:51 (America/Denver)


Do I need to be worried? Does this mean the SATADOM boot disk is failing?
 

rmccullough

Moreover, why are SATADOM drives so expensive on Amazon now? Where is the best place to purchase a replacement if I want to stick with a SATADOM? Or am I better off moving to mirrored thumb drives?

I know the ideal solution is a small SSD (or a mirrored pair of them), but as far as I am aware my case doesn't have any spare internal drive slots where I could mount them. Am I missing something with my case?
 

rmccullough

OK, so I looked a little more because I didn't understand why my boot pool was referencing da10. I think I installed a SanDisk Cruzer Fit a while back and mirrored it with my SATADOM boot disk. I then looked at dmesg today and saw this at the end:
Code:
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5c 06 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 3 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5c 06 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 2 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5c 06 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 1 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5c 06 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 0 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5c 06 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Error 5, Retries exhausted
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5e 72 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 3 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5e 72 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 2 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5e 72 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 1 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5e 72 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 0 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 5e 72 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Error 5, Retries exhausted
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 60 ee 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 3 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 60 ee 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 2 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 60 ee 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 1 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 60 ee 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 0 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 01 51 60 ee 00 00 80 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Error 5, Retries exhausted
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 3 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 2 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 1 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Retrying command, 0 more tries remain
(da10:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 00 06 38 00 00 10 00
(da10:umass-sim0:0:0:0): CAM status: CCB request completed with an error
(da10:umass-sim0:0:0:0): Error 5, Retries exhausted
(da10:umass-sim0:0:0:0): got CAM status 0x44
(da10:umass-sim0:0:0:0): fatal error, failed to attach to device
da10 at umass-sim0 bus 0 scbus11 target 0 lun 0
da10: <SanDisk' Cruzer Fit 1.00>  s/n 03008527091721180127 detached
g_access(961): provider da10 has error 6 set
g_access(961): provider da10 has error 6 set
g_access(961): provider da10 has error 6 set
g_access(961): provider da10 has error 6 set
g_access(961): provider da10 has error 6 set
(da10:umass-sim0:0:0:0): Periph destroyed
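
As an aside, the `CDB` bytes in these lines can be decoded to see which part of the stick was being read. The sketch below assumes the standard SCSI READ(10) layout (opcode `0x28`, a 4-byte big-endian logical block address in bytes 2 to 5, and a 2-byte transfer length in bytes 7 to 8). The helper name `decode_read10` is hypothetical, not something from FreeBSD or FreeNAS.

```python
def decode_read10(cdb_hex):
    """Decode a READ(10) CDB given as space-separated hex bytes.

    Returns (logical_block_address, transfer_length_in_blocks).
    """
    raw = bytes.fromhex(cdb_hex)  # fromhex skips the spaces between bytes
    if len(raw) != 10:
        raise ValueError("READ(10) CDBs are exactly 10 bytes long")
    if raw[0] != 0x28:
        raise ValueError("not a READ(10) command (opcode 0x28 expected)")
    lba = int.from_bytes(raw[2:6], "big")     # bytes 2..5: logical block address
    blocks = int.from_bytes(raw[7:9], "big")  # bytes 7..8: transfer length
    return lba, blocks


if __name__ == "__main__":
    # CDBs copied from the dmesg output above.
    for cdb in (
        "28 00 01 51 5c 06 00 00 80 00",
        "28 00 01 51 5e 72 00 00 80 00",
        "28 00 01 51 60 ee 00 00 80 00",
        "28 00 00 00 06 38 00 00 10 00",
    ):
        lba, blocks = decode_read10(cdb)
        print(f"LBA {lba}, {blocks} blocks")
```

The first three failures are 128-block reads at nearby addresses and the last is a 16-block read near the start of the device, so the errors are not confined to a single block.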


Does this just mean the USB stick died, and that I need to replace it, add the new disk to the boot pool, and resilver?
 

rmccullough

And is the same USB drive still the recommended solution? It looks like I purchased a SanDisk 64GB Cruzer Fit USB 2.0 Flash Drive before. Any reason to try something else? I think my USB ports are only USB 2.0.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
You've definitely burnt out the SanDisk USB device (da10), as shown by the end of the dmesg log:

Code:
(da10:umass-sim0:0:0:0): Error 5, Retries exhausted
(da10:umass-sim0:0:0:0): got CAM status 0x44
(da10:umass-sim0:0:0:0): fatal error, failed to attach to device
da10 at umass-sim0 bus 0 scbus11 target 0 lun 0
da10: <SanDisk' Cruzer Fit 1.00>  s/n 03008527091721180127 detached
g_access(961): provider da10 has error 6 set
g_access(961): provider da10 has error 6 set
g_access(961): provider da10 has error 6 set
g_access(961): provider da10 has error 6 set
g_access(961): provider da10 has error 6 set
(da10:umass-sim0:0:0:0): Periph destroyed


USB thumb drives are no longer a popular or preferred option, due to their varying quality and tendency to fall apart under sustained write workloads. A standard SSD behind a USB-to-SATA or USB-to-M.2 adapter is a viable alternative.

USB 2.0 isn't a concern, as the speed of your boot pool only comes into play when applying updates and during the initial boot, and any name-brand SSD will be more than responsive enough even when limited to the roughly 40 MB/s that USB 2.0 delivers in practice.
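
To put a rough number on that, here is a small back-of-the-envelope sketch in Python. USB 2.0 signals at 480 Mbit/s, which is a 60 MB/s ceiling before protocol overhead; the 40 MB/s figure is the practical rate quoted above. The 2 GiB transfer size is purely a hypothetical example, not a measured FreeNAS update size.

```python
USB2_SIGNALING_MBIT_PER_S = 480  # USB 2.0 High Speed signaling rate


def theoretical_mb_per_s(mbit_per_s):
    """Convert a line rate in Mbit/s to MB/s (8 bits per byte, no overhead)."""
    return mbit_per_s / 8


def seconds_to_transfer(size_gib, rate_mb_per_s):
    """Time to move size_gib GiB at rate_mb_per_s (decimal megabytes per second)."""
    size_bytes = size_gib * 1024 ** 3
    return size_bytes / (rate_mb_per_s * 1_000_000)


if __name__ == "__main__":
    print(theoretical_mb_per_s(USB2_SIGNALING_MBIT_PER_S), "MB/s ceiling")
    # Hypothetical 2 GiB of boot pool I/O at the practical 40 MB/s rate.
    print(round(seconds_to_transfer(2, 40), 1), "seconds")
```

Even at the practical rate, a couple of gigabytes moves in under a minute, which is why the bus speed matters so little for a boot device.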
 