cannot open 'freenas-boot/ROOT': pool I/O is currently suspended

Status
Not open for further replies.

pschatz100

Guru
Joined
Mar 30, 2014
Messages
1,184
Three days ago, I received the following email from my FreeNAS box:
Subject: FREENAS.local: Critical Alerts
message: The boot volume state is ONLINE: One or more devices are faulted in response to IO failures

Shortly thereafter, I received the following:
Subject: FREENAS.local security run output
message: FREENAS.local kernel log messages:
> (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 07 f6 32 00 00 80 00
> (da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
> (da0:umass-sim0:0:0:0): SCSI status: Check Condition
> (da0:umass-sim0:0:0:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:umass-sim0:0:0:0): Error 5, Unretryable error
> (da0:umass-sim0:0:0:0): READ(10). CDB: 28 00 00 08 0b ea 00 00 80 00
> (da0:umass-sim0:0:0:0): CAM status: SCSI Status Error
> (da0:umass-sim0:0:0:0): SCSI status: Check Condition
> (da0:umass-sim0:0:0:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> (da0:umass-sim0:0:0:0): Error 5, Unretryable error

-- End of security output --

Then I received the following email:
Subject: FREENAS.local daily run output
message: Checking status of zfs pools:
NAME           SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
backup1        460G   283G   177G         -    19%    61%  1.00x  ONLINE  /mnt
data1         1.81T   682G  1.15T         -     1%    36%  1.00x  ONLINE  /mnt
freenas-boot  7.19G  2.44G  4.75G         -      -    33%  1.00x  ONLINE  -

pool: freenas-boot
state: ONLINE
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: http://illumos.org/msg/ZFS-8000-HC
scan: scrub in progress since Tue Feb 24 03:45:01 2015
232M scanned out of 2.44G at 2.83K/s, 227h51m to go
0 repaired, 9.27% done
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE   4.00K     0     0
          da0p2       ONLINE     622     3   120

errors: 4062 data errors, use '-v' for a list

-- End of daily output --
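The "action:" line above points at the standard ZFS recovery sequence. Assuming the stick is still responding at all, it would look something like this (pool name taken from the output above; note that clearing only un-suspends I/O, it does not repair a failing stick):

```sh
# Check the pool state first (boot pool name from the output above)
zpool status -v freenas-boot

# If the device is still connected, clear the error counters so
# ZFS resumes I/O -- this does NOT fix the underlying media errors
zpool clear freenas-boot

# Re-run a scrub to verify what is actually still readable
zpool scrub freenas-boot
```

With thousands of data errors and unrecovered read errors at the SCSI level, though, the stick itself is almost certainly dying and clearing is only a stopgap.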

For the last two days, FreeNAS has been emailing me the following once per hour:
Subject: Cron <root@FREENAS> /bin/sh /usr/local/sbin/save_rrds.sh
message: cannot open 'freenas-boot/ROOT': pool I/O is currently suspended

I cannot log into the FreeNAS GUI, but all my shares and Plex are working just fine. I can access the console through IPMI, although I have not tried to enter any commands.

I presume these messages are telling me that my boot device has gone bad. Is that correct? I'm using a single USB flash drive. What would be the wisest course of action? Everything is backed up, so I'm not in danger of losing any important data.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
"I presume these messages are telling me that my boot device has gone bad. Is that correct?" From what I can see, yes.

Just do a fresh install of FreeNAS on a new USB stick, import the config file (you do have a backup of the config, right?) and you're OK ;)
 

pschatz100

Guru
Joined
Mar 30, 2014
Messages
1,184
So, to close the loop on this thread...

I installed FreeNAS onto a new USB stick and imported a saved config file. All is good. I have since added a second USB stick and mirrored the boot devices. No issues.

For the record, I have to say I love IPMI. I was able to download a new FreeNAS iso, remotely mount it, and then boot from the virtual device - which allowed me to run the installer and set up the USB stick. Other than replacing the USB stick itself, I never touched the hardware. My FreeNAS box does not have a DVD drive or a monitor attached. Everything was done remotely.
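For anyone following along: the web GUI can attach a boot mirror for you, but the underlying CLI steps are roughly these. The device names (da0 for the existing stick, da1 for the new one) are assumptions - check yours with the GUI or `glabel status` first:

```sh
# Copy the partition layout from the existing boot stick (da0)
# to the new stick (da1) -- device names are examples only
gpart backup da0 | gpart restore -F da1

# Attach the new stick's data partition as a mirror of the
# existing boot partition; ZFS resilvers automatically
zpool attach freenas-boot da0p2 da1p2

# Watch the resilver progress and confirm both devices are ONLINE
zpool status freenas-boot
```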
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
I strongly recommend running a set of two sticks (which I think you now are), and setting the scrub interval to a short period... say 2 or 5 days or something.

Ever since we've had ZFS-on-boot, many of us are discovering that our USB thumb drives have been throwing checksum errors left and right and no one ever realized it. I recommend vigorous scrubbing of the boot pool if it is using consumer-grade flash.
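For reference, the boot-pool scrub interval is a setting in the FreeNAS web GUI, but the equivalent from cron would be a one-liner sketch like this (the schedule is an example - adjust to taste):

```sh
# crontab-style entry: scrub the boot pool every 5 days at 03:45
45 3 */5 * * root /sbin/zpool scrub freenas-boot
```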
 

pschatz100

Guru
Joined
Mar 30, 2014
Messages
1,184
I'm running two sticks. Using ZFS for the boot device has made mirroring and monitoring the boot drives incredibly simple.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
I strongly recommend running a set of two sticks (which I think you now are), and setting the scrub interval to a short period... say 2 or 5 days or something.

Ever since we've had ZFS-on-boot, many of us are discovering that our USB thumb drives have been throwing checksum errors left and right and no one ever realized it. I recommend vigorous scrubbing of the boot pool if it is using consumer-grade flash.
I don't actually know how write-intensive scrubbing is. I'm hoping it only writes in cases where you have heavy corruption, but if it's write-intensive regardless of corruption, I'd imagine scheduling scrubs at such short intervals would wear out the flash drives much faster. Anyone have more insight into the actual scrubbing process?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
There is a thread called "Using aggressive scrubs on boot drive", or something like that, where this has been discussed. TL;DR --> no problem regarding the USB drives, you can set one scrub per day if you want ;)
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
I found the thread; to quote jgreco from it: "Scrubs are basically read-only unless they find trouble."
So it seems my assumption was correct: it only matters if you actually do have problems that need to be corrected.
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
Yeah, confirming this. A scrub is basically a file-based operation (so, for example, it won't check any unused sectors, which is why SMART long tests are an important component of a comprehensive strategy for detecting hardware problems, at least with hard drives). Each block associated with a file is read and compared against its expected value by way of the ZFS integrity check (checksum/hash/whatever), and if no problem is found with a block, no writing is done.

So yes, to my knowledge, ZFS scrubbing puts almost no strain whatsoever on a healthy device. And when the device in question is a boot-pool USB stick, you'll want to throw it straight into the garbage as soon as you encounter a problem, so it doesn't matter if you strain it with writes - by that time, you're throwing it out anyway.
 