The boot volume state is UNAVAIL

Status
Not open for further replies.

STREBLO

Patron
Joined
Oct 23, 2015
Messages
245
So I was in the midst of running burn-in tests when I received this email:

The boot volume state is UNAVAIL: One or more devices are faulted in response to IO failures.

Now I can't access my box from the web interface... I was SSH'd into the box and never lost access, but I have a stream of weird error messages on the monitor I had plugged directly into the box. Should I reboot my box? What's weird is that my burn-in test is still running and I can still send commands through SSH. My tests are 24 hours in; should I just leave them running until they finish and then reboot the box?

The errors repeat:

Code:
READ(16). CDB: 88 00 00 00
CAM status: SCSI Status Error
SCSI status: Check Condition
SCSI sense: ABORTED COMMAND asc: 47, 3 (Information unit iuCRC error detected)
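For anyone trying to read that last line: the sense data breaks down into a sense key, an ASC/ASCQ pair, and a description. Below is a minimal Python sketch that splits the line as pasted above. The `parse_sense` helper and its regex are hypothetical, and it assumes the two numbers after `asc:` are hexadecimal, which is how FreeBSD's CAM layer normally prints them.

```python
import re

# Hypothetical pattern for the sense line exactly as pasted in this post.
SENSE_RE = re.compile(
    r"SCSI sense:\s*(?P<key>[A-Z ]+?)\s+asc:\s*"
    r"(?P<asc>[0-9a-fA-F]+)\s*,\s*(?P<ascq>[0-9a-fA-F]+)"
    r"\s*\((?P<text>[^)]*)\)"
)

def parse_sense(line):
    """Return the sense key, ASC, ASCQ and description, or None if no match."""
    m = SENSE_RE.search(line)
    if m is None:
        return None
    return {
        "sense_key": m.group("key"),
        "asc": int(m.group("asc"), 16),    # assumed hexadecimal
        "ascq": int(m.group("ascq"), 16),  # assumed hexadecimal
        "text": m.group("text"),
    }

line = "SCSI sense: ABORTED COMMAND asc: 47, 3 (Information unit iuCRC error detected)"
print(parse_sense(line))
```

An iuCRC error is reported for data in transit, so it may be worth checking the connection to the device as well as the device itself; that is a guess, not a diagnosis.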
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
That's a shame! What a time for your boot drive to fail.
Do you have mirrored boot devices? If the answer is yes,
let the tests finish. The email was warning that one of
the mirrors is having trouble, so it sounds like for now
everything is ok...
 

STREBLO

Patron
Joined
Oct 23, 2015
Messages
245
That's a shame! What a time for your boot drive to fail.
Do you have mirrored boot devices, if the answer is yes,
let the tests finish. The email was warning that one of
the mirrors is having trouble, so it sounds like for now
everything is ok...

So you think the boot drive failed? If so, why am I still able to send commands from SSH?

Do you mean one of my HDDs is having trouble when you say "the mirrors"?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
The volume is unavailable, not just degraded, so I guess it wasn't mirrored.

The system still works because it runs in RAM but as soon as you do something that needs to access something on the system volume then it doesn't work.
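To make the UNAVAIL versus DEGRADED distinction concrete, here is a small Python sketch that reads the `pool:` and `state:` fields out of `zpool status` text. The `pool_states` and `describe` helpers are hypothetical, the sample text is made up to match the alert in the first post, and the pool name `freenas-boot` is an assumption about this install.

```python
import re

def pool_states(status_text):
    """Map each pool name to its state from `zpool status`-style text."""
    states = {}
    pool = None
    for line in status_text.splitlines():
        m = re.match(r"\s*(pool|state):\s*(\S+)", line)
        if not m:
            continue
        field, value = m.groups()
        if field == "pool":
            pool = value
        elif pool is not None:
            states[pool] = value
    return states

def describe(state):
    """Rough reading of a pool state, following the explanation above."""
    if state == "ONLINE":
        return "healthy"
    if state == "DEGRADED":
        return "a device has failed but redundancy keeps the pool usable"
    return "pool cannot be accessed; only what is already in RAM keeps working"

# Made-up sample, shaped like the alert quoted in the first post.
sample = """  pool: freenas-boot
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
"""
for name, state in pool_states(sample).items():
    print(name, state, "->", describe(state))
```

With a mirrored boot pool, losing one device would be expected to show DEGRADED rather than UNAVAIL, which is why the state alone suggests a single boot device here.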
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
The volume is unavailable, not just degraded, so I guess it wasn't mirrored.

The system still works because it runs in RAM but as soon as you do something that needs to access something on the system volume then it doesn't work.
I believe what you're saying is true, so this means the test can complete as long as the power stays on and the machine is not rebooted, RIGHT???
 

STREBLO

Patron
Joined
Oct 23, 2015
Messages
245
The volume is unavailable, not just degraded, so I guess it wasn't mirrored.

The system still works because it runs in RAM but as soon as you do something that needs to access something on the system volume then it doesn't work.
So I should have had it mirrored to avoid this?

What could cause that to happen? Especially so soon, it's only been running for a couple days.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yep, in theory you can even use the shell builtins (cd, echo, ...) :)

Edit: yes, mirroring is a good idea with USB drives ;)

USB drives are crappy, some fail really fast...
 

STREBLO

Patron
Joined
Oct 23, 2015
Messages
245
Yep, in theory you can even use the shell builtins (cd, echo, ...) :)

Edit: yes, mirroring is a good idea with USB drives ;)

USB drives are crappy, some fail really fast...
It's actually on a SATA DOM.

So when you say mirroring, do you mean basically having a parity partition? Or do you mean having a second drive that the install is mirrored onto?

Also is my DOM garbage now?
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
Yep, in theory you can even use the shell builtins (cd, echo, ...) :)
I don't know how much memory he has in that system, but if you start other tasks/commands and such and run the memory too low...
 

STREBLO

Patron
Joined
Oct 23, 2015
Messages
245
I don't know how much memory he has in that system, but if you start other tasks/commands and such and run the memory too low...
32 GB
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
You should be ok with that much. If I were you, I would not touch the machine until those tests
are over with. At this point (24hrs. into the tests) it would be a shame to start over.
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
I don't know how much memory he has in that system, but if you start other tasks/commands and such and run the memory too low...

Yep, of course if the system needs to swap...

Just for a fun story: we did some nasty experiments with Linux systems back when I was in a grande école. We each had our own system drive that we racked into a PC to work, and of course you can un-rack the drive with the PC still powered... :D Well, we tested what happens when you do that, and more interestingly what happens when you re-rack it...

The answer is that you can un-rack and re-rack the drive all day without a problem (except for physical wear on the drive...). While it's un-racked you can't access anything that isn't in RAM, of course, but when you re-rack it everything goes back to normal. Of course with Windows the result isn't the same...

Linux resilience is pretty incredible :)
 

BigDave

FreeNAS Enthusiast
Joined
Oct 6, 2013
Messages
2,479
back when I was in a grande école

Ooooh, I'm impressed. Not easy to qualify from what I understand. Did you attend CPE Lyon?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
It's relatively easy to qualify if you love programming (and English), but if you don't, then don't even think about going down this path :)

EPITECH, just right next to Paris.
 

STREBLO

Patron
Joined
Oct 23, 2015
Messages
245
How often do these things fail? I've only had the DOM for 24 hours. Would a solid state disk be more resilient? Also, given that I'm going to have to install FreeNAS on a new drive, will I have to redo my tests? I'm only on the badblocks section of my hard drive testing. When badblocks finishes, do I just continue with the tests like normal? Or would I get a failure if I try to run my SMART tests?

Should I attempt to start the return process? I only got it a few days ago. What exactly should I say went wrong with it?

Also, if I stuck another disk in now would I be able to mirror the drives using only the command line?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
A DOM should not fail as often as USB sticks do, and definitely should not fail that fast. Maybe it's just bad luck.

I'd let badblocks finish (if it's over 20-30 %; otherwise you don't lose too much time if you relaunch it after the reinstall), then reinstall the system, then continue the testing/burn-in ;)

You can mirror the boot drive, and you must use the GUI for that; everything is described in the manual ;)
 

STREBLO

Patron
Joined
Oct 23, 2015
Messages
245
It won't let me SSH in anymore, :(

Won't I have to restart badblocks anyway on the new disk? Otherwise how will I get the results from badblocks?
 

STREBLO

Patron
Joined
Oct 23, 2015
Messages
245
Okay, this is quite odd. I couldn't SSH in anymore, so I restarted the machine and it started up normally. When I open the web interface there are no alert messages... Should I just start up badblocks again? Why would something like this happen, and then when I restart the machine there are no error messages?

I guess I should probably do a disk check on my SATA DOM as well to make sure it's not defective, what would be the best way to do this?
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Yep, that's normal: an existing SSH session will continue to work, but you can't open a new one.

Badblocks is for the data drives, not the system drive.

Wait, did you use badblocks on the system drive?
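As for checking the SATA DOM itself, a SMART extended self-test is one option that does not write to the drive, unlike a write-mode badblocks pass. A rough Python sketch of the two `smartctl` invocations involved follows; the `smart_commands` helper is hypothetical and `/dev/ada0` is only a placeholder device name, so substitute whatever the DOM actually shows up as.

```python
import shlex

def smart_commands(device):
    """Build the smartctl command lines for a long self-test and its report."""
    return [
        ["smartctl", "-t", "long", device],  # start the extended self-test
        ["smartctl", "-a", device],          # afterwards: attributes and self-test log
    ]

# /dev/ada0 is a placeholder, not the actual device from this thread.
for cmd in smart_commands("/dev/ada0"):
    print(shlex.join(cmd))
```

The commands are only printed here; run them by hand (or pass each list to `subprocess.run`) once you are sure of the device name.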
 