System stuck in reboot after 8.0.1 upgrade as well as fresh install - Fatal trap 12?

Status
Not open for further replies.
Joined
Aug 30, 2011
Messages
8
Hello.

I seem to be having a major problem after upgrading my FreeNAS box to the new 8.0.1 release. I did the CD upgrade with no reported install errors. After the reboot, the system boots normally past the FreeNAS splash screen and a few of the start-up procedures, then halts at the error message shown in the attached picture. I tried a fresh install instead of an upgrade and got the same result: it hits the error message and ends up in a reboot cycle. I don't have the full specs right now, but the box boots off a 16GB Kingston SATA SSD, is based on a dual-core consumer AMD processor, and has 8GB of DDR3 RAM. The main array is 4x 2TB Hitachi SATA drives running RAIDZ. I have searched the forum and the bug database but have yet to find an answer. A Google search on the Fatal trap 12 message turned up nothing either. Help would be greatly appreciated.

Thanks

Update: Tried messing with the BIOS and restored system to factory fail-safe settings but still no luck.
IMAG0936.jpg
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
What version did you upgrade from?
Did you upgrade using the i386 version perhaps?
 
Joined
Aug 30, 2011
Messages
8
Update: I reinstalled the old 8.0 stable release, loaded my backup config file, and my NAS is running just fine again with the main array untouched by the failed upgrade. Therefore, something in 8.0.1 is greatly broken with my hardware. Should I post a system report? If so, how do I do that on a unix box?
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Yes, you can file a ticket if you'd like. There is definitely a problem: my system, which had been running great, hit the same error last night. I upgraded to 8.0.1 a week ago.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Here's some additional info on the errors/problem:

I woke up to find the same screen as @thedeepfriedboot posted.

On reboot, I get these error/warnings:

WARNING: / was not properly dismounted
WARNING: /data was not properly dismounted

Mounting local file systems:mount: /mnt/cortex: No such file or directory (This is my pool mountpoint)
Mounting /etc/fstab filesystems failed, startup aborted
ERROR: ABORTING BOOT (sending SIGTERM to parent)!

Error about not finding passwd_compat, endpwent

Enter full pathname of shell or RETURN for /bin/sh:

I pressed RETURN and got the root # prompt, looked at both /etc/fstab & /conf/base/etc/fstab and they looked fine.
Typed reboot and system came back up normally.

I'm strongly considering downgrading to RC2. It was strange that it happened to one person, but now 2, possibly more. There's something fishy....

More weirdness, it's telling me that /etc/rc.d/jail does not exist (it does), and when I try to start my jail, it tells me that jail doesn't = YES, and it does in both copies of the rc.conf file... WTF!

Going to shut down, pull the flash drive, and fsck it on my FreeBSD system.
 
Joined
Aug 30, 2011
Messages
8
I actually had the same error crash my router as well yesterday. My router runs pfSense which is also FreeBSD based. I upgraded it a week ago to the new release and I got the exact same error screen yesterday. The router was running for a week, and then I came home yesterday to find the router locked up. Perhaps the new version of FreeBSD is to blame for these issues?
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
That's a good call actually. Just out of curiosity, what version does your router run? FreeNAS just went from FreeBSD 8.2-RELEASE-p2 to p3, so maybe it's related to that. fsck didn't find anything. Now I'm not trusting the 8.0.1 release; my system had been solid until now.
 

jensbylu

Cadet
Joined
Aug 29, 2011
Messages
4
Joining the club. Woke up yesterday morning and found that my queued ftp downloads had frozen during the night and the FreeNAS server was not responding.
Turned on my monitor to the server and voila, the exact same error.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
I think this is an unrelated bug, but in my post above I mentioned an error with jail=YES. After looking at another FreeNAS system's rc.conf, I found that lighttpd=YES didn't have quotes around the YES. This didn't seem to be a problem until the settings for my jail were added after it, which basically caused a syntax error. So if you add anything to rc.conf, make sure to put quotes around the YES for lighttpd.

Update: lighttpd was missing quotes, but that was not the problem, and it is absolutely unrelated to the much worse Fatal Trap 12! It just happened that the circumstances led me to find that minor syntax error with lighttpd.
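
The quoting check described above can be sketched as a small shell function that lists assignments in an rc.conf-style file whose value does not start with a quote. The function name find_unquoted is a hypothetical helper, not part of FreeNAS, and per the update this is only a tidiness check, not a fix for the Fatal trap 12.

```sh
#!/bin/sh
# Sketch only: list assignments in an rc.conf-style file whose value does not
# start with a single or double quote. find_unquoted is a hypothetical helper.
find_unquoted() {
    # $1: path to the file to check (for example /etc/rc.conf)
    # Prints matching lines as "lineno:text"; exit status 1 means none found.
    grep -nE "^[A-Za-z_][A-Za-z0-9_]*=[^\"'[:space:]]" "$1"
}
```

Running `find_unquoted /etc/rc.conf` prints nothing when every value is quoted; any line it does print is a candidate for adding quotes.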
 
Joined
Aug 30, 2011
Messages
8
The router is running pfSense 2.0 stable, and the documentation for pfSense says it's based on FreeBSD version "RELENG_8_1". The router and NAS have completely different processors and architectures, so this might be less of a hardware compatibility issue and more related to a change in FreeBSD that both the router and NAS rely on. As a comparison, the router is based on an 800MHz VIA motherboard designed for cash registers that was made before 2006 and uses a SATA laptop hard drive; the NAS is 2011-spec hardware with an AMD dual-core processor and an SSD. Also, note that the router ran for days without an issue, whereas the NAS would not even complete a boot.

Jensbylu, so you are able to boot up just fine with the crash happening later on, or is booting up a problem for you?
 

jensbylu

Cadet
Joined
Aug 29, 2011
Messages
4
Jensbylu, so you are able to boot up just fine with the crash happening later on, or is booting up a problem for you?

After a reboot it started OK and still runs fine after ~24h. I'm gonna queue up some heavy ftp transfers tonight and see if that will trigger the error again.
Services enabled on my system are SSH, NFS and SMART.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Well I think I'm onto something, quite by accident and related to the daily email reports not working.

I ran '/usr/sbin/periodic daily' and it crashed with the Fatal Trap 12 error.

Does anyone else want to try and confirm this is the cause?

Edit: This probably explains why it happened during the night when that usually runs.
 

gcooper

Guest
Disabling xhci might fix the problem. When booting up the machine do:

1. Boot to loader prompt.
2. unload xchi
3. boot

This driver should really be removed from the default load list.
 
Joined
Aug 30, 2011
Messages
8
Disabling xhci might fix the problem. When booting up the machine do:

1. Boot to loader prompt.
2. unload xchi
3. boot

This driver should really be removed from the default load list.

What does that do? Since my system is crashing right as it starts up, and since I don't have email set up, I don't think it's email causing the issue for me.
 

ixdwhite

Guest
That's a problem with the USB 3.0 device driver. It doesn't like certain machines, sadly. Is the machine in question one with an AMD chipset by chance? Can you provide the board mfr & model number?

If you can disable USB 3.0 in the BIOS, try that, and make sure the boot volume, if on a USB stick, is in a USB 2.0 port. Otherwise you'll need a custom kernel to remove xhci. This is addressed in future versions by making the xhci driver load as a module so it can be disabled if needed. (That might have made it in the last RC, I've lost track.)
 

gcooper

Guest
What does that do? Since my system is crashing right as it starts up, and since I don't have email set up, I don't think it's email causing the issue for me.

What Doug said... and I typoed in my instructions above (it's 'unload xhci', not 'unload xchi').
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Disabling xhci might fix the problem. When booting up the machine do:

1. Boot to loader prompt.
2. unload xchi
3. boot

This driver should really be removed from the default load list.

I had already rebooted, but I renamed /boot/kernel/xhci.ko to xhci.ko_OLD (after doing 'mount -uw' of course).
I rebooted and re-ran 'periodic daily' and no crash.

Thanks for the lightning-fast answer!
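
The rename workaround above can be sketched as a shell function. This is a minimal sketch, assuming xhci is shipped as a loadable module file in /boot/kernel, as it was on this system; disable_module is a hypothetical helper, and the `mount -uw /` form with an explicit root target is an assumption about how the remount was done.

```sh
#!/bin/sh
# Sketch only: rename a kernel module file so it is no longer found under its
# usual name. disable_module is a hypothetical helper, not a FreeNAS command.
disable_module() {
    # $1: module name without the .ko suffix (for example xhci)
    # $2: directory holding the module, defaulting to /boot/kernel
    module="$1"
    kernel_dir="${2:-/boot/kernel}"
    src="$kernel_dir/$module.ko"
    if [ ! -f "$src" ]; then
        echo "module file not found: $src" >&2
        return 1
    fi
    # Keep the original file under a new name so the change is easy to undo.
    mv "$src" "${src}_OLD"
}

# Intended use on the affected box, run as root (not executed here):
#   mount -uw /            # remount the root filesystem read-write
#   disable_module xhci    # /boot/kernel/xhci.ko -> /boot/kernel/xhci.ko_OLD
#   reboot
```

Renaming rather than deleting means the module can be restored by moving the _OLD file back to its original name.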
 

ixdwhite

Guest
Also for others in this thread --

When reporting panics, it is important to quote the _entire_ panic message, including the addresses. Screenshots are great. There are 3,500 sources of panics (calls to panic()) in FreeNAS. Trap 12 is an invalid memory reference and isn't diagnostic in itself; there are 2^64 possible causes of an invalid memory reference. With the fault address, instruction pointer, and source process from the message, we can usually identify the offending subsystem. Without them, there's no possible way to debug it.
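
As a rough aid to the advice above, here is a minimal sketch of a check that a transcribed panic message still contains the key fields before it is posted. The function name check_panic_report is hypothetical, and the field labels ("fault virtual address", "instruction pointer", "current process") are an assumption about how the FreeBSD trap message words them; adjust them if the screen shows something different.

```sh
#!/bin/sh
# Sketch only: report which key fields are absent from a saved panic message.
# check_panic_report is a hypothetical helper; the field labels are assumed
# to match the wording of the FreeBSD trap output.
check_panic_report() {
    # $1: path to a text file holding the transcribed panic message
    report="$1"
    missing=0
    for field in "Fatal trap" "fault virtual address" \
                 "instruction pointer" "current process"; do
        if ! grep -qF -- "$field" "$report"; then
            echo "missing: $field"
            missing=1
        fi
    done
    # Exit status 0 means every field was found, 1 means at least one is absent.
    return "$missing"
}
```

For example, `check_panic_report panic.txt` prints one "missing:" line per absent field and stays silent when the message looks complete. It does not replace posting the whole message or a screenshot.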
 
Joined
Aug 30, 2011
Messages
8
That's a problem with the USB 3.0 device driver. It doesn't like certain machines, sadly. Is the machine in question one with an AMD chipset by chance? Can you provide the board mfr & model number?

If you can disable USB 3.0 in the BIOS, try that, and make sure the boot volume, if on a USB stick, is in a USB 2.0 port. Otherwise you'll need a custom kernel to remove xhci. This is addressed in future versions by making the xhci driver load as a module so it can be disabled if needed. (That might have made it in the last RC, I've lost track.)

Damn, that must be it. The board has USB 3.0, is AMD, and is the GIGABYTE GA-880GMA-USB3. Would that board be the problem? Also, if so, would you recommend turning off USB 3.0 in the BIOS, or disabling that xhci file? Oh, also, I'm not booting off a USB drive, I am booting off a 14GB SSD connected to SATA. Newegg had it on sale.

Update: Disabling USB 3.0 in BIOS worked! Thanks ixdwhite, if you have a reddit account, I'll buy you some reddit gold as thanks for the advice. :)
 