8.2.0-p1 Kernel Panic.. Possible bug???

Status
Not open for further replies.

Letni

Explorer
Joined
Jan 22, 2012
Messages
63
Folks, I have a Compaq Microserver N36L running 8.2.0-p1 (upgraded online from 8.0.4p2). It has become unstable: I am now on my third kernel panic with 8.2.0-p1.

Here is my hardware:
Compaq N36L
8 GB DDR3
Additional PCIe 1x Intel NIC (not bonded to onboard NIC) - dedicated for iSCSI to ESX 5.0
Mixture of 1 and 2 TB drives (5 total) in a ZFS pool
APC USB connected UPS

Here are the services I use:
1. CIFS
2. AFP
3. NFS
4. RSYNC
5. iSCSI Target (from devices created in Zpool)
6. TFTPD


Everything has been just fine with this box since I built it on 8.0.2, with upwards of 100 days uptime at times.

Last Friday I upgraded to 8.2.0-p1 and since then have had 3 kernel panics (KP) where the whole machine locked up and had to be hard rebooted. The longest it has run is 3.5 days.

The box is VERY lightly loaded: I have a handful of test VMs running off iSCSI (they do practically nothing), as well as tons of NAS data shared over SMB and AFP (also very idle).

The Kernel Panic text on the screen is the following:

Jul 31 00:03:36 freenas afpd[57976] dsi_stream_read: len:0, unexpected EOF
Jul 31 00:03:37 freenas afpd[57976] dsi_stream_read: len:0, unexpected EOF


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x20
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80671127
stack pointer = 0x20:0xffffffff8244bd86f0
frame pointer = 0x20:0xffffffff824bd8730
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 1629 (NLM: master)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 3d21h41m17s
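
When comparing panics between machines, it can help to pull the key fields out of the trap text rather than eyeballing it. Below is a minimal sketch, assuming the `name = value` layout shown in the dump above; the function names `parse_panic` and `summarize` are made up for illustration and are not part of FreeNAS or FreeBSD.

```python
import re

# Excerpt of the trap text from this post (typed from the console).
PANIC_TEXT = """\
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x20
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80671127
current process = 1629 (NLM: master)
trap number = 12
"""


def parse_panic(text):
    """Collect 'name = value' lines from a trap dump into a dict."""
    fields = {}
    for line in text.splitlines():
        # Field names are lowercase words; lines without '=' are skipped.
        m = re.match(r"^\s*([a-z][a-z ]*?)\s*=\s*(.+?)\s*$", line)
        if m:
            fields.setdefault(m.group(1), m.group(2))
    return fields


def summarize(text):
    """Return the fields most useful for comparing two panics."""
    f = parse_panic(text)
    ip = f.get("instruction pointer", "")
    proc = re.match(r"(\d+) \((.*)\)", f.get("current process", ""))
    return {
        "fault_address": f.get("fault virtual address"),
        # Drop the segment selector ("0x20:") and keep the address.
        "instruction_pointer": ip.split(":")[-1] if ip else None,
        "pid": int(proc.group(1)) if proc else None,
        "process": proc.group(2) if proc else None,
    }


if __name__ == "__main__":
    for key, value in summarize(PANIC_TEXT).items():
        print(f"{key}: {value}")
```

Two panics with the same instruction pointer only point at the same code if both machines run the same kernel build, since addresses move between builds. The current process here, `NLM: master`, is the kernel's NFS lock manager thread, so the process field is worth comparing as well.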

Does anyone have any ideas on how to debug this, or why I'm getting some sort of page fault (page not present)?
 

peterh

Patron
Joined
Oct 19, 2011
Messages
315
In my limited personal experience I usually blame hardware for these types of problems. But I have noticed that FreeNAS 8.2.x has a number of occurrences of trap 12, and in your case the same hardware worked flawlessly with 8.0.x. So I have to adjust my view and say "it's likely OS bug(s)".
The next logical step would be to install 8.0.4 and use that instead (I hope you did not upgrade your pools).
 

ben

FreeNAS GUI Developer
Joined
May 24, 2011
Messages
373
There's no pool upgrade to worry about between 8.0.4 and 8.2, but services settings cannot be downgraded.
 

Letni

Explorer
Joined
Jan 22, 2012
Messages
63
I'm in the process of re-imaging my USB drive back to 8.0.4-p3 and will restore my pre-8.2.0-p1 settings DB backup. We'll see where this goes...
 

mankyd

Dabbler
Joined
May 3, 2012
Messages
10
Are you using NFS? It may be related to this: https://support.freenas.org/ticket/1662

This started for me after upgrading from Beta 2 to Beta 4, and it continued with the official release. It's worth adding that I was not connecting via a Windows NFS client; my client was Ubuntu.
 

Letni

Explorer
Joined
Jan 22, 2012
Messages
63
Yes,

Actually I was going to post that it seemed to happen when I was accessing a share via NFS from a Linux VM (32-bit CentOS 6.3, sitting on VMFS)!!!

I'll one-up this bug.

Thanks
 

Letni

Explorer
Joined
Jan 22, 2012
Messages
63
Reverted back to 8.0.4-p3 with my old config. So far it seems OK when initiating large transfers through NFS, which I am almost certain is the root cause. Can someone update the ticket (#1662)? This can also affect VM environments running Linux (others note that physical Linux hosts don't have this issue).
 

kattunga

Cadet
Joined
Aug 1, 2012
Messages
1
I have the same problem here with a fresh install as a VM on VMware ESXi 4.1.
As soon as I connect to an NFS share from a client running Ubuntu 10.04 I get the "Fatal trap 12: page fault while in kernel mode" error.
 

maglaubig

Dabbler
Joined
Sep 29, 2012
Messages
17
I'm getting the EXACT same instruction pointer reference and I can reproduce the error at will. For me, however, it's not NFS related; it's CIFS access. On Win7 Ultimate all I have to do is right-click on multiple files or right-click on a share and wham, FreeNAS crashes. I've been able to reproduce the same result on two separate systems with different processors and motherboards.

The only thing I think might be similar in these cases is the underlying file system. NFS and CIFS should both be running in user mode, so if they crash they shouldn't cause a kernel panic, unless I'm missing something?

Looking at bugs and forum posts, it seems like everyone is talking about NFS. I found bug 1662, which references several other bugs marked as duplicates (1676 and 1666). In all instances it seems like upgrading to the beta version of 8.3 is the fix. Can anyone confirm?
 

jhahn

Cadet
Joined
Sep 28, 2011
Messages
7
Yes, I had the same behavior with version 8.2. Since upgrading to 8.3 beta3, I haven't had any trouble with NFS/CIFS.
 

bjornono

Cadet
Joined
Oct 5, 2012
Messages
5
I started with a fresh install of 8.2 64-bit. I get a KP at random intervals: "kmem_malloc(131072): kmem_map too small". It doesn't even try to use all of the allocated kmem.
I am running 6x2TB WD Caviar Green disks and 8 GB of memory. I have set kmem to 2 GB allocated, 3 GB max. I usually get the KP with 2 GB kmem and several GB free.
Once it has been running with all of the memory in use, it never releases Inactive memory back to Free until I reboot. To me it looks like a serious bug in memory management in 8.2.
I hope this will be fixed in 8.3; otherwise it is a waste of time to run a server that hangs at random intervals.
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
Do not set tunables to limit memory...

Disable autotune and then delete all System->Tunables. Make sure to reboot.
 

bjornono

Cadet
Joined
Oct 5, 2012
Messages
5
Thanks for the advice.
I have upgraded to 8.3 beta3, disabled autotune and deleted all tunable entries. In 8.3 beta3 the memory usage looks very much better than in 8.2 and I haven't had any problems with KP so far.
I got a warning about an outdated zpool with 8.3 beta3. When I try to run "zpool upgrade" it just tells me the system is running zpool version 28 and my zpool is at version 15.
What is the syntax for "zpool upgrade" to upgrade to version 28? I have searched the forum but can't find it anywhere.
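
For reference, the zpool(8) manual page gives the general form as `zpool upgrade [-V version] -a | pool ...`, and `zpool upgrade -v` lists the versions the running system supports. A minimal sketch that only builds the command line follows; the helper name `zpool_upgrade_argv` and the pool name `tank` are hypothetical, and nothing is executed.

```python
def zpool_upgrade_argv(pool=None, version=None):
    """Build the argument list for `zpool upgrade` without running it."""
    argv = ["zpool", "upgrade"]
    if version is not None:
        # -V pins a specific target version instead of the newest supported.
        argv += ["-V", str(version)]
    # A pool name upgrades just that pool; -a upgrades every pool.
    argv.append(pool if pool is not None else "-a")
    return argv


if __name__ == "__main__":
    # "tank" is a placeholder; use the pool name reported by `zpool list`.
    print(" ".join(zpool_upgrade_argv("tank")))
    print(" ".join(zpool_upgrade_argv("tank", version=28)))
    print(" ".join(zpool_upgrade_argv()))
```

A pool upgrade is one-way: once the pool is at version 28, a release that only supports version 15 can no longer import it, so it is worth being sure about staying on 8.3 before running the command.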
 

Stephens

Patron
Joined
Jun 19, 2012
Messages
496

maglaubig

Dabbler
Joined
Sep 29, 2012
Messages
17
UPDATE - 8.3 BETA did the trick, no other tuning or anything required. Plus the CIFS access is WAY faster than it was previously.
 