Spontaneous reboots

Status
Not open for further replies.

dmt0

Dabbler
Joined
Oct 28, 2011
Messages
47
Hi all!

My FreeNAS box rebooted itself spontaneously twice in the last ten minutes.

How do I start investigating the issue? Which log should I look at? Does it reboot itself when it gets a kernel panic?

Background: I ran version 8.0 for a year without issues. I just installed 8.0.2 and went through a lot of trouble with GPT corruption, etc. I fixed everything, and now I have spontaneous reboots. I'm running the 32-bit version with 4 GB of RAM, booting off a CF card, with a ZFS array of 4 disks.

Please help; I have no idea where to start and know nothing about BSD.
 

dmt0

Dabbler
Joined
Oct 28, 2011
Messages
47
Now the thing froze, saying something like:

Code:
panic: kmem_malloc (...): kmem_map too small 329928704 total allocated cpuid=1

repeated 3 times with a different '...' each time.

What do I do? Is this something to fix in loader.conf?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,858
How did you perform the upgrade? Going from 8.0 to any newer version requires installation from the ISO file, and the boot device needs 2 GB or more (changed from 1 GB).

My recommendation: grab either a different CF card or a USB thumb drive (2 GB or more) and remove your current boot device. Download the 8.0.2-RELEASE ISO file and burn it to CD. Insert the new CF card or USB thumb drive in the computer and boot the CD. Follow the installation as a new install; do not upgrade (in case the CF/USB has a previous image). Now reboot. It's possible it will boot twice, but it should not go further than that.

Is there a reason you are running the 32-bit version rather than the 64-bit version? What CPU do you have?

-Joe
 

dmt0

Dabbler
Joined
Oct 28, 2011
Messages
47
Hi Joe, thanks for the reply.

I upgraded by doing a fresh install. I used the dd command on Linux to get the image onto the CF card. Was there any way to do it other than a new install? I don't think so.
My CF card is 8 GB, which should be enough.
The CPU is a P4, so 64-bit is unfortunately not an option. But it was sufficient until now.

Basically, I don't believe a reinstall is the way to go, because the initial install was done pretty much per your instructions.

Here's my loader.conf:

> cat /boot/loader.conf
#
# Boot loader file for FreeNAS. This relies on a hacked beastie.4th.
#
autoboot_delay="2"
loader_logo="freenas"
#Fix booting from USB device bug
kern.cam.boot_delay=10000

# GEOM support
geom_mirror_load="YES"
geom_stripe_load="YES"
geom_raid3_load="YES"
#geom_raid5_load="YES"
geom_gate_load="YES"
ntfs_load="YES"
smbfs_load="YES"
xhci_load="YES"

hw.hptrr.attach_generic=0

Unchanged since the installation.

The kmem_map is 330 MB, as I understand it.
Is that a good number for a 4 GB machine? Or am I looking in a totally wrong direction and this isn't the issue at all?

The FAQ says to set the kmem_map to half your RAM, except that there's a limit of 768 MB. So I tried setting it to that limit:

vm.kmem_size=805306368
vm.kmem_size_max=805306368
vfs.zfs.arc_max=402653184

That panics on boot:
kmem_suballoc: bad status return of 3

I tried setting it to 512 MB; the system boots, but it's still unstable as soon as I try to copy stuff over NFS.

So 512 MB is too small and 768 MB is too big.
Is anything possible here without a kernel recompile?
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Hi again dmt0,

I have a 32-bit build of 8.0.2 compiled with KVA_PAGES=512, which will let you tune things in loader.conf much more effectively. Your settings are too low for 4 GB of RAM. Send me a PM and I'll send you a link.
 

dmt0

Dabbler
Joined
Oct 28, 2011
Messages
47
I'm using the version compiled by protosd with KVA_PAGES=512.
I went through a few settings; at this point I have the following in my loader.conf:

vm.kmem_size="1536M"
vm.kmem_size_max="2048M"
vfs.zfs.arc_min="256M"
vfs.zfs.arc_max="1024M"
vfs.zfs.prefetch_disable="0"

The system is almost stable, but it still crapped out a couple of times. I can't identify the trigger yet - whether it's a lot of small files being copied over, a big volume of files, etc.
I can't set vm.kmem_size to 2 GB, since the kernel panics at startup, while 1536M does not always seem to be enough.

Either way, the system is much more stable now, but I have to tinker a bit more.

protosd, thanks again for your rescue, and yes, I definitely agree that this should be in the official i386 release.
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
Note that setting kmem_size and kmem_size_max to different values doesn't accomplish anything by itself. There is an algorithm in the kernel that runs at boot to determine the size of the kmem_map, and these two variables are parameters it uses to compute the actual size. Having the two differ is useful only if you are going to ship the same settings to any number of machines with unknown configurations.

Generally setting the ARC to a SMALL value increases stability at the possible expense of throughput. I would try setting arc_max to 10M (and remove the setting for arc_min) to see how that works out.
 

dmt0

Dabbler
Joined
Oct 28, 2011
Messages
47
So the guy with 768M RAM did this:

vm.kmem_size="330M"
vm.kmem_size_max="330M"
vfs.zfs.arc_max="40M"
vfs.zfs.vdev.cache.size="5M"


praecorloth with 1.5G RAM did this:

vm.kmem_size="512M"
vm.kmem_size_max="512M"
vfs.zfs.arc_max="60M"
vfs.zfs.vdev.cache.size="10M"


so me with 4G RAM decided to do this:

vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
#vfs.zfs.arc_min="256M"
vfs.zfs.arc_max="128M"
vfs.zfs.vdev.cache.size="16M"
vfs.zfs.prefetch_disable="0"

The values were chosen quite arbitrarily, but now the system is stable. My issue seemed similar to praecorloth's, in the sense that it would panic after a certain volume of data had been copied to it. That doesn't happen any more - it's stable, but the performance is bursty.
I added this:

vfs.zfs.txg.timeout="5"

The bursts became more even - around 10 seconds of activity followed by 10 seconds of none, doing around 30 MB/s when active (or 15 MB/s on average).
I used to get better performance before this last upgrade, though.
The network interface has an MTU of 9000 set.
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
Very cool, good progress. I was confusing arc_max with cache_size when I suggested 10M, but it looks like you found reasonable settings. Keep massaging them until you find something that works and stays stable.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
so me with 4G RAM decided to do this:

vm.kmem_size="1536M"
vm.kmem_size_max="1536M"

You can just delete or comment out vm.kmem_size, because having them both the same doesn't do any good. Then vm.kmem will just use what it needs, UP TO the max.

Glad to hear you've made it to more stable territory and can move on to getting stuff copied around.
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
Having them different doesn't do any good! :D

As I mentioned earlier, the size of the kmem map is determined just once at boot. The algorithm uses kmem_size and kmem_size_max to determine the map size. When I'm back on a full-sized computer instead of a phone, I'll dig up a link to the algorithm. At any rate, the kmem map cannot 'grow' on a running system; it's initialized once, and when it fills up the kernel panics.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
That's why I said
You can just delete or comment out vm.kmem_size, because having them both the same doesn't do any good. Then vm.kmem will just use what it needs, UP TO the max.
 

dmt0

Dabbler
Joined
Oct 28, 2011
Messages
47
No, guys.
When I left kmem_size out and just set kmem_size_max=2GB, the thing panicked with a kmem value of 1 GB.
I've never looked at the algorithm, but to me it makes sense to set the lower bound.
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
No, guys.
When I left kmem_size out and just set kmem_size_max=2GB, the thing panicked with a kmem value of 1 GB.
I've never looked at the algorithm, but to me it makes sense to set the lower bound.
Yes, that is what I have been saying. Proto says you can just leave it out; I don't believe that. The algorithm is in kern_malloc.c (see for instance http://www.leidinger.net/FreeBSD/dox/kern/html/d9/d5a/kern__malloc_8c_source.html). Basically it goes:

- Step 1: If kmem_size_scale is set, limit kmem_size to (1/kmem_size_scale) * physical_memory_size
- Step 2: If the resulting kmem_size < kmem_size_min, set kmem_size = kmem_size_min
- Step 3: If the resulting kmem_size > kmem_size_max, set kmem_size = kmem_size_max
- Step 4: If kmem_size is set directly, set kmem_size to that manually tuned value
- Step 5: If kmem_size > 2 * physical_memory_size, set kmem_size = 2 * physical_memory_size

All of this happens between lines 700 and 755 in the link I provided.
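To make the clamping order concrete, here is a toy Python model of those five steps. This is a sketch, not the kernel code: the function name, signature, and defaults are mine, with the argument names mirroring the loader tunables (vm.kmem_size_scale, vm.kmem_size_min, vm.kmem_size_max, vm.kmem_size).

```python
def kmem_map_size(physmem, scale=None, size_min=0, size_max=None, size=None):
    """Toy model of the kmem_size clamping described in the five steps above."""
    v = physmem // scale if scale else physmem  # Step 1: scale down by 1/scale
    if v < size_min:                            # Step 2: raise to the minimum
        v = size_min
    if size_max is not None and v > size_max:   # Step 3: clamp to the maximum
        v = size_max
    if size is not None:                        # Step 4: manual value wins
        v = size
    return min(v, 2 * physmem)                  # Step 5: hard cap at 2x RAM

GB = 1 << 30
MB = 1 << 20

# i386 default (scale=3) on a 4 GB box: first guess is ~1365 MB,
# so a kmem_size_max of 2 GB never kicks in.
print(kmem_map_size(4 * GB, scale=3) // MB)                  # 1365

# Same box with kmem_size pegged directly, as dmt0 did:
print(kmem_map_size(4 * GB, scale=3, size=1536 * MB) // MB)  # 1536
```

Note how Step 4 runs after Steps 2 and 3, which is why pegging vm.kmem_size directly bypasses the min/max clamping entirely.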
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Durk, you can just leave it out. I used to have it set the way dmt0 has it, but with max=2G. I actually sent him my loader.conf with it commented out, and I haven't had any issues. I'll take a look at the link you posted too. I'm not trying to say you're wrong; I think it was actually suggested somewhere else to do what I've done (set the max and not the min). I thought it was strange, but I tried it and it worked.
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
It's interesting. Do you have kmem_size_min set? This is on an i386 build, right? I think kmem_size_scale defaults to 3 on that platform, which means the kernel's first guess at kmem_size is 1/3 of physical RAM - about 1.3 GB on a 4 GB system. At that point kmem_size_max shouldn't make a difference any more, whether you set it to 2 GB or 1.5 GB, since either is already bigger than the first guess.

What is the actual value of kmem_size that you see in the output of "sysctl -a"? How about kmem_size_min, kmem_size_max, and kmem_size_scale? Sorry, bit of a derailment of the thread, I guess, but it looks like dmt0 has his original problem semi under control, and this discussion of the kmem_map size is still relevant. I don't think a lot of people know how these settings are really used by the kernel in preparation for the final suballoc call (line 774 in my link).
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
I'm using amd64. The build I posted for dmt0 is i386, 8.0.2-RELEASE compiled with KVA_PAGES=512. I thought it was strange that dmt0's system hung on boot with my settings and then worked after setting kmem_size.

Do you have kmem_size_min set?

No

Here are my kmem values from sysctl:

Code:
vm.kmem_map_free: 1018548224
vm.kmem_map_size: 962711552
vm.kmem_size_scale: 1
vm.kmem_size_max: 2147483648
vm.kmem_size_min: 0
vm.kmem_size: 2147483648

(I've gotta run out for awhile, I'll check back here later)
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
Code:
vm.kmem_map_free: 1018548224
vm.kmem_map_size: 962711552
vm.kmem_size_scale: 1
vm.kmem_size_max: 2147483648
vm.kmem_size_min: 0
vm.kmem_size: 2147483648
Those values make sense given the kernel variables you have. I'm not sure how much RAM is in your box, but let's say it's 8 GB. Step 1 of the algorithm sets the candidate kmem_size to 8 GB, because on amd64 kmem_size_scale defaults to 1 (as your output shows). Then Step 3 limits the candidate kmem_size to 2 GB (your kmem_size_max). At that point, setting kmem_size directly in loader.conf won't change anything.
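As a quick sanity check of that arithmetic, here is a sketch in Python; the variable names are mine, and the inputs are the assumptions from the paragraph above (an amd64 box, scale of 1, max of 2 GB):

```python
physmem = 8 * (1 << 30)               # assumed 8 GB of physical RAM
scale = 1                             # amd64 default, per the sysctl output
size_max = 2 * (1 << 30)              # vm.kmem_size_max from loader.conf

candidate = physmem // scale          # Step 1: candidate is all 8 GB
kmem_size = min(candidate, size_max)  # Step 3: clamped down to the 2 GB max
print(kmem_size)                      # 2147483648, matching vm.kmem_size above
```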

In my case I have none of the kmem variables in loader.conf, and my kmem_size is actually 8 GB according to "sysctl -a". That makes sense again, because this is an amd64 system with 8 GB of RAM; only Step 1 of the algorithm does anything in this default case.

In general, if you want to peg kmem_size to a particular value of your choosing, it makes more sense to leave out kmem_size_min and kmem_size_max and just specify kmem_size directly. That way Steps 2 and 3 are skipped, and Step 4 sets the value to whatever you specified.

I think there used to be an if-statement around Step 4 in the code, so that to truly peg the size to a hardwired value you had to start out with a very small value (by changing the scale to an arbitrarily high number, say 1000). But looking at the code now, Step 4 overrides whatever value has been computed up to that point.

I haven't looked at the actual code in the kernel build FN8 uses, by the way; my build machine is turned off. It's possible the algorithm is now slightly different again.
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
Thanks for that explanation, Durk. I have 4 GB of RAM, which is why I thought my settings should work for dmt0 too. I haven't had a chance to look at the source yet, but your explanation left me with one burning question...
At that point setting kmem_size directly in loader.conf won't change anything.

If setting kmem_size in loader.conf won't change anything, then what's the point of having it? I mean, it obviously has some effect.

I'm not trying to peg my kmem to any fixed value; I thought the point of setting kmem_size_max was to say 'use what you need, up to the MAX'.

Since learning that the amd64 build is also intended for Intel 64-bit processors, this hasn't been an issue for me, but I know a lot of people are dragging their old hardware out of the closet expecting ZFS to do miracles on 'bread and water' ;-)

Sorry, today has been (and still is) one of those days for me, and I'm just not in the frame of mind to dig into this right now, but thanks for clearing up some of the details.
 