FreeNAS crash on drive access

Status
Not open for further replies.

WarriorXK

Cadet
Joined
Jun 21, 2012
Messages
6
I seem to have a problem with my FreeNAS installation. Occasionally, when I access my RAID 5 set through the FreeNAS Windows share, FreeNAS just reboots. Once the system has finished rebooting, it seems to work fine.

Some information about my server :

  • FreeNAS runs as a Virtual Machine through ESXi
  • FreeNAS has access to the RAID5 set through ESXi's RDM (Raw Device Mapping)
  • FreeNAS has access to 1 virtual CPU
  • FreeNAS has 1792MB of RAM
  • FreeNAS build is "FreeNAS-8.0.4-RELEASE-p3-x86 (11703)"
  • FreeNAS OS Version is "FreeBSD 8.2-RELEASE-p9"
  • The RAID5 set has been formatted to UFS
  • The RAID5 set runs off an Areca ARC-1222 RAID card
  • The RAID5 set consists of 4xHD201UI

Any help is appreciated. I'd also like to know whether there is a way to look at the FreeNAS logs (if there are any).
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
FreeNAS is based on an embedded variant of FreeBSD, and useful logs may not persist in such a situation. However! You may be in luck. I suggest you try the following:

1) Turn on a serial port and log to file in the ESXi configuration.

2) Turn on serial console in the FreeNAS configuration.

3) Get it to crash.

This is likely to give you a useful record of what happened.
 

WarriorXK

Cadet
Joined
Jun 21, 2012
Messages
6
Thank you. I've just enabled the serial output, and I'll post here again when I've got results.
 

WarriorXK

Cadet
Joined
Jun 21, 2012
Messages
6
It seems to have crashed again. This time, however, the console just locks up instead of rebooting: I can type text or numbers, but pressing Enter does nothing.

The new output file seems to contain an error, but I have no idea what it means, as I am not that knowledgeable about BSD systems.

Code:
Console setup
-------------
        
1) Configure Network Interfaces
2) Configure Link Aggregation
3) Create VLAN Interface
4) Configure Default Route
5) Configure Static Routes
6) Configure DNS
7) Reset WebGUI login credentials
8) Reset to factory defaults
9) Shell
10) Reboot
11) Shutdown

You may try the following URLs to access the web user interface:

http://192.168.0.101/
Enter an option from 1-11: g_vfs_done():ufs/Data[READ(offset=5277655841112064, length=8192)]error = 5

ufs_accessx(): Error retrieving ACL on object (5).

panic: kmem_malloc(4096): kmem_map too small: 335544320 total allocated

cpuid = 0

Uptime: 3h25m35s

Cannot dump. Device not defined or unavailable.


These are the last few lines before it rebooted. The full log (including the last few reboots) is here:
https://dl.dropbox.com/u/3655572/SerialLogFile2.zip
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
"kmem_map too small" basically means the kernel ran out of memory. I wouldn't have expected that on a FFS/UFS-based FreeNAS system, as it is usually ZFS that stresses kernel virtual memory. I don't have any immediate thoughts about things to try other than to raise the memory you've allocated to the VM, or maybe to tune down some of the resources allocated by the kernel.
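To put the number from the panic line in perspective, here is a minimal sketch that converts it to MiB. The helper name `bytes_to_mib` is made up for illustration; only the byte count comes from the log above.

```shell
#!/bin/sh
# bytes_to_mib BYTES: integer conversion from bytes to MiB.
# (Hypothetical helper, not part of FreeNAS or FreeBSD.)
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

# "335544320 total allocated" is taken from the kmem_map panic message.
bytes_to_mib 335544320    # prints 320
```

So the kernel map filled up at 320 MiB, well below the 1792 MB assigned to the VM. That suggests it is the kernel's own memory map that was exhausted, not the VM's RAM as a whole, although the thread does not confirm what consumed it.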

I do note that error 5 is "EIO", "Input/output error", but it's unclear to me whether that's maybe an effect or if it's somehow causing the issue. You could try verifying the readability of each of your component disks just to make sure there are no errors reading them. You can go to shell and do something like

# camcontrol devlist
<WDC WD7500AAKS-00RBA0 30.04G30> at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD7500AAKS-00RBA0 30.04G30> at scbus1 target 0 lun 0 (pass1,ada1)
<WDC WD7500AAKS-00RBA0 30.04G30> at scbus2 target 0 lun 0 (pass2,ada2)
<WDC WD7500AAKS-00RBA0 30.04G30> at scbus3 target 0 lun 0 (pass3,ada3)
< Flash Disk > at scbus6 target 0 lun 0 (pass4,da0)
# foreach i ( 0 1 2 3 )
foreach? dd if=/dev/ada${i} of=/dev/null bs=1048576&
foreach? end
[1] 59311
[2] 59312
[3] 59313
[4] 59314
#

After a few hours all the dd's should finish around the same time, and report the same number of bytes per disk. If one disk finishes early with an error, or if one disk is hanging around an hour longer than the others, you may have a physical disk problem. I'm guessing you don't, but never hurts to check for the obvious.
 

WarriorXK

Cadet
Joined
Jun 21, 2012
Messages
6
I've played around with ESXi's reserved memory settings, hoping that this will somehow fix the issue. If it doesn't help, I'll try increasing the amount of RAM assigned to my FreeNAS VM.

I am curious what you mean by tuning down the kernel resources, and what that would mean for my FreeNAS installation. Could you elaborate on this?

Unfortunately I am unable to do that disk check, since my FreeNAS installation only sees the RAID5 array, not the individual drives:

FreeNAS_Drives.png
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
What is kern.ipc.nmbclusters set to? What is the output of vmstat -z before the panic? I don't really know what to troll around for, actually, but it seems likely something is sized for more memory than you have, and you can tune it down. I would suggest that you actually not consider this to be a FreeNAS problem, though it is likely a result of some FreeNAS design decisions, and if you want to Google around a bit, look for Free*BSD* problems that are similar, as someone has probably already made these mistakes and killed their FreeBSD system in the same manner.

I wouldn't expect the reserved memory settings to make a whit of difference. The surest bet is to jack up the amount of VM memory for FreeNAS and let ESXi manage any necessary swapping. The most effective bet is to research what's causing the crash and tune accordingly (definitely a harder solution than just assigning more memory).

And you can still do a test read on your "da1". Won't hurt. Just in case.
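A minimal sketch of such a single-device read test, written for plain `sh` (not the csh `foreach` syntax used earlier in the thread). The function name `read_test` is hypothetical, and `/dev/da1` is assumed to be the RAID5 volume as FreeNAS sees it; check `camcontrol devlist` for the actual name.

```shell
#!/bin/sh
# read_test DEVICE: read DEVICE from start to end, discard the data,
# and report whether dd finished cleanly or stopped on an error.
# (Hypothetical helper name; dd and its options are standard.)
read_test() {
    dev="$1"
    # bs=1048576 is the same 1 MiB block size used in the earlier dd example.
    if dd if="$dev" of=/dev/null bs=1048576; then
        echo "read OK: $dev"
    else
        echo "read FAILED: $dev"
        return 1
    fi
}

# Example call on the FreeNAS box (run as root; a large array takes hours):
# read_test /dev/da1
```

Because the Areca card presents the array as one device, a clean pass only shows that the volume as a whole is readable. It says nothing about the health of any individual member disk.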
 

WarriorXK

Cadet
Joined
Jun 21, 2012
Messages
6
After a crash, my FreeNAS installation refused to boot without throwing a lot of errors. I had to run something manually from the shell (it appeared to be a chkdsk of sorts), but when I executed it, it kept asking y/n questions. I have no idea how this happened; probably a crash occurred while writing to the drives and something got corrupted.

I have now migrated all my data and the RAID set to my Windows Server 2008 R2 VM, which seems to work better than before and doesn't lock up. I'd like to thank you regardless for all your help with this problem, but I think I am going to use my Windows Server as the fileshare host for now. Maybe later I'll try a newer version of FreeNAS when it comes out.

Thanks again for all your help.
 