Processes are being terminated by signal 6

Status
Not open for further replies.

Ravefiend

Dabbler
Joined
Jun 1, 2011
Messages
37
I'm running FreeNAS 8 with a Windows 7 client over CIFS / Samba, and processes on the NAS are being terminated with signal 6 and signal 11.

System:
  • Intel Xeon X3440 2.53GHz Boxed | BX80605X3440
  • 2 x Kingston 4GB 1333MHz DDR3 ECC Reg CL9 DIMM DR x8 w/TS VLP
  • 7 x Western Digital Caviar GreenPower WD20EARS. 2TB | WD20EARS (RAIDZ2 6+1)
  • Dual Intel® 82574L Gigabit Ethernet Controllers (onboard)
  • OCZ Rally 2 8GB USB 2.0 Flash Drive

The system has been up for a little over 15 days now. During the first few days I moved a few TiB to the NAS without any trouble, and I even tested a few archives on the NAS to confirm they weren't corrupted by the transfer.

Today I wanted to copy over some more data and that failed shortly after. The logs show that smbd is crashing:

Code:
May 24 01:11:52 freenas freenas[1624]: Executing: /usr/bin/killall smbd
May 24 01:11:52 freenas freenas[1624]: Executing: /usr/sbin/service samba quietstart
May 24 10:11:52 freenas freenas: Removing stale Samba tdb files: ......... done
May 24 01:11:59 freenas freenas[1624]: Executing: /usr/bin/killall nmbd
May 24 10:11:59 freenas freenas: No matching processes were found
May 24 01:11:59 freenas freenas[1624]: Executing: /usr/bin/killall smbd
May 24 10:11:59 freenas freenas: No matching processes were found
May 24 01:11:59 freenas freenas[1624]: Executing: /usr/sbin/service samba quietstart
May 24 10:11:59 freenas freenas: Removing stale Samba tdb files:  done
May 24 10:11:59 freenas freenas: Starting nmbd.
May 24 10:11:59 freenas freenas: Starting smbd.
May 26 07:15:38 freenas sshd[18451]: login_getclass: unknown class 'root'
May 26 07:15:57 freenas last message repeated 2 times
May 26 10:14:12 freenas ntpd[3372]: kernel time sync status change 6001
May 26 10:26:56 freenas ntpd[3372]: kernel time sync status change 2001
May 27 08:11:28 freenas sshd[47889]: login_getclass: unknown class 'root'
May 27 08:11:31 freenas sshd[47889]: login_getclass: unknown class 'root'
May 31 18:26:17 freenas ntpd[3372]: kernel time sync status change 6001
May 31 18:43:23 freenas ntpd[3372]: kernel time sync status change 2001
Jun  1 06:18:47 freenas kernel: pid 98482 (smbd), uid 0: exited on signal 6
Jun  1 06:18:48 freenas kernel: pid 98490 (smbd), uid 0: exited on signal 6
Jun  1 06:18:48 freenas kernel: pid 98491 (smbd), uid 0: exited on signal 6
Jun  1 06:19:19 freenas kernel: pid 98495 (smbd), uid 0: exited on signal 6
Jun  1 06:19:20 freenas kernel: pid 98496 (smbd), uid 0: exited on signal 6
Jun  1 06:19:24 freenas kernel: pid 98497 (smbd), uid 0: exited on signal 6
Jun  1 06:19:24 freenas kernel: pid 98498 (smbd), uid 0: exited on signal 6
Jun  1 06:19:27 freenas kernel: pid 98499 (smbd), uid 0: exited on signal 6
Jun  1 06:19:27 freenas kernel: pid 98500 (smbd), uid 0: exited on signal 6
Jun  1 06:19:28 freenas kernel: pid 98501 (smbd), uid 0: exited on signal 6
Jun  1 06:19:28 freenas kernel: pid 98502 (smbd), uid 0: exited on signal 6
Jun  1 06:19:28 freenas kernel: pid 98503 (smbd), uid 0: exited on signal 6
Jun  1 06:19:29 freenas kernel: pid 98504 (smbd), uid 0: exited on signal 6
Jun  1 06:19:29 freenas kernel: pid 98505 (smbd), uid 0: exited on signal 6
Jun  1 06:19:29 freenas kernel: pid 98506 (smbd), uid 0: exited on signal 6
Jun  1 06:19:30 freenas kernel: pid 98507 (smbd), uid 0: exited on signal 6
Jun  1 06:19:30 freenas kernel: pid 98508 (smbd), uid 0: exited on signal 6
Jun  1 06:19:46 freenas kernel: pid 98509 (smbd), uid 0: exited on signal 6
Jun  1 06:20:01 freenas kernel: pid 98514 (python), uid 0: exited on signal 11
Jun  1 06:22:55 freenas freenas[1624]: Executing: /usr/bin/killall nmbd
Jun  1 06:22:55 freenas freenas[1624]: Executing: /usr/bin/killall smbd
Jun  1 06:22:55 freenas freenas[1624]: Executing: /usr/sbin/service samba quietstart
Jun  1 15:22:55 freenas freenas: Removing stale Samba tdb files: ........ done
Jun  1 06:23:02 freenas freenas[1624]: Executing: /usr/bin/killall nmbd
Jun  1 15:23:02 freenas freenas: No matching processes were found
Jun  1 06:23:02 freenas freenas[1624]: Executing: /usr/bin/killall smbd
Jun  1 15:23:02 freenas freenas: No matching processes were found
Jun  1 06:23:02 freenas freenas[1624]: Executing: /usr/sbin/service samba quietstart
Jun  1 15:23:02 freenas freenas: Removing stale Samba tdb files:  done
Jun  1 15:23:02 freenas freenas: Starting nmbd.
Jun  1 15:23:02 freenas freenas: Starting smbd.
Jun  1 06:23:09 freenas kernel: pid 98720 (smbd), uid 0: exited on signal 6
Jun  1 06:23:09 freenas kernel: pid 98721 (smbd), uid 0: exited on signal 6
Jun  1 06:23:09 freenas kernel: pid 98722 (smbd), uid 0: exited on signal 6
Jun  1 06:23:10 freenas kernel: pid 98723 (smbd), uid 0: exited on signal 6
Jun  1 06:23:10 freenas kernel: pid 98724 (smbd), uid 0: exited on signal 6
Jun  1 06:23:11 freenas kernel: pid 98726 (smbd), uid 0: exited on signal 6
Jun  1 06:23:11 freenas kernel: pid 98727 (smbd), uid 0: exited on signal 6
Jun  1 06:23:11 freenas kernel: pid 98728 (smbd), uid 0: exited on signal 6
Jun  1 06:23:12 freenas kernel: pid 98729 (smbd), uid 0: exited on signal 6
Jun  1 06:23:12 freenas kernel: pid 98730 (smbd), uid 0: exited on signal 6
Jun  1 06:23:12 freenas kernel: pid 98732 (smbd), uid 0: exited on signal 6
Jun  1 06:23:12 freenas kernel: pid 98733 (smbd), uid 0: exited on signal 6
Jun  1 06:25:00 freenas kernel: pid 98812 (python), uid 0: exited on signal 11
Jun  1 06:25:08 freenas kernel: pid 43454 (csh), uid 0: exited on signal 11 (core dumped)
Jun  1 15:25:39 freenas login: ROOT LOGIN (root) ON ttyv4
Jun  1 06:26:17 freenas kernel: pid 98850 (csh), uid 0: exited on signal 11 (core dumped)
Jun  1 15:26:22 freenas login: ROOT LOGIN (root) ON ttyv4
Jun  1 06:30:00 freenas kernel: pid 99096 (python), uid 0: exited on signal 11
Jun  1 06:35:00 freenas kernel: pid 99368 (python), uid 0: exited on signal 11
Jun  1 06:40:00 freenas kernel: pid 99627 (python), uid 0: exited on signal 11
Jun  1 15:42:39 freenas sshd[99764]: login_getclass: unknown class 'root'
Jun  1 15:42:43 freenas sshd[99764]: login_getclass: unknown class 'root'
Jun  1 06:45:01 freenas kernel: pid 99971 (python), uid 0: exited on signal 11
Jun  1 06:50:00 freenas kernel: pid 634 (python), uid 0: exited on signal 11


As you can see, smbd dies with signal 6 and the periodically executed python jobs fail with signal 11. Even a plain csh crashes, so something is clearly wrong at the OS level. Top shows me the following information:

Code:
last pid:  1231;  load averages:  0.03,  0.01,  0.00                                                                                                                             up 15+03:39:59  15:55:26
47 processes:  1 running, 46 sleeping
CPU:  0.1% user,  0.0% nice,  0.1% system,  0.0% interrupt, 99.8% idle
Mem: 58M Active, 21M Inact, 4431M Wired, 22M Cache, 142M Buf, 1621M Free
Swap: 4096M Total, 54M Used, 4042M Free, 1% Inuse

  PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1711 root          7  44    0 70160K  3828K ucond   0   6:53  0.00% collectd
 7570 root          1  44    0  9224K  1484K select  6   2:45  0.00% top
86474 root          1  44    0 15560K   868K nanslp  0   2:35  0.00% zpool
 1624 root          6  44    0   139M 55492K uwait   2   1:09  0.00% python
 2055 www           1  44    0 19328K  2032K kqread  2   0:54  0.00% lighttpd
 3372 root          1  44    0 11780K   892K select  2   0:24  0.00% ntpd
 1821 root          1  47    0  7832K   388K nanslp  4   0:04  0.00% cron
...


Correct me if I'm wrong here, but there's clearly more than enough memory available. What could explain processes crashing like this? I'd like to figure this out before restarting FreeNAS, as a restart will surely fix things, but possibly only temporarily.
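To put numbers on that Mem: line, here is a minimal sketch that parses it and reports how much of the memory is wired. It assumes the values are all in megabytes and that Buf overlaps with the other categories, so it is left out of the total; the helper name parse_top_mem is made up for illustration. On FreeBSD the ZFS ARC is generally accounted as wired memory, so a large Wired figure next to a healthy Free figure does not by itself rule out memory pressure.

```python
import re

def parse_top_mem(line):
    """Turn a top(1) 'Mem:' line into a dict of megabytes per category.

    Only handles values with an 'M' suffix, which is all this thread shows.
    """
    fields = re.findall(r"(\d+)M (\w+)", line)
    return {name: int(value) for value, name in fields}

mem = parse_top_mem(
    "Mem: 58M Active, 21M Inact, 4431M Wired, 22M Cache, 142M Buf, 1621M Free"
)

# Assumption: Buf overlaps with the other categories, so it is excluded here.
total = sum(value for name, value in mem.items() if name != "Buf")
wired_share = 100.0 * mem["Wired"] / total

print("Wired: %dM of %dM (%.0f%%), Free: %dM"
      % (mem["Wired"], total, wired_share, mem["Free"]))
```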

Edit: Disk space can't explain things either:

Code:
freenas# df 
Filesystem             1K-blocks        Used      Avail Capacity  Mounted on
/dev/ufs/FreeNASs1a        468735     409051      22185    95%    /
devfs                           1          1          0   100%    /dev
/dev/md0                     4526       2316       1848    56%    /etc
/dev/md1                      686         10        622     2%    /mnt
/dev/md2                    76526      28194      42210    40%    /var
/dev/ufs/FreeNASs4          20263        892      17750     5%    /data
mpool                   953679750         21  953679729     0%    /mnt/mpool
rpool                  4308178077         56 4308178021     0%    /mnt/rpool
...
 
Joined
May 27, 2011
Messages
566
Signal 11 is a segfault: the program tried to access memory it does not have permission to use.

Is this a default install, or have you been changing things?
Can you post your smb.conf file?

Can you increase your Samba logging to level 3 and post the log file after a crash?
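For reference, signal 6 is SIGABRT (the process aborted itself, typically via abort() after an internal check failed), and signal 11 is SIGSEGV. A rough sketch for tallying the kernel's "exited on signal" lines per process, which makes a long pasted log easier to summarize; the function name tally_signals and the small name table are illustrative only and not part of FreeNAS:

```python
import re
from collections import Counter

# Names for the two signal numbers seen in this thread.
SIGNAL_NAMES = {6: "SIGABRT", 11: "SIGSEGV"}

# Matches kernel lines such as:
#   "... kernel: pid 98482 (smbd), uid 0: exited on signal 6"
EXIT_RE = re.compile(r"pid (\d+) \(([^)]+)\), uid \d+: exited on signal (\d+)")

def tally_signals(log_text):
    """Count crashes per (process name, signal name) in syslog text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXIT_RE.search(line)
        if match:
            signo = int(match.group(3))
            name = SIGNAL_NAMES.get(signo, "signal %d" % signo)
            counts[(match.group(2), name)] += 1
    return counts

# Three lines copied from the log in the first post.
sample = """\
Jun  1 06:18:47 freenas kernel: pid 98482 (smbd), uid 0: exited on signal 6
Jun  1 06:20:01 freenas kernel: pid 98514 (python), uid 0: exited on signal 11
Jun  1 06:25:08 freenas kernel: pid 43454 (csh), uid 0: exited on signal 11 (core dumped)
"""

for (proc, sig), n in sorted(tally_signals(sample).items()):
    print(proc, sig, n)
```

Feeding it the full /var/log/messages text instead of the sample would show at a glance which programs are aborting and which are segfaulting.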
 

Ravefiend

Dabbler
Joined
Jun 1, 2011
Messages
37
This is a default install of FreeNAS 8.0, though I set up my zpool configuration via GPT labels / partitions. That works quite well, and the datasets were created via the WebUI. I haven't made any changes to kernel parameters via loader.conf or the like.

I've gone ahead and stopped the CIFS service and changed the log level to 3 (via smb.conf). After I started CIFS again, the smbd process crashed the moment I tried to access my NAS via \\STORAGE\...., so I didn't even have to copy anything to it.

Code:
...
Jun  3 02:10:00 freenas kernel: pid 42968 (python), uid 0: exited on signal 11
Jun  3 02:15:00 freenas kernel: pid 43031 (python), uid 0: exited on signal 11
Jun  3 02:20:00 freenas kernel: pid 43091 (python), uid 0: exited on signal 11
Jun  3 02:25:00 freenas kernel: pid 43154 (python), uid 0: exited on signal 11
Jun  3 11:28:01 freenas sshd[43193]: login_getclass: unknown class 'root'
Jun  3 11:28:04 freenas sshd[43193]: login_getclass: unknown class 'root'
Jun  3 02:30:00 freenas kernel: pid 43331 (python), uid 0: exited on signal 11
Jun  3 02:31:28 freenas kernel: pid 43497 (smbd), uid 0: exited on signal 6
Jun  3 02:31:28 freenas kernel: pid 43498 (smbd), uid 0: exited on signal 6
Jun  3 02:31:41 freenas kernel: pid 43514 (smbd), uid 0: exited on signal 6
Jun  3 02:31:43 freenas kernel: pid 43517 (smbd), uid 0: exited on signal 6
Jun  3 11:32:04 freenas sshd[43542]: login_getclass: unknown class 'root'
Jun  3 11:32:04 freenas sshd[43542]: login_getclass: unknown class 'root'
...


Uploaded the logs and smb.conf as you requested:
View attachment log.smbd.txt | View attachment smb.conf.txt | View attachment log.nmbd.old.txt | View attachment log.nmbd.txt | View attachment log.smbd.old.txt

Configuration-wise, these are the highlights of the CIFS service:

Large RW support : [enabled]
Send files with sendfile(2) : [enabled]
EA Support : [disabled]
Support DOS File Attributes : [enabled]
Auxiliary parameters : socket options = SO_KEEPALIVE TCP_NODELAY IPTOS_LOWDELAY
Enable AIO : [enabled]
Minimal AIO read size : 8192
Minimal AIO write size : 8192

The only other service running is SSH. Do note that not only smbd is crashing. Even the following simple action causes csh to crash:

1. Open up an SSH session as root
2. Type cd /var/log[tab]
--> connection closed and on the console I get "Jun 3 02:43:29 freenas kernel: pid 43194 (csh), uid 0: exited on signal 11"

In addition, notice the python process crashing every 5 minutes. So it seems that all running processes are OK, yet as soon as something tries to allocate a little too much memory, it crashes.
 
ixdwhite

Guest
Running FreeNAS 8 with less than 6GB RAM (or on i386) is extremely dicey right now due to ZFS's ARC being too hungry and depriving the system of needed memory. It's on my list to figure out how to tune ZFS to temper its appetite and not squeeze the rest of the system out.

If you're feeling adventurous -- I'm not responsible for your FreeNAS box eating your cat if you change these -- look at these sysctls:
vfs.zfs.arc_min
vfs.zfs.arc_max

Take the current value of vfs.zfs.arc_max, cut it in half, and add it to /boot/loader.conf and reboot and see if that stabilizes things. If that is below arc_min then make arc_min match it.

It's also possible your system simply has a bad stick of RAM and is randomly zeroing out chunks of memory which causes problems (and usually panics on top of program malfunctions.) memtest86+ would be a good thing to run on the machine to make sure there are no issues there.
 

Ravefiend

Dabbler
Joined
Jun 1, 2011
Messages
37
Running FreeNAS 8 with less than 6GB RAM (or on i386) is extremely dicey right now due to ZFS's ARC being too hungry and depriving the system of needed memory. It's on my list to figure out how to tune ZFS to temper its appetite and not squeeze the rest of the system out.

I understood that point quite well up front when I started with FreeNAS, so I have the system running with 2 x Kingston 4GB Registered ECC memory (8GB in total) on FreeNAS 8.0 (amd64). I consider this to be a 'supported' configuration.

If you're feeling adventurous -- I'm not responsible for your FreeNAS box eating your cat if you change these -- look at these sysctls:
vfs.zfs.arc_min
vfs.zfs.arc_max

Take the current value of vfs.zfs.arc_max, cut it in half, and add it to /boot/loader.conf and reboot and see if that stabilizes things. If that is below arc_min then make arc_min match it.

Current sysctl values, all defaults since I have not modified /boot/loader.conf yet:
  • vfs.zfs.prefetch_disable : 0
  • vfs.zfs.vdev.min_pending : 4
  • vfs.zfs.vdev.max_pending : 10
  • vfs.zfs.cache_flush_disable : 0
  • vfs.zfs.arc_min : 900980224
  • vfs.zfs.arc_max : 7207841792

All the other settings are available in 110607_sysctl_signal_6_11.txt, along with a list of all running processes.
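Applying the rule of thumb quoted above to these values can be sketched as follows. This only does the arithmetic and prints lines in the name="value" form that /boot/loader.conf uses; the function name halved_arc_tunables is made up for illustration, and whether halving is the right amount for this box is untested.

```python
def halved_arc_tunables(arc_min, arc_max):
    """Halve arc_max, and lower arc_min to match if the new arc_max
    ends up below it (the rule of thumb quoted above)."""
    new_max = arc_max // 2
    new_min = min(arc_min, new_max)
    return new_min, new_max

# Current defaults reported by sysctl on this box.
new_min, new_max = halved_arc_tunables(900980224, 7207841792)

# Candidate lines for /boot/loader.conf (take effect after a reboot).
print('vfs.zfs.arc_min="%d"' % new_min)
print('vfs.zfs.arc_max="%d"' % new_max)
```

With the defaults above, the halved arc_max stays above arc_min, so arc_min would not need to change.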

It's also possible your system simply has a bad stick of RAM and is randomly zeroing out chunks of memory which causes problems (and usually panics on top of program malfunctions.) memtest86+ would be a good thing to run on the machine to make sure there are no issues there.

Yes, I'm still keeping this option open, though I find it less likely given the current state of the system. That is to say, the periodic python job crashes every 5 minutes with a signal 11, and I can always reproduce the csh crash as explained above. For that reason, I'm more inclined to believe that the operating system (or one of the running processes) is the cause of it all.
 

Ravefiend

Dabbler
Joined
Jun 1, 2011
Messages
37
Perhaps a silly question, but does FreeNAS have any debug tools available that would let me troubleshoot the crashing processes?
 