Exited on Signal 11 (again) and now high memory utilization.

EvanVanVan · Sep 2, 2014

Just a little background: yesterday I messed up my CIFS share, so I re-uploaded my config file and decrypted my drives. Then I transferred several 10GB+ files over. Usually I did one transfer at a time, but at the end I transferred 3 separate files at once (I mean each transfer had its own progress window/bar).

I got another security email about exiting on signal 11 last night. Here's what the log says:

Code:

Sep  1 14:43:41 freenas winbindd[9917]:   initialize_winbindd_cache: clearing cache and re-creating with version number 2

Sep  1 20:03:20 freenas kernel: pid 9920 (smbd), uid 1001: exited on signal 11
Sep  1 20:45:37 freenas kernel: pid 12498 (smbd), uid 1001: exited on signal 11

I included the first line just to show that for 6 hours before the "exited on signal 11" everything was fine. (That line at 14:43:41 is from when I restarted my server while re-uploading the config.)
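For anyone who wants to pull these crash lines out of a longer log, here is a minimal sketch. It assumes the kernel message format shown in the excerpt above; the names `CRASH_RE`, `find_crashes`, and `SAMPLE` are made up for illustration, and the sample data is just the three log lines from this post.

```python
import re

# Matches kernel lines in the format shown in the post, for example:
#   Sep  1 20:03:20 freenas kernel: pid 9920 (smbd), uid 1001: exited on signal 11
CRASH_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"pid\s+(?P<pid>\d+)\s+\((?P<proc>[^)]+)\),\s+uid\s+(?P<uid>\d+):\s+"
    r"exited on signal\s+(?P<sig>\d+)"
)

def find_crashes(lines):
    """Return (timestamp, process, pid, signal) for each crash line."""
    crashes = []
    for line in lines:
        m = CRASH_RE.match(line.strip())
        if m:
            crashes.append((m["stamp"], m["proc"], int(m["pid"]), int(m["sig"])))
    return crashes

# Sample input: the three log lines quoted in the post.
SAMPLE = """\
Sep  1 14:43:41 freenas winbindd[9917]:   initialize_winbindd_cache: clearing cache and re-creating with version number 2
Sep  1 20:03:20 freenas kernel: pid 9920 (smbd), uid 1001: exited on signal 11
Sep  1 20:45:37 freenas kernel: pid 12498 (smbd), uid 1001: exited on signal 11
""".splitlines()

for crash in find_crashes(SAMPLE):
    print(crash)
```

Only the two smbd lines match; the winbindd line is ignored because it is not a kernel "exited on signal" message.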

Anyway, I know signal 11 usually has to do with RAM. I ran 3-4 passes of MemTest at the beginning of August, when I first got "exited on signal 11" security emails, and my RAM checked out fine.

This time, though, I went to the Reporting screen and looked at memory. Here are the pics:

Before 17:30 in the first picture my PMU was flat at 10GB.
I don't know exactly when I was transferring files, but it had to be about when my memory utilization goes way up in the first pic.
I included the second and third pics just to show the weird dashes in the lines (unless that's just normal, slight utilization fluctuation.)

Right now (6:20 the next day), my memory utilization is still flat-lining at the same ~24GB. The file transfers have been done since maybe 10pm last night.

I did have one error in the transfer: at the end of an 11GB file, Windows popped up saying the network drive was inaccessible. The file hadn't finished transferring and I was forced to redo it...

Is that memory utilization normal? What about the exited on signal 11?

Thanks

edit: I just upgraded from 9.2.1.3 -> 9.2.1.7. I don't know if that's going to fix anything, but I'm going to transfer some files and watch memory utilization. Probably won't test it until tonight though.

anodos · Sep 2, 2014

ZFS uses your RAM as cache. Unused RAM = wasted RAM. I believe the message was about CIFS crashing. It's hard to say why that happened without looking through logs, configuration files, and testing. I'd say don't worry about it unless the issue keeps occurring.
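To check how much of that RAM the ZFS cache (the ARC) is actually holding, something like the sketch below could be run on the NAS. It assumes a FreeBSD-based system with ZFS loaded, where the ARC size is reported by the `kstat.zfs.misc.arcstats.size` sysctl. The helper names `to_gib` and `arc_size_bytes` are made up for illustration.

```python
import subprocess

def to_gib(num_bytes):
    """Convert a byte count to GiB, rounded to one decimal place."""
    return round(num_bytes / 2**30, 1)

def arc_size_bytes():
    """Read the current ARC size via sysctl (assumes FreeBSD with ZFS loaded)."""
    out = subprocess.run(
        ["sysctl", "-n", "kstat.zfs.misc.arcstats.size"],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())

# On the NAS itself:
#   print(to_gib(arc_size_bytes()), "GiB held by the ARC")
```

If the figure is close to the memory utilization shown on the Reporting screen, the "high" usage is most likely just cache doing its job.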

phoenix · Sep 3, 2014

I believe there's an open bug report on the CIFS "signal 11" problem; you might want to take a look at that. I also have it occasionally, but it doesn't seem to be causing me any ill effects.

EvanVanVan · Sep 3, 2014

Ok thanks guys I'll keep an eye on it. I was just a little concerned because a month ago I got the exited on signal 11 email a couple nights in a row and ignored them. On the third night I had a Fatal Trap 12 error occur and almost lost my pool. Turned out to be a bad flash drive. So anyway this time I was trying to be a little more proactive about it.

jgreco · Sep 3, 2014

The signal 11 is merely a process (smbd) crashing. This might represent a problem for the SMB environment but shouldn't be hazardous to your data or pool (except insofar as SMB might not recover correctly from problems). A fatal trap 12 is your actual system vaporizing due to a significant kernel level issue, which represents a truly significant issue. These are very different in their severity.

Proactive is good and you are strongly encouraged to figure it out, but thought you might like a little clarification.

EvanVanVan · Sep 3, 2014

Cool thanks, clarification is good. Especially on the signal 11 that searching had led me to believe was an unidentified "ram issue."

anodos · Sep 3, 2014

Have you done any tuning / setting sysctls / samba tweaking? Sometimes these can make the system less stable. I can't think of any reason why smbd crashing would cause pool loss.

EvanVanVan · Sep 3, 2014

Yeah, the issues were probably just coincidentally timed together. And once I figured out the fatal trap 12 was a bad flash drive it wouldn't have been a big deal, except I couldn't find my config and encryption geli.key backup.

(I wish the only "embarrassed emoji" wasn't so damn pink.)

And no I haven't tweaked any of my settings, I'm going to keep an eye on it as I transfer files in the future. Plus look for the bug report as phoenix suggested.

jgreco · Sep 3, 2014

EvanVanVan said:
Cool thanks, clarification is good. Especially on the signal 11 that searching had led me to believe was an unidentified "ram issue."

If you're using ECC, no, it isn't. It is a symptom of a userland program doing something to cause a segmentation violation, which is basically a userland program trying a memory access that is illegal. If the memory subsystem didn't corrupt the memory (which is exceedingly unlikely with ECC), then it is most likely to be a bug in the userland program. This could still be due to corruption of the USB flash, or configuration files, but isn't expected to crash the kernel.
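To make the "it's just a userland process dying" point concrete, here is a small sketch that can be run on any POSIX system. It does not reproduce the smbd bug. A child process sends itself SIGSEGV (signal 11) as a stand-in for an illegal memory access, and the parent simply observes the exit status and keeps running. The names `CHILD` and `run_crashing_child` are made up for illustration.

```python
import signal
import subprocess
import sys

# Child program that sends SIGSEGV to itself, standing in for a userland
# process (such as smbd) that performs an illegal memory access.
CHILD = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

def run_crashing_child():
    """Run the child and return its exit status as seen by the parent."""
    return subprocess.run([sys.executable, "-c", CHILD]).returncode

status = run_crashing_child()
# On POSIX, a negative return code -N means "killed by signal N".
print("child killed by signal", -status)
print("parent still running")
```

The kernel logs the child's death much like the smbd lines earlier in the thread, while everything else on the system carries on. A fatal trap 12 is the opposite case, where the kernel itself faults.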
