Random system crashes

Status
Not open for further replies.

pcloadetter

Explorer
Joined
Aug 15, 2015
Messages
65
I've had this happen a few times: early on, a few months ago, when I was setting up my FreeNAS system, then last night, and again in the middle of the day today.

The system would randomly shut down. Yesterday and today, it went into reboot cycles for no apparent reason. The only common thread I found was that I was copying data or downloading data with SABnzbd, which I've done plenty of times before yesterday and am doing again right now.

I'm running an Optiplex 980 with 8 GB of RAM.

Any ideas?
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
The NIC overheats? Some other part of the motherboard chipset overheats/is unstable?
 

DrKK

FreeNAS Generalissimo
Joined
Oct 15, 2013
Messages
3,630
It's possible. Should I try a PCI based NIC?
If you had one lying around, sure. But I wouldn't BUY one just to test this out. If you didn't have one handy, I'd do more forum/Google searching on possible explanations.
 

pcloadetter

Explorer
Joined
Aug 15, 2015
Messages
65
Happened again last night, after a download completed. I doubt it is overheating, but I can pop the chassis cover off to see if that helps.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778

pcloadetter

Explorer
Joined
Aug 15, 2015
Messages
65
Lack of RAM? What else is installed/active on your box?

This is not a solution to overheating. It screws up the airflow and can make things worse.

The only things installed are a couple plugins: Sonarr, SAB, Mylar, a SQL instance, and cp.

I think it happens during SAB unpacking. So maybe not network throughput so much as general overheating.

Not sure how to improve it. There are no spare plugs for adding an extra fan, and I can't remove the discrete graphics card since the CPU is an i7. I'm considering a new chassis and mobo.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
a couple plugins
I don't know what those particular plugins are like for RAM usage, but I believe the 8GB minimum requirement for FreeNAS is generally considered to be before adding plugins.
EDIT: wait a minute, is 'cp' CrashPlan? That's a memory hog.
I think it happens during SAB unpacking.
Have you tracked RAM usage during this operation?
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
"Random hard shutdown" usually implies a failing component. Verify your hardware is OK (memtest/apply load) and see if it dies under non-FreeNAS-induced stress.

Also, standard disclaimer of "non-ECC RAM equals non-safe data."
 

pcloadetter

Explorer
Joined
Aug 15, 2015
Messages
65
I don't know what those particular plugins are like for RAM usage, but I believe the 8GB minimum requirement for FreeNAS is generally considered to be before adding plugins.
EDIT: wait a minute, is 'cp' CrashPlan? That's a memory hog.

Have you tracked RAM usage during this operation?
CP is CouchPotato. I couldn't get CrashPlan working.

I may kick off a cron job to log memory usage to a file so I can see what's up. Maybe check it every minute or so.
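
A minimal sketch of that cron-driven logging idea, in Python, assuming FreeBSD's `top` where `-n` means non-interactive mode (the same flag used in the `top -na` output later in this thread). The script name, the log path argument, and the helper names are hypothetical. It only pulls the `Mem:` header line and prefixes it with a timestamp, so it is a starting point, not a full monitoring setup.

```python
import subprocess
import sys
import time


def mem_line(top_output):
    """Return the first 'Mem:' header line from top output, or None."""
    for line in top_output.splitlines():
        if line.startswith("Mem:"):
            return line.strip()
    return None


def format_entry(line, now=None):
    """Prefix a line with a MM/DD/YYYY HH:MM:SS timestamp."""
    if now is None:
        now = time.localtime()
    return "%s %s" % (time.strftime("%m/%d/%Y %H:%M:%S", now), line)


def log_once(log_path):
    """Run top once and append its Mem: line to log_path."""
    # Assumption: FreeBSD top, where -n selects non-interactive (batch) mode.
    output = subprocess.check_output(["top", "-n"]).decode("utf-8", "replace")
    line = mem_line(output)
    if line is not None:
        with open(log_path, "a") as log:
            log.write(format_entry(line) + "\n")


# Only log when a log path is given on the command line, e.g. from cron:
#   python memlog.py /path/to/memlog.txt   (both names are hypothetical)
if __name__ == "__main__" and len(sys.argv) == 2:
    log_once(sys.argv[1])
```

Run once a minute from a cron job, this appends one line per run, which keeps the file small enough to read by eye after a crash.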
 

pcloadetter

Explorer
Joined
Aug 15, 2015
Messages
65
"Random hard shutdown" usually implies a failing component. Verify your hardware is OK (memtest/apply load) and see if it dies under non-FreeNAS-induced stress.

Also, standard disclaimer of "non-ECC RAM equals non-safe data."
Any suggestions for applying load? I know there are memory testing tools, but I've never used anything else.

About the non-ECC RAM: is there any chance that would show in a config somewhere, or do I have to pull the modules? Interestingly, I've been stable for 2+ days now.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Just try a Memtest live-CD and see if that takes it down. Check your internal cabling as well, reseat components such as RAM, etc.

Non-ECC is a definite because you're using an Optiplex 980, which has no ability to handle ECC memory; it won't even POST with it installed. If all you're storing is downloaded/ripped media, then it's not an issue since all you'll lose is your time and bandwidth, but if you have anything of actual value (e.g. family photos) then you will want to protect it properly.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
I may kick off a cron job to log memory usage to a file so I can see what's up
The way ZFS caching works is that it gradually consumes as much RAM as is available, except for a relatively small reserve that depends on total installed RAM. If the system needs to allocate RAM for something else, ZFS will shrink the cache and give up some RAM, but there can be circumstances under which the release doesn't happen quickly enough and the other application ends up with a failed allocation. So, my hypothesis is that with several plugins running on a system that only just has the minimum requirement of 8GB, and with a fully warmed up cache, SAB suddenly asks for a big allocation to unpack something, doesn't get it as quickly as expected, and things go badly from there.
 

pcloadetter

Explorer
Joined
Aug 15, 2015
Messages
65
I did forget that Plex is running as well. I haven't had a chance to do a memtest.

Code:
[root@<hostname> /mnt/<pool>/media/downloads# top -na
last pid: 60081;  load averages:  0.04,  0.03,  0.00  up 2+13:44:03  20:42:15
82 processes:  1 running, 81 sleeping

Mem: 914M Active, 581M Inact, 6049M Wired, 27M Cache, 259M Free
ARC: 4999M Total, 2742M MFU, 1386M MRU, 757K Anon, 61M Header, 810M Other
Swap: 2048M Total, 13M Used, 2035M Free


  PID USERNAME  THR PRI NICE  SIZE  RES STATE  C  TIME  WCPU COMMAND
13339  88  19  21  0  482M 82620K sbwait  1  2:51  0.49% /usr/local/libexec/mysqld --defaults-extra-file=/var/db/m
11037 media  15  20  0  380M  155M usem  1  47:08  0.00% /usr/pbi/sonarr-amd64/bin/mono /usr/pbi/sonarr-amd64/shar
32695 root  1  20  0  347M 26036K select  5  26:41  0.00% /usr/local/sbin/smbd --daemon --configfile=/usr/local/etc
8666 media  33  20  0  423M  101M usem  6  18:00  0.00% /usr/pbi/sabnzbd-amd64/bin/python2.7 /usr/pbi/sabnzbd-amd
2774 root  12  20  0  209M 13768K uwait  2  9:49  0.00% /usr/local/sbin/collectd
9812 media  20  20  0  470M  118M select  7  3:29  0.00% /usr/pbi/sickbeard-amd64/bin/python2.7 /usr/pbi/sickbeard
8039  972  13  35  15  453M 65504K select  3  3:21  0.00% [python]
5013 media  2  20  0  312M 34516K select  6  2:28  0.00% /usr/pbi/couchpotato-amd64/bin/python2.7 /usr/pbi/couchpo
7415  972  14  20  0  333M 74508K uwait  6  1:36  0.00% /usr/pbi/plexmediaserver-amd64/share/plexmediaserver/Plex
2692 root  1  52  0  202M 31236K select  3  1:25  0.00% python: alertd (python2.7)
2688 root  6  20  0  469M  189M usem  5  1:18  0.00% /usr/local/bin/python -R /usr/local/www/freenasUI/manage.
1868 root  4  20  0  9916K  1504K rpcsvc  4  1:09  0.00% nfsd: server (nfsd)
6263 media  17  22  0  234M 44932K usem  3  0:56  0.00% /usr/pbi/mylar-amd64/bin/python2.7 /usr/pbi/mylar-amd64/s
7663 root  6  20  0  147M 22604K usem  3  0:13  0.00% /usr/pbi/plexmediaserver-amd64/bin/python2.7 /usr/pbi/ple
11285 root  6  20  0  152M 22988K usem  1  0:13  0.00% /usr/pbi/sonarr-amd64/bin/python2.7 /usr/pbi/sonarr-amd64
6510 root  6  20  0  147M 23220K usem  3  0:12  0.00% /usr/pbi/mylar-amd64/bin/python2.7 /usr/pbi/mylar-amd64/c
10060 root  6  20  0  147M 23492K usem  2  0:12  0.00% /usr/pbi/sickbeard-amd64/bin/python2.7 /usr/pbi/sickbeard
5260 root  6  20  0  147M 22836K usem  5  0:12  0.00% /usr/pbi/couchpotato-amd64/bin/python2.7 /usr/pbi/couchpo
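
One rough way to read the `Mem:` and `ARC:` header lines above against the ARC hypothesis is to compute how much of Wired is ARC and how little is left in Free plus Cache. The sketch below is Python with hypothetical helper names. Treating Free plus Cache as the memory available without reclaiming anything is a simplifying assumption, not a statement of how the FreeBSD VM actually accounts for it.

```python
import re

# Matches tokens such as "914M Active" or "757K Anon" on a top header line.
SIZE_RE = re.compile(r"(\d+)([KMG])\s+(\w+)")
TO_MIB = {"K": 1.0 / 1024, "M": 1.0, "G": 1024.0}


def parse_sizes(line):
    """Map each label on a top header line to its size in MiB."""
    return {label: int(num) * TO_MIB[unit]
            for num, unit, label in SIZE_RE.findall(line)}


# Header lines copied from the top output in this post.
mem = parse_sizes("Mem: 914M Active, 581M Inact, 6049M Wired, 27M Cache, 259M Free")
arc = parse_sizes("ARC: 4999M Total, 2742M MFU, 1386M MRU, 757K Anon, 61M Header, 810M Other")

arc_share = 100.0 * arc["Total"] / mem["Wired"]   # ARC as a share of Wired
headroom = mem["Free"] + mem["Cache"]             # simplification, see above
print("ARC is %.0f%% of Wired; Free + Cache is %.0fM" % (arc_share, headroom))
```

For this snapshot that works out to ARC being about 83% of Wired with 286M in Free plus Cache, which fits a warmed-up cache on an 8 GB box but does not by itself prove the failed-allocation theory.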
 

pcloadetter

Explorer
Joined
Aug 15, 2015
Messages
65
Thanks. I can't fix that one, but it's been stable for a few days now. I will still look into adding RAM.
 

pcloadetter

Explorer
Joined
Aug 15, 2015
Messages
65
So it's been a few weeks since there was a system crash. Last night around 3:42 or so, the system rebooted. /var/log/messages has nothing but the boot-up line:
Oct 21 03:21:34 <hostname> mountd[1855]: mount request succeeded from 192.168.10.111 for /mnt/<volume>/media
Oct 21 03:43:47 <hostname> syslog-ng[1672]: syslog-ng starting up; version='3.5.6'

I had a dump of the stats running every 5 minutes, toned down from every minute. I'll change that back now, but the last stats were these:
10/21/2015 03:30:00 Mem: 819M Active, 706M Inact, 5905M Wired, 48M Cache, 352M Free
10/21/2015 03:35:00 Mem: 821M Active, 706M Inact, 5905M Wired, 48M Cache, 350M Free
10/21/2015 03:40:00 Mem: 816M Active, 706M Inact, 5905M Wired, 48M Cache, 354M Free
10/21/2015 03:50:00 Mem: 786M Active, 355M Inact, 1044M Wired, 4548K Cache, 5645M Free
10/21/2015 03:55:00 Mem: 788M Active, 365M Inact, 1049M Wired, 4548K Cache, 5628M Free


The details for 3:40:
last pid: 45675; load averages: 0.04, 0.03, 0.03 up 18+20:16:35 03:40:00
84 processes: 1 running, 83 sleeping

Mem: 816M Active, 706M Inact, 5905M Wired, 48M Cache, 355M Free
ARC: 4543M Total, 691M MFU, 2903M MRU, 1754K Anon, 37M Header, 911M Other
Swap: 2048M Total, 62M Used, 1985M Free, 3% Inuse


PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
13023 media 33 21 0 500M 144M usem 3 167:15 0.00% /usr/pbi/sabnzbd-amd64/bin/python2.7 /usr/pbi/sabnzbd-amd
2907 root 12 20 0 201M 13104K uwait 4 71:27 0.00% /usr/local/sbin/collectd
8363 root 1 20 0 351M 20320K select 7 63:58 0.00% /usr/local/sbin/smbd --daemon --configfile=/usr/local/etc
16715 88 17 20 0 486M 85452K sbwait 4 20:11 0.00% /usr/local/libexec/mysqld --defaults-extra-file=/var/db/m
99739 media 15 20 0 399M 176M usem 1 17:31 0.00% /usr/pbi/sonarr-amd64/bin/mono /usr/pbi/sonarr-amd64/shar
36393 media 2 20 0 364M 91576K select 5 15:31 0.00% /usr/pbi/couchpotato-amd64/bin/python2.7 /usr/pbi/couchpo
73642 972 13 35 15 537M 93176K select 6 12:38 0.00% [python]
2693 root 1 52 0 194M 20656K select 0 7:25 0.00% python: alertd (python2.7)
2689 root 6 20 0 557M 99M usem 7 7:14 0.00% /usr/local/bin/python -R /usr/local/www/freenasUI/manage.
1870 root 4 20 0 9916K 1416K rpcsvc 6 7:02 0.00% nfsd: server (nfsd)
6174 media 17 52 0 341M 54220K usem 0 6:46 0.00% /usr/pbi/mylar-amd64/bin/python2.7 /usr/pbi/mylar-amd64/s
73641 972 15 20 0 370M 85188K uwait 1 5:42 0.00% /usr/pbi/plexmediaserver-amd64/share/plexmediaserver/Plex
7583 root 6 23 0 164M 19244K usem 2 1:24 0.00% /usr/pbi/plexmediaserver-amd64/bin/python2.7 /usr/pbi/ple
15635 root 6 47 0 168M 21060K usem 0 1:22 0.00% /usr/pbi/sonarr-amd64/bin/python2.7 /usr/pbi/sonarr-amd64
6428 root 6 23 0 155M 17600K usem 4 1:17 0.00% /usr/pbi/mylar-amd64/bin/python2.7 /usr/pbi/mylar-amd64/c
5157 root 6 23 0 156M 17016K usem 5 1:16 0.00% /usr/pbi/couchpotato-amd64/bin/python2.7 /usr/pbi/couchpo
13271 root 6 20 0 160M 17332K usem 2 1:13 0.00% /usr/pbi/sabnzbd-amd64/bin/python2.7 /usr/pbi/sabnzbd-amd
1674 root 2 20 0 116M 6772K kqread 4 0:38 0.00% /usr/local/sbin/syslog-ng -p /var/run/syslog.pid

There were no downloads, file transfers or anything else going at this time that I'm aware of, so I have no idea what could have caused it. 2-3 minutes after this, it crashed.
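
The five-minute samples earlier in this post already bracket the reboot: there is no 03:45 entry, and Wired falls from 5905M to 1044M across the gap, which lines up with syslog-ng starting at 03:43:47. A minimal Python sketch for spotting that pattern in such a log follows. The function names and the 1.5x slack factor are assumptions for illustration.

```python
import re
from datetime import datetime

# Samples copied from the five-minute log shown in this post.
SAMPLES = """\
10/21/2015 03:30:00 Mem: 819M Active, 706M Inact, 5905M Wired, 48M Cache, 352M Free
10/21/2015 03:35:00 Mem: 821M Active, 706M Inact, 5905M Wired, 48M Cache, 350M Free
10/21/2015 03:40:00 Mem: 816M Active, 706M Inact, 5905M Wired, 48M Cache, 354M Free
10/21/2015 03:50:00 Mem: 786M Active, 355M Inact, 1044M Wired, 4548K Cache, 5645M Free
10/21/2015 03:55:00 Mem: 788M Active, 365M Inact, 1049M Wired, 4548K Cache, 5628M Free
"""

WIRED_RE = re.compile(r"(\d+)M Wired")


def parse_sample(line):
    """Return (timestamp, wired_mib) for one logged line."""
    stamp = datetime.strptime(line[:19], "%m/%d/%Y %H:%M:%S")
    wired = int(WIRED_RE.search(line).group(1))
    return stamp, wired


def find_gaps(samples, interval_s=300, slack=1.5):
    """Return (previous, current) timestamps further apart than expected."""
    gaps = []
    for (prev, _), (cur, _) in zip(samples, samples[1:]):
        if (cur - prev).total_seconds() > interval_s * slack:
            gaps.append((prev, cur))
    return gaps


samples = [parse_sample(l) for l in SAMPLES.splitlines() if l.strip()]
gaps = find_gaps(samples)
for prev, cur in gaps:
    print("no sample between %s and %s" % (prev, cur))
```

With one-minute sampling, passing `interval_s=60` would narrow the window around the next crash to a couple of minutes.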
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Are there any files in /data/crash?
 