[SOLVED] Daily resets at 6am UTC after upgrade to 9.3-RELEASE-p31

Status
Not open for further replies.

chertel

Cadet
Joined
Feb 24, 2016
Messages
5
Hello FreeNAS community,

Last Saturday we updated two of our storage systems to FreeNAS 9.3-RELEASE-p31. Since then, one of these systems has reset itself every day at exactly 6am UTC (our local timezone is CET).

Unfortunately, we have not yet been able to find out what is causing these resets.
This morning I was logged in and tried to observe what was going on when the system reset itself.

Here are the last lines of /var/log/cron before the reset:

Feb 25 07:00:00 cloudstorage02 /usr/sbin/cron[56205]: (root) CMD (/usr/local/bin/python /usr/local/bin/mfistatus.py > /dev/null 2>&1)
Feb 25 07:00:00 cloudstorage02 /usr/sbin/cron[56206]: (root) CMD (/bin/sh /usr/local/sbin/save_rrds.sh > /dev/null 2>&1)
Feb 25 07:00:00 cloudstorage02 /usr/sbin/cron[56207]: (operator) CMD (/usr/libexec/save-entropy > /dev/null 2>&1)
Feb 25 07:00:00 cloudstorage02 /usr/sbin/cron[56208]: (root) CMD (/usr/libexec/atrun > /dev/null 2>&1)
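
The cron timestamps above are in local time (CET), while the resets are reported in UTC. As a quick sanity check, this minimal Python sketch converts the last logged timestamp to UTC. It assumes CET can be modeled as a fixed UTC+1 offset (true in February) and that the year is 2016, which the log itself does not state.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CET modeled as a fixed UTC+1 offset (no DST in February).
CET = timezone(timedelta(hours=1), "CET")

# /var/log/cron omits the year; 2016 is assumed from the thread's post dates.
local_stamp = datetime(2016, 2, 25, 7, 0, 0, tzinfo=CET)

# Convert the local cron timestamp to UTC.
utc_stamp = local_stamp.astimezone(timezone.utc)
print(utc_stamp.isoformat())
```

Under those assumptions, the 07:00:00 entries line up with the 6am UTC reset time, so these are the jobs cron started at the top of the hour in which the reset occurred.
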
Here are the top three processes in the top output just before the reset:

56248 root 1 103 0 36952K 11256K CPU6 6 0:41 98.58% bsdtar
4111 root 6 20 0 466M 105M usem 7 0:24 1.56% python2.7
4116 root 1 52 0 208M 35572K select 1 1:14 0.29% python2.7

There are absolutely NO messages in /var/log/messages shortly before the reset, and dmesg does not show anything suspicious either.

While I was waiting for the reset this morning, I saw that the RRD graphs in the reporting section of the web UI showed values for the period between 5am UTC and 6am UTC. After the reset, that period was empty in all RRD graphs, and the first new values started at 6:07am UTC, once the system was completely up again.

These resets really do occur every day, but only on one of our two storage systems. The hardware of the two systems is different, but both are running FreeNAS 9.3-RELEASE-p31.

Does anybody have any idea what could be causing these resets? Even a hint on how to dig further would be helpful...

Thank you very much in advance!

Best regards,
Christian
 

chertel

Here is some more information from dmidecode that might be helpful:

BIOS Information
Vendor: American Megatrends Inc.
Version: 2.1
Release Date: 03/17/2012

System Information
Manufacturer: Supermicro
Product Name: X8DT3

Processor Information
Socket Designation: CPU 2
Type: Central Processor
Family: Xeon
Manufacturer: Intel
ID: C2 06 02 00 FF FB EB BF
Signature: Type 0, Family 6, Model 44, Stepping 2
Flags:
FPU (Floating-point unit on-chip)
VME (Virtual mode extension)
DE (Debugging extension)
PSE (Page size extension)
TSC (Time stamp counter)
MSR (Model specific registers)
PAE (Physical address extension)
MCE (Machine check exception)
CX8 (CMPXCHG8 instruction supported)
APIC (On-chip APIC hardware supported)
SEP (Fast system call)
MTRR (Memory type range registers)
PGE (Page global enable)
MCA (Machine check architecture)
CMOV (Conditional move instruction supported)
PAT (Page attribute table)
PSE-36 (36-bit page size extension)
CLFSH (CLFLUSH instruction supported)
DS (Debug store)
ACPI (ACPI supported)
MMX (MMX technology supported)
FXSR (FXSAVE and FXSTOR instructions supported)
SSE (Streaming SIMD extensions)
SSE2 (Streaming SIMD extensions 2)
SS (Self-snoop)
HTT (Multi-threading)
TM (Thermal monitor supported)
PBE (Pending break enabled)
Version: Intel(R) Xeon(R) CPU E5603 @ 1.60GHz
Voltage: 1.2 V
External Clock: 133 MHz
Max Speed: 1600 MHz
Current Speed: 1600 MHz
Status: Populated, Enabled
Upgrade: Other
L1 Cache Handle: 0x0005
L2 Cache Handle: 0x0006
L3 Cache Handle: 0x0007
Serial Number: To Be Filled By O.E.M.
Asset Tag: To Be Filled By O.E.M.
Part Number: To Be Filled By O.E.M.
Core Count: 4
Core Enabled: 4
Thread Count: 4
Characteristics:
64-bit capable
lspci:

00:00.0 Host bridge: Intel Corporation 5520 I/O Hub to ESI Port (rev 22)
00:01.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 1 (rev 22)
00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 3 (rev 22)
00:05.0 PCI bridge: Intel Corporation 5520/X58 I/O Hub PCI Express Root Port 5 (rev 22)
00:07.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 7 (rev 22)
00:08.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI Express Root Port 8 (rev 22)
00:09.0 PCI bridge: Intel Corporation 7500/5520/5500/X58 I/O Hub PCI Express Root Port 9 (rev 22)
00:0d.0 Host bridge: Intel Corporation Device 343a (rev 22)
00:0d.1 Host bridge: Intel Corporation Device 343b (rev 22)
00:0d.2 Host bridge: Intel Corporation Device 343c (rev 22)
00:0d.3 Host bridge: Intel Corporation Device 343d (rev 22)
00:0d.4 Host bridge: Intel Corporation 7500/5520/5500/X58 Physical Layer Port 0 (rev 22)
00:0d.5 Host bridge: Intel Corporation 7500/5520/5500 Physical Layer Port 1 (rev 22)
00:0d.6 Host bridge: Intel Corporation Device 341a (rev 22)
00:0e.0 Host bridge: Intel Corporation Device 341c (rev 22)
00:0e.1 Host bridge: Intel Corporation Device 341d (rev 22)
00:0e.2 Host bridge: Intel Corporation Device 341e (rev 22)
00:0e.4 Host bridge: Intel Corporation Device 3439 (rev 22)
00:13.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub I/OxAPIC Interrupt Controller (rev 22)
00:14.0 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub System Management Registers (rev 22)
00:14.1 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub GPIO and Scratch Pad Registers (rev 22)
00:14.2 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Control Status and RAS Registers (rev 22)
00:14.3 PIC: Intel Corporation 7500/5520/5500/X58 I/O Hub Throttle Registers (rev 22)
00:16.0 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.1 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.2 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.3 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.4 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.5 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.6 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:16.7 System peripheral: Intel Corporation 5520/5500/X58 Chipset QuickData Technology Device (rev 22)
00:1a.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #4
00:1a.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #5
00:1a.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #6
00:1a.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #2
00:1c.0 PCI bridge: Intel Corporation 82801JI (ICH10 Family) PCI Express Root Port 1
00:1d.0 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #1
00:1d.1 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #2
00:1d.2 USB controller: Intel Corporation 82801JI (ICH10 Family) USB UHCI Controller #3
00:1d.7 USB controller: Intel Corporation 82801JI (ICH10 Family) USB2 EHCI Controller #1
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
00:1f.0 ISA bridge: Intel Corporation 82801JIR (ICH10R) LPC Interface Controller
00:1f.2 IDE interface: Intel Corporation 82801JI (ICH10 Family) 4 port SATA IDE Controller #1
00:1f.3 SMBus: Intel Corporation 82801JI (ICH10 Family) SMBus Controller
00:1f.5 IDE interface: Intel Corporation 82801JI (ICH10 Family) 2 port SATA IDE Controller #2
01:03.0 VGA compatible controller: Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)
02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
04:00.0 SCSI storage controller: LSI Logic / Symbios Logic MegaRAID SAS 8208ELP/8208ELP (rev 08)
06:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
06:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (rev 01)
08:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
08:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)

top header:

last pid: 19157; load averages: 0.14, 0.22, 0.12 up 0+10:00:06 17:03:01
42 processes: 1 running, 41 sleeping
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 162M Active, 551M Inact, 20G Wired, 2184M Free
ARC: 18G Total, 2020M MFU, 14G MRU, 109K Anon, 922M Header, 384M Other
Swap: 22G Total, 22G Free
ps:

USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 11 800.0 0.0 0 128 ?? RL 7:03AM 4762:56.08 [idle]
root 4169 0.1 0.1 240612 20576 ?? Ss 7:05AM 3:15.51 /usr/local/sbin/collectd
root 0 0.0 0.0 0 9664 ?? DLs 7:03AM 13:39.21 [kernel]
root 1 0.0 0.0 6276 452 ?? ILs 7:03AM 0:00.05 /sbin/init --
root 2 0.0 0.0 0 16 ?? DL 7:03AM 0:00.00 [crypto]
root 3 0.0 0.0 0 16 ?? DL 7:03AM 0:00.00 [crypto returns]
root 4 0.0 0.0 0 16 ?? DL 7:03AM 0:00.00 [mpt_recovery0]
root 5 0.0 0.0 0 64 ?? DL 7:03AM 0:09.68 [ctl]
root 6 0.0 0.0 0 160 ?? DL 7:03AM 0:50.41 [zfskern]
root 7 0.0 0.0 0 16 ?? DL 7:03AM 0:00.00 [xpt_thrd]
root 8 0.0 0.0 0 16 ?? DL 7:03AM 0:00.47 [enc_daemon0]
root 9 0.0 0.0 0 16 ?? DL 7:03AM 0:00.38 [pagedaemon]
root 10 0.0 0.0 0 16 ?? DL 7:03AM 0:00.00 [audit]
root 12 0.0 0.0 0 944 ?? WL 7:03AM 2:46.85 [intr]
root 13 0.0 0.0 0 128 ?? DL 7:03AM 0:00.00 [ng_queue]
root 14 0.0 0.0 0 48 ?? DL 7:03AM 2:01.17 [geom]
root 15 0.0 0.0 0 16 ?? DL 7:03AM 0:06.11 [yarrow]
root 16 0.0 0.0 0 512 ?? DL 7:03AM 0:01.03 [usb]
root 17 0.0 0.0 0 16 ?? DL 7:03AM 0:00.00 [vmdaemon]
root 18 0.0 0.0 0 16 ?? DL 7:03AM 0:00.00 [pagezero]
root 19 0.0 0.0 0 16 ?? DL 7:03AM 0:00.13 [bufdaemon]
root 20 0.0 0.0 0 16 ?? DL 7:03AM 0:02.38 [syncer]
root 21 0.0 0.0 0 16 ?? DL 7:03AM 0:00.14 [vnlru]
root 22 0.0 0.0 0 16 ?? DL 7:03AM 0:00.19 [softdepflush]
root 288 0.0 0.0 0 16 ?? DL 7:03AM 0:00.00 [g_mp_kt]
root 949 0.0 0.0 0 16 ?? DL 7:04AM 0:00.00 [ftcleanup]
root 1015 0.0 0.0 0 16 ?? DL 7:04AM 0:59.45 [ipmi0: kcs]
root 1776 0.0 0.0 14236 1496 ?? Is 7:04AM 0:00.00 /usr/sbin/moused -p /dev/ums0 -t auto -I /var/run/moused.ums0.pid
root 2370 0.0 0.0 6280 404 ?? Ss 7:04AM 0:00.02 /sbin/devd
root 2805 0.0 0.0 35876 2976 ?? I 7:05AM 0:00.00 /usr/local/sbin/syslog-ng -p /var/run/syslog.pid
root 2806 0.0 0.0 123112 11272 ?? Is 7:05AM 0:00.95 /usr/local/sbin/syslog-ng -p /var/run/syslog.pid
root 2809 0.0 0.0 12056 8036 ?? Ss 7:05AM 0:00.28 /usr/sbin/watchdogd
root 2844 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[0] da0p1]
root 2845 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[1] da0p1]
root 2846 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[2] da0p1]
root 2847 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[3] da0p1]
root 2848 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[4] da0p1]
root 2849 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[5] da0p1]
root 2850 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da0p1]
root 2851 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[7] da0p1]
root 2853 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[0] da1p1]
root 2854 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[1] da1p1]
root 2855 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[2] da1p1]
root 2856 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[3] da1p1]
root 2857 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[4] da1p1]
root 2858 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[5] da1p1]
root 2859 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da1p1]
root 2860 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[7] da1p1]
root 2862 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[0] da2p1]
root 2863 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[1] da2p1]
root 2864 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[2] da2p1]
root 2865 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[3] da2p1]
root 2866 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[4] da2p1]
root 2867 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[5] da2p1]
root 2868 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da2p1]
root 2869 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[7] da2p1]
root 2871 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[0] da3p1]
root 2872 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[1] da3p1]
root 2873 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[2] da3p1]
root 2874 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[3] da3p1]
root 2875 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[4] da3p1]
root 2876 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[5] da3p1]
root 2877 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da3p1]
root 2878 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[7] da3p1]
root 2880 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[0] da4p1]
root 2881 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[1] da4p1]
root 2882 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[2] da4p1]
root 2883 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[3] da4p1]
root 2884 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[4] da4p1]
root 2885 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[5] da4p1]
root 2886 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da4p1]
root 2887 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[7] da4p1]
root 2889 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[0] da5p1]
root 2890 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[1] da5p1]
root 2891 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[2] da5p1]
root 2892 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[3] da5p1]
root 2893 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[4] da5p1]
root 2894 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[5] da5p1]
root 2895 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da5p1]
root 2896 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[7] da5p1]
root 2898 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[0] da6p1]
root 2899 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[1] da6p1]
root 2900 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[2] da6p1]
root 2901 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[3] da6p1]
root 2902 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[4] da6p1]
root 2903 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[5] da6p1]
root 2904 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da6p1]
root 2905 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[7] da6p1]
root 2907 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[0] da7p1]
root 2908 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[1] da7p1]
root 2909 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[2] da7p1]
root 2910 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[3] da7p1]
root 2911 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[4] da7p1]
root 2912 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[5] da7p1]
root 2913 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da7p1]
root 2914 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[7] da7p1]
root 2916 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[0] da8p1]
root 2917 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[1] da8p1]
root 2918 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[2] da8p1]
root 2919 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[3] da8p1]
root 2920 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[4] da8p1]
root 2921 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[5] da8p1]
root 2922 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da8p1]
root 2923 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[7] da8p1]
root 2925 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[0] da10p1]
root 2926 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[1] da10p1]
root 2927 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[2] da10p1]
root 2928 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[3] da10p1]
root 2929 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[4] da10p1]
root 2930 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[5] da10p1]
root 2931 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[6] da10p1]
root 2932 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[7] da10p1]
root 2934 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[0] da9p1]
root 2935 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[1] da9p1]
root 2936 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[2] da9p1]
root 2937 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[3] da9p1]
root 2938 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[4] da9p1]
root 2939 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[5] da9p1]
root 2940 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [g_eli[6] da9p1]
root 2941 0.0 0.0 0 16 ?? DL 7:05AM 0:00.01 [g_eli[7] da9p1]
root 3012 0.0 0.0 24668 3280 ?? Is 7:05AM 0:00.00 /usr/sbin/ctld
root 3109 0.0 0.0 18280 1692 ?? Is 7:05AM 0:00.26 /usr/sbin/rpcbind
root 3113 0.0 0.0 28484 1608 ?? Is 7:05AM 0:00.03 /usr/sbin/mountd -l -rS /etc/exports /etc/zfs/exports
root 3127 0.0 0.0 22252 1472 ?? Is 7:05AM 0:00.04 nfsd: master (nfsd)
root 3128 0.0 0.0 9916 1552 ?? I 7:05AM 0:16.16 nfsd: server (nfsd)
root 3139 0.0 0.0 286448 1504 ?? Is 7:05AM 0:00.02 /usr/sbin/rpc.statd
root 3150 0.0 0.0 22332 1564 ?? Ss 7:05AM 0:00.05 /usr/sbin/rpc.lockd
root 3164 0.0 0.0 0 16 ?? DL 7:05AM 0:00.00 [Timer]
root 3516 0.0 0.0 26384 3472 ?? Ss 7:05AM 0:01.25 /usr/sbin/ntpd -g -c /etc/ntp.conf -p /var/run/ntpd.pid -f /var/db/ntpd.drift
nobody 3733 0.0 0.0 40692 4692 ?? Ss 7:05AM 0:00.77 proftpd: (accepting connections) (proftpd)
root 3834 0.0 0.0 28120 3876 ?? I 7:05AM 0:00.18 /usr/local/sbin/smartd -i 1800 -c /usr/local/etc/smartd.conf -p /var/run/smartd.pid
root 4023 0.0 0.0 30500 3936 ?? Is 7:05AM 0:00.00 nginx: master process /usr/local/sbin/nginx
www 4024 0.0 0.0 30500 3864 ?? I 7:05AM 0:00.10 nginx: worker process (nginx)
nobody 4029 0.0 0.0 14064 1764 ?? Is 7:05AM 0:00.01 /usr/local/sbin/mdnsd
root 4034 0.0 0.0 22924 2116 ?? Ss 7:05AM 0:00.28 ladvd: master [priv] (ladvd)
ladvd 4036 0.0 0.0 22924 2012 ?? S 7:05AM 0:00.39 ladvd: child (ladvd)
root 4115 0.0 0.2 435236 53300 ?? I 7:05AM 0:12.42 /usr/local/bin/python -R /usr/local/www/freenasUI/manage.py runfcgi method=threaded host=127.0.0.1 port=9042 pidfile=/var/run/django.pid (python2.7)
root 4120 0.0 0.2 213284 44284 ?? S 7:05AM 0:35.64 python: alertd (python2.7)
messagebus 4127 0.0 0.0 18460 1960 ?? Is 7:05AM 0:00.00 /usr/local/bin/dbus-daemon --system
root 4335 0.0 0.0 53392 4568 ?? Is 7:05AM 0:00.00 /usr/sbin/sshd
root 4426 0.0 0.0 12056 1444 ?? Is 7:05AM 0:00.00 daemon: /usr/local/libexec/nas/register_mdns.py[4427] (daemon)
root 4427 0.0 0.1 182564 13656 ?? I 7:05AM 0:02.42 /usr/local/bin/python /usr/local/libexec/nas/register_mdns.py (python2.7)
root 5228 0.0 0.0 18296 1828 ?? Is 7:06AM 0:00.09 /usr/sbin/cron -s
root 5472 0.0 0.0 45020 3688 ?? I 7:06AM 0:00.05 /sbin/zfsd -d zfsd
root 18576 0.0 0.0 76092 6208 ?? Ss 4:38PM 0:00.13 sshd: root@pts/0 (sshd)
root 19099 0.0 0.0 3784 1544 ?? IN 5:00PM 0:00.00 sleep 300
root 3853 0.0 0.0 18604 2076 v0- IN 7:05AM 0:00.19 /bin/sh /usr/local/sbin/pbid
root 5464 0.0 0.1 178100 14912 v0 Is+ 7:06AM 0:02.48 python /etc/netcli (python2.7)
root 5465 0.0 0.0 12056 1480 v1 Is+ 7:06AM 0:00.00 /usr/libexec/getty Pc ttyv1
root 5466 0.0 0.0 12056 1480 v2 Is+ 7:06AM 0:00.00 /usr/libexec/getty Pc ttyv2
root 5467 0.0 0.0 12056 1480 v3 Is+ 7:06AM 0:00.00 /usr/libexec/getty Pc ttyv3
root 5468 0.0 0.0 12056 1480 v4 Is+ 7:06AM 0:00.00 /usr/libexec/getty Pc ttyv4
root 5469 0.0 0.0 12056 1480 v5 Is+ 7:06AM 0:00.00 /usr/libexec/getty Pc ttyv5
root 5470 0.0 0.0 12056 1480 v6 Is+ 7:06AM 0:00.00 /usr/libexec/getty Pc ttyv6
root 5471 0.0 0.0 12056 1480 v7 Is+ 7:06AM 0:00.00 /usr/libexec/getty Pc ttyv7
root 18593 0.0 0.0 21680 3428 0 Ss 4:38PM 0:00.08 -csh (csh)
root 19188 0.0 0.0 16264 2064 0 R+ 5:04PM 0:00.00 ps waux
 

depasseg

FreeNAS Replicant
Joined
Sep 16, 2014
Messages
2,874
Just a wild guess - but IPMI hw watchdog timeout issue? I've seen a couple of those on here.
 

chertel

depasseg said:
Just a wild guess - but IPMI hw watchdog timeout issue? I've seen a couple of those on here.
Yes, I already thought about that so I tried to stop watchdogd using

/etc/rc.d/watchdogd stop
but the system still resets itself :(

The IPMI configuration was not changed in any way either, so I guess the resets should have happened with the previous FreeNAS version too, right?
Instead, the system ran without any issues for several months.
 

chertel

Hi,

The issue has been solved. We found the telling message in the output of cat /data/crash/info.0:

root@build3.ixsystems.com:/tank/home/stable-builds/FN/objs/os-base/amd64/tank/home/stable-builds/FN/FreeBSD/src/syskmem_malloc(16744448): kmem_map too small: 14006796288 total allocated
Panic String: kmem_malloc(16744448): kmem_map too small: 14006796288 total allocated
After googling for "kmem_map too small" we found https://bugs.freenas.org/issues/13061, which gave us the clue to enable autotune (it was disabled by default).
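
To put the numbers in that panic string into perspective, here is a minimal Python sketch that pulls out the two byte counts and converts them to more readable units. The regular expression and the helper names (`parse_kmem_panic`, `to_gib`) are made up for illustration and only cover this one message shape.

```python
import re

# Panic string copied verbatim from the /data/crash/info.0 excerpt above.
PANIC = "kmem_malloc(16744448): kmem_map too small: 14006796288 total allocated"

# Hypothetical pattern for this one message shape; other panics look different.
PATTERN = re.compile(
    r"kmem_malloc\((\d+)\): kmem_map too small: (\d+) total allocated"
)


def parse_kmem_panic(text):
    """Return (requested_bytes, allocated_bytes), or None if nothing matches."""
    match = PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def to_gib(num_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


requested, allocated = parse_kmem_panic(PANIC)
print(f"requested: {requested / 2**20:.2f} MiB")
print(f"allocated: {to_gib(allocated):.2f} GiB")
```

For this message the sketch reports a failed request of just under 16 MiB, with roughly 13 GiB already allocated from kmem_map at the time of the panic.
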

Solution that worked in our case:
Now that we have enabled autotune, the system did not reset this morning, so it seems like the issue is gone.

Autotune is still disabled on our other FreeNAS system (which has only 6GB RAM; the one we had issues with has 24GB RAM), but it has shown no issues so far.

Anyway, thank you depasseg for your help.
 

toadman

Guru
Joined
Jun 4, 2013
Messages
619
Glad you solved it! I can't remember the amount of RAM at which it's recommended to run autotune, but this was a good reminder.
 

Robert Trevellyan

Pony Wrangler
Joined
May 16, 2014
Messages
3,778
And yet people still question 8GB as the required minimum...
chertel said:
we have enabled autotune, the system did not reset this morning, so it seems like the issue is gone
According to the documentation:
It should only be used as a temporary measure on a system that hangs until the underlying hardware issue is addressed by adding more RAM. Autotune will always slow the system down as it caps the ARC.
 

toadman

Yea, except the one he had problems with was the 24GB system, not the 6GB system. :)
 

chertel

Robert Trevellyan said:
And yet people still question 8GB as the required minimum...

Yes, but surprisingly the system we had issues with has 24 GB RAM, while the system with only 6 GB RAM is working fine so far (and without autotune) ;)


In our case, autotune generated the following entries in System/Tunables:

kern.ipc.maxsockbuf = 2097152
net.inet.tcp.delayed_ack = 0
net.inet.tcp.recvbuf_max = 2097152
net.inet.tcp.sendbuf_max = 2097152
vfs.zfs.arc_max = 22242750592
vfs.zfs.l2arc_headroom = 2
vfs.zfs.l2arc_noprefetch = 0
vfs.zfs.l2arc_norw = 0
vfs.zfs.l2arc_write_boost = 0
vfs.zfs.l2arc_write_max = 10000000
vm.kmem_size = 32167470080
I have no idea which of these values solved our issue, but since we enabled autotune and rebooted the system, it has been stable and works fine again :)
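
For readers who want to see what the two largest values mean in practice, the following Python sketch parses the list above and prints the memory-related entries in GiB. It assumes that vfs.zfs.arc_max and vm.kmem_size are byte counts, and the helper names (`parse_tunables`, `gib`) are hypothetical. It does not show which tunable fixed the panic.

```python
# Tunables as listed above (autotune output from System/Tunables).
TUNABLES = """\
kern.ipc.maxsockbuf = 2097152
net.inet.tcp.delayed_ack = 0
net.inet.tcp.recvbuf_max = 2097152
net.inet.tcp.sendbuf_max = 2097152
vfs.zfs.arc_max = 22242750592
vfs.zfs.l2arc_headroom = 2
vfs.zfs.l2arc_noprefetch = 0
vfs.zfs.l2arc_norw = 0
vfs.zfs.l2arc_write_boost = 0
vfs.zfs.l2arc_write_max = 10000000
vm.kmem_size = 32167470080
"""

# Byte count from the panic string quoted earlier in the thread.
PANIC_ALLOCATED = 14006796288


def parse_tunables(text):
    """Turn 'name = value' lines into a dict of integers."""
    tunables = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            tunables[name.strip()] = int(value.strip())
    return tunables


def gib(num_bytes):
    """Format a byte count as GiB with two decimals."""
    return f"{num_bytes / 2**30:.2f} GiB"


tunables = parse_tunables(TUNABLES)
for name in ("vfs.zfs.arc_max", "vm.kmem_size"):  # assumed to be byte counts
    print(f"{name}: {gib(tunables[name])}")
print("kmem_size above panic allocation:",
      tunables["vm.kmem_size"] > PANIC_ALLOCATED)
```

The new vm.kmem_size (about 29.96 GiB) is well above the 14006796288 bytes (about 13.04 GiB) that were allocated when the panic hit. That fits the observation that the resets stopped, but it is not proof of the cause.
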
 