SOLVED Server had no issues for 1 year; after installing TrueNAS it crashes every 1 to 3 days

lknite

Cadet
Joined
Jun 19, 2022
Messages
8
Are there options that can be tweaked which are known to sometimes stop a system from crashing?

I had been running this server on Windows Server 2019 for a year, using it to host a CIFS share. I then installed TrueNAS, which ran with no problems for about a month. Then it crashed, crashed again a couple of days later, and now crashes more frequently.

It's a server I put together from a previous workstation, nothing special. Random errors like this make me think it may be a hardware issue, but I've seen some posts where people say, "oh ya, TrueNAS isn't expected to work with that", so I wonder where to get started. TrueNAS has been a perfect solution, along with democratic-csi, for my use with Kubernetes. I'd like to keep this server going if possible, as I don't really have the funds to buy something new right now. And even if I went for it, I notice the delay when ordering a TrueNAS system might be 5 or 6 weeks...

Open to ideas to try out. Thank you ahead of time.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Please provide the hardware details of your installation, per the Forum Rules.
 

lknite

Cadet
Joined
Jun 19, 2022
Messages
8
Code:
root@truenas[~]# dmidecode --type baseboard
# dmidecode 3.3
Scanning /dev/mem for entry point.
SMBIOS 2.4 present.

Handle 0x0002, DMI type 2, 8 bytes
Base Board Information
        Manufacturer: Gigabyte Technology Co., Ltd.
        Product Name: GA-990FXA-UD3
        Version: 
        Serial Number: 

root@truenas[~]# dmidecode --type memory   
# dmidecode 3.3
Scanning /dev/mem for entry point.
SMBIOS 2.4 present.

Handle 0x0005, DMI type 5, 24 bytes
Memory Controller Information
        Error Detecting Method: 64-bit ECC
        Error Correcting Capabilities:
                None
        Supported Interleave: One-way Interleave
        Current Interleave: One-way Interleave
        Maximum Memory Module Size: 1024 MB
        Maximum Total Memory Size: 4096 MB
        Supported Speeds:
                70 ns
                60 ns
        Supported Memory Types:
                Standard
                EDO
        Memory Module Voltage: 3.3 V
        Associated Memory Slots: 4
                0x0006
                0x0007
                0x0008
                0x0009
        Enabled Error Correcting Capabilities:
                None

Handle 0x0006, DMI type 6, 12 bytes
Memory Module Information
        Socket Designation: A0
        Bank Connections: 1
        Current Speed: 53 ns
        Type: Other Unknown EDO
        Installed Size: 8192 MB (Double-bank Connection)
        Enabled Size: 8192 MB (Double-bank Connection)
        Error Status: OK

Handle 0x0007, DMI type 6, 12 bytes
Memory Module Information
        Socket Designation: A1
        Bank Connections: 2
        Current Speed: 53 ns
        Type: Other Unknown EDO
        Installed Size: 8192 MB (Double-bank Connection)
        Enabled Size: 8192 MB (Double-bank Connection)
        Error Status: OK

Handle 0x0008, DMI type 6, 12 bytes
Memory Module Information
        Socket Designation: A2
        Bank Connections: 3
        Current Speed: 53 ns
        Type: Other Unknown EDO
        Installed Size: 8192 MB (Double-bank Connection)
        Enabled Size: 8192 MB (Double-bank Connection)
        Error Status: OK

Handle 0x0009, DMI type 6, 12 bytes
Memory Module Information
        Socket Designation: A3
        Bank Connections: 4
        Current Speed: 53 ns
        Type: Other Unknown EDO
        Installed Size: 8192 MB (Double-bank Connection)
        Enabled Size: 8192 MB (Double-bank Connection)
        Error Status: OK

Handle 0x0029, DMI type 16, 15 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: None
        Maximum Capacity: 16 GB
        Error Information Handle: Not Provided
        Number Of Devices: 4

Handle 0x002A, DMI type 17, 27 bytes
Memory Device
        Array Handle: 0x0029
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 8 GB
        Form Factor: DIMM
        Set: None
        Locator: A0
        Bank Locator: Bank0/1
        Type: Unknown
        Type Detail: None
        Speed: 1333 MT/s
        Manufacturer: 
        Serial Number: 
        Asset Tag: 
        Part Number: 

Handle 0x002B, DMI type 17, 27 bytes
Memory Device
        Array Handle: 0x0029
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 8 GB
        Form Factor: DIMM
        Set: None
        Locator: A1
        Bank Locator: Bank2/3
        Type: Unknown
        Type Detail: None
        Speed: 1333 MT/s
        Manufacturer: 
        Serial Number: 
        Asset Tag: 
        Part Number: 

Handle 0x002C, DMI type 17, 27 bytes
Memory Device
        Array Handle: 0x0029
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 8 GB
        Form Factor: DIMM
        Set: None
        Locator: A2
        Bank Locator: Bank4/5
        Type: Unknown
        Type Detail: None
        Speed: 1333 MT/s
        Manufacturer: 
        Serial Number: 
        Asset Tag: 
        Part Number: 

Handle 0x002D, DMI type 17, 27 bytes
Memory Device
        Array Handle: 0x0029
        Error Information Handle: Not Provided
        Total Width: 64 bits
        Data Width: 64 bits
        Size: 8 GB
        Form Factor: DIMM
        Set: None
        Locator: A3
        Bank Locator: Bank6/7
        Type: Unknown
        Type Detail: None
        Speed: 1333 MT/s
        Manufacturer: 
        Serial Number: 
        Asset Tag: 
        Part Number: 

root@truenas[~]# dmidecode --type processor
# dmidecode 3.3
Scanning /dev/mem for entry point.
SMBIOS 2.4 present.

Handle 0x0004, DMI type 4, 35 bytes
Processor Information
        Socket Designation: Socket M2
        Type: Central Processor
        Family: Athlon
        Manufacturer: AMD
        ID: 20 0F 60 00 FF FB 8B 17
        Signature: Family 21, Model 2, Stepping 0
        Flags:
                FPU (Floating-point unit on-chip)
                VME (Virtual mode extension)
                DE (Debugging extension)
                PSE (Page size extension)
                TSC (Time stamp counter)
                MSR (Model specific registers)
                PAE (Physical address extension)
                MCE (Machine check exception)
                CX8 (CMPXCHG8 instruction supported)
                APIC (On-chip APIC hardware supported)
                SEP (Fast system call)
                MTRR (Memory type range registers)
                PGE (Page global enable)
                MCA (Machine check architecture)
                CMOV (Conditional move instruction supported)
                PAT (Page attribute table)
                PSE-36 (36-bit page size extension)
                CLFSH (CLFLUSH instruction supported)
                MMX (MMX technology supported)
                FXSR (FXSAVE and FXSTOR instructions supported)
                SSE (Streaming SIMD extensions)
                SSE2 (Streaming SIMD extensions 2)
                HTT (Multi-threading)
        Version: AMD FX(tm)-8350 Eight-Core Processor           
        Voltage: 6.3 V
        External Clock: 200 MHz
        Max Speed: 3200 MHz
        Current Speed: 4000 MHz
        Status: Populated, Enabled
        Upgrade: ZIF Socket
        L1 Cache Handle: 0x000A
        L2 Cache Handle: 0x000C
        L3 Cache Handle: Not Provided
        Serial Number: 
        Asset Tag: 
        Part Number: 
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Are all overclocking options in the BIOS disabled?

Is the BIOS soft RAID mode disabled, and all drive ports in AHCI mode?
 

lknite

Cadet
Joined
Jun 19, 2022
Messages
8
I'll check.

Also, while I said crashing, it occurred to me that when I powered it down it would take a moment, as if it were actually shutting down. I hooked up a monitor and it doesn't appear to have crashed; it's still running, I just can't reach it over the network anymore. Maybe it's a networking issue.

I'm using the onboard 2.5 Gb NIC with the r8125 driver. I think I read an article saying not to use a 2.5 Gb NIC with TrueNAS, but it was working so well I thought maybe I'd be OK...
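
One rough way to check whether the NIC is dropping its link is to filter the kernel log for link state messages from that driver. This is only a sketch: it assumes a Linux-based install (TrueNAS SCALE), the helper name `link_events` is made up for illustration, and the "link ... up/down" wording is an assumption, since drivers phrase these messages differently.

```shell
#!/bin/sh
# Print kernel log lines that mention a given interface or driver name
# together with a link up/down message. Reads log text from stdin, so it
# can be fed from dmesg or from a saved log file.
# ASSUMPTION: the driver logs something matching "link ... up" or
# "link ... down"; adjust the pattern to what your log actually shows.
link_events() {
    name="$1"   # interface or driver name, e.g. r8125
    grep -i -- "$name" | grep -iE 'link.*(up|down)'
}

# Example, run as root on the NAS console:
#   dmesg | link_events r8125
```

If this prints link-down lines at around the time the box became unreachable, that points at the NIC or its driver rather than at a full system crash.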
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Of course, there's a crappy Realtek NIC involved. TrueNAS doesn't have great stability with Realtek NICs, especially those faster than 1G. For 1G NICs, the best stability is with Intel PRO/1000 NICs. For >1G NICs, either the Intel X5xx NICs or Chelsios have a solid record of stability.
 

lknite

Cadet
Joined
Jun 19, 2022
Messages
8
I didn't see any overclocking options or soft RAID in the BIOS, nor did I see anything about AHCI.

However, I think we might be going in the right direction with the network card. I'm going to locate an Intel PRO/1000 NIC and try that.

Will report back the results ...
 

lknite

Cadet
Joined
Jun 19, 2022
Messages
8
OK, I've seen some evidence that we may have found the right solution. Before, when the network connection stopped working, I saw several iSCSI errors appear on the console. Now, when I disabled the network interface so I could set up the new one, I saw the same errors, so those seem related to the NIC going down. I'm now up and running on a different 1 Gb NIC. I'll consider the issue fixed if it lasts a week. I will post back after a week, or sooner if it goes down again.
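
For anyone wanting to watch a NIC over a trial week like this, a minimal sketch is to log the interface's link state at intervals. It assumes a Linux-based system where `/sys/class/net/<iface>/operstate` exists; the function name `log_link_state`, the `SYSFS_NET` override, and the interface name in the usage comment are all hypothetical.

```shell
#!/bin/sh
# Print "<timestamp> <iface> <state>" for one network interface.
# The state is read from sysfs (e.g. "up" or "down"); "missing" is
# printed if the interface does not exist.
# SYSFS_NET is a made-up override, only there so the function can be
# tried against a scratch directory.
log_link_state() {
    iface="$1"
    root="${SYSFS_NET:-/sys/class/net}"
    state=$(cat "$root/$iface/operstate" 2>/dev/null || echo missing)
    printf '%s %s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$iface" "$state"
}

# Example: record the state once a minute (interface name is hypothetical):
#   while sleep 60; do log_link_state enp5s0 >> /root/link-state.log; done
```

A log like this shows when the link dropped, which can then be lined up with the iSCSI errors on the console.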
 