TrueNAS Scale system keeps freezing, becomes unresponsive and requires manual shutdown

Kelp07

Dabbler
Joined
Aug 14, 2022
Messages
12
Pretty new to TrueNAS, just started trying it out earlier this year. I'll list my full system specs at the end of the post. Any advice, even troubleshooting ideas, is appreciated! Thank you!

Randomly, about once a day (sometimes more, sometimes less), the system stays powered on but becomes unresponsive. The web GUI is not accessible, all apps stop working, and even direct input does nothing: I can plug in a keyboard and monitor and the console screen is still up, but entering an option doesn't do anything. The SMB shares usually go down too, but sometimes the apps go down while the shares stay up. I know this because I set up a backup Plex server on my laptop using the shares, and sometimes it still works when the TrueNAS Scale Plex app is down. When it freezes, the only way to reboot is to unplug it or hold the power button.

I've tried legacy and UEFI boot, and I have disabled things that are not in use, like the front USB ports, where possible. Also, my kill-a-watt at the surge protector shows it drawing power like normal, and the fans still spin. I have not noticed it happening more often at any particular time, like during a file transfer or Plex stream. It has done it once in the middle of a stream, but usually I find out when I go to access Plex or Nextcloud and they don't work.

I commented on another thread with a similar problem and got the recommendation to replace my boot drive first, so I replaced the single SATA SSD with a mirror of an mSATA SSD and a USB stick. In that process I did a reinstall but reloaded the config. I moved the system dataset to my main HDD "tank" pool. When I first noticed this happening, I had some more storage pools for testing, so I deleted those and unplugged the drives. I've also added more RAM, going from 8 GB to 16 GB.

Like I said, new to TrueNAS, and I would think there'd be a log file or something but the closest I can find is System Settings -> Advanced -> Save Debug. Would having that help?
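
On the log question: TrueNAS Scale is Debian-based and uses systemd, so the journal from the boot that froze may still be readable after a restart. Below is a minimal sketch, assuming the journal is kept across reboots on this install (not confirmed here). The helper names `previous_boot_log_cmd` and `show_previous_boot_log` are made up for illustration; the `journalctl` flags themselves are standard.

```python
import subprocess


def previous_boot_log_cmd(priority="warning", lines=200):
    """Build a journalctl command for the boot before the current one."""
    return [
        "journalctl",
        "-b", "-1",        # previous boot, i.e. the one that froze
        "-p", priority,    # only messages at this priority or more severe
        "-n", str(lines),  # last N entries, closest to the freeze
        "--no-pager",
    ]


def show_previous_boot_log(priority="warning", lines=200):
    """Run the command and print whatever journalctl returns."""
    result = subprocess.run(
        previous_boot_log_cmd(priority, lines),
        capture_output=True,
        text=True,
    )
    # If the journal is not persistent, journalctl reports an error
    # on stderr instead of showing older boots.
    print(result.stdout or result.stderr)
```

Calling `show_previous_boot_log("err")` from the TrueNAS shell right after a forced reboot would show the last errors logged before the freeze, if any were written to disk. A hard hang often leaves nothing behind, so an empty result does not rule anything out.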

Hardware
Everything was received used except for the drives. The motherboard, CPU, half the RAM, and GPU were all pulled from a Dell prebuilt.

Motherboard from Dell XPS 8500 (Not sure the model #)
Core i7-3770
16 GB RAM, running at 1600 MHz
- 2x 4GB sticks of Micron PC3-12800U DDR3
- 2x 4GB sticks of G Skill PC3-12800 DDR3
Radeon HD 7570 1 GB
Very old Nvidia GPU (model unknown), plugged in through PCIe x1 just so I could use the Radeon for the Windows VMs (which I haven't got working, but I've put that problem to the side for now)
Silverstone 700W 80+ Titanium PSU
Storage drives are plugged in via a PCIe-to-SATA adapter. There's one SATA 3 port and three SATA 2 ports on the motherboard.
HDD pool is 3x 4TB Ironwolf drives, RAID-Z1 (one parity)
- This has the system dataset
SSD pool is 2x Crucial BX500 1TB drives, mirrored
- This is my apps pool, and I have a handful of apps running, including Plex and Nextcloud.

Side note, the kill-a-watt says it runs at around 90 - 100W almost constantly. Is that pretty normal, or could it be from the oversized PSU? In Reporting, after startup the CPU is usually barely working. The original Dell prebuilt had a 460W PSU, which I still have. Is there any way to isolate the PSU and see how much power the system is actually using vs. pulling from the wall?
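
For the wall-versus-system question, a rough estimate is possible from the wall reading alone. This is a sketch under assumptions, not a measurement: the 0.90 efficiency is the 80 Plus Titanium minimum at 10% load on 115 V, and the real figure for this particular unit at this load may differ. The function names are illustrative.

```python
def estimate_dc_load(wall_watts, efficiency):
    """Estimate power delivered to the components from a wall reading."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return wall_watts * efficiency


def load_fraction(dc_watts, psu_rated_watts):
    """Fraction of the PSU's rated output that is in use."""
    return dc_watts / psu_rated_watts


wall = 95.0         # midpoint of the 90 - 100W kill-a-watt reading
assumed_eff = 0.90  # ASSUMPTION: Titanium minimum at 10% load, 115 V
dc = estimate_dc_load(wall, assumed_eff)

print(f"Delivered to components: about {dc:.1f} W")
print(f"Lost as heat in the PSU: about {wall - dc:.1f} W")
print(f"Load on a 700W unit: about {load_fraction(dc, 700):.0%}")
```

Under these assumptions only a small share of the wall draw is PSU loss, so most of the 90 - 100W would be the system itself rather than the size of the PSU.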
 

diogen

Explorer
Joined
Jul 21, 2022
Messages
72
Basic rule of thumb I use when troubleshooting PC problems: random reboots are caused by RAM, freezes by graphics, storage...
You have three adapters: onboard, Radeon and Nvidia. Pull the last two and try again...

Your configuration is so far from a recommended one that you shouldn't expect too much support on these forums.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
I would start by following the previous suggestion (too many VGAs) and memtesting each stick of RAM.
Then work on getting the model of each component you are using.
Have fun and good luck.
 

Kelp07

Dabbler
Joined
Aug 14, 2022
Messages
12
Thank you both! I didn't consider the graphics could be causing the problem. I'll try that first.
 

Kelp07

Dabbler
Joined
Aug 14, 2022
Messages
12
Thank you again for the advice! I unplugged both graphics cards yesterday and started it back up. It was still running earlier today, but now it has frozen again. What should I check next?
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
Test the RAM with Memtest. Test each stick, and possibly every socket.
Check the power cables for damage and make sure they are properly connected.
Kelp07 said:
Storage drives are plugged in via pcie to SATA adapter.
Please elaborate.
 

Kelp07

Dabbler
Joined
Aug 14, 2022
Messages
12
Thank you, Davvo! I'll try those.

Regarding the adapter, the motherboard doesn't have enough SATA ports on it, so I bought a PCIe SATA controller... on Amazon... pretty much exactly what that link advises against. I wish I'd done more extensive research.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222

Kelp07

Dabbler
Joined
Aug 14, 2022
Messages
12
I'm happy to report that it's been up over 2 and a half days without freezing! I don't think it's gone longer than a day since this started.
I actually left the expansion card in, because I don't have enough SATA ports on the motherboard, but I moved all the drives I could to the motherboard ports and filled those up first. Now there's only one drive on the expansion card. But that makes sense if the problem was overloading the pci-e lanes. It can handle 1 drive, but not 5.

Thank you so much for the help! This has been giving me gray hairs.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Kelp07 said:
But that makes sense if the problem was overloading the pci-e lanes. It can handle 1 drive, but not 5.
Glad your system is working better, but you should identify the actual problem. That could mean plugging the drives into the PCI-E card one by one until it fails, to prove that one specific port is faulty or that you have a bad data cable: whatever it takes to be 100% certain of the cause. I doubt it's a pci-e lane issue; if it is, then maybe you have a bigger issue at hand. But let's say the pci-e card cannot handle all those drives: is it getting too hot? If that is the case, then what you have done might actually keep it from overheating. You might need to move the card to another slot, if possible, to allow it to have more cooling.
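
For a rough sense of the lane question, the arithmetic can be sketched as below. The card's PCIe generation and lane count are not given in this thread, so the `gen=2, lanes=1` values are assumptions, and the per-lane figures are approximate usable rates after encoding overhead. A saturated shared link would normally show up as slow transfers rather than a hard freeze, which fits the doubt above.

```python
# Approximate usable bandwidth per PCIe lane, in MB/s, by generation.
PCIE_LANE_MBPS = {1: 250, 2: 500, 3: 985}


def per_drive_share(gen, lanes, drives):
    """Bandwidth each drive gets if all drives are busy at once."""
    return PCIE_LANE_MBPS[gen] * lanes / drives


for drives in (1, 5):
    # ASSUMPTION: a PCIe 2.0 x1 card; the real card may differ.
    share = per_drive_share(gen=2, lanes=1, drives=drives)
    print(f"{drives} drive(s): about {share:.0f} MB/s each")
```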

But if you are fine with it running as-is, that is your risk to take. I'm just trying to advise you that you should figure out what the real cause is and make sure it doesn't come back to bite you later.
 

Kelp07

Dabbler
Joined
Aug 14, 2022
Messages
12
Thank you for pointing that out and for the troubleshooting ideas. I'm still so new at this and learning how to troubleshoot, especially hardware. I'm not sure I can be 100% sure of the cause because I have changed multiple variables, but I'll try to identify it. How would I go about checking the temperature? I only know how to check the CPU temp. Thanks!
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Kelp07 said:
How would I go about checking the temperature?
Well, it's not very accurate, but touch it. Be careful, because some chips can burn you, no kidding. The more scientific method is an IR thermometer. But in reality, if it feels hot to the touch (more than just very warm), then it could be an issue.
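
For the sensors the system can report on its own, Linux exposes temperatures under `/sys/class/hwmon`, in millidegrees Celsius. The sketch below lists whatever is there. It is an assumption that this is useful for the card itself: a cheap SATA controller most likely has no sensor the kernel can see, so this would cover the CPU, the board and possibly the drives, and touch or an IR thermometer remains the way to check the card. The function name is illustrative.

```python
from pathlib import Path


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {"hwmonN:chip/sensor": degrees_C} for each exposed sensor."""
    root_path = Path(root)
    temps = {}
    if not root_path.is_dir():
        return temps  # no hwmon tree on this system
    for chip in sorted(root_path.glob("hwmon*")):
        name_file = chip / "name"
        chip_name = name_file.read_text().strip() if name_file.exists() else "unknown"
        for sensor in sorted(chip.glob("temp*_input")):
            try:
                millideg = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # some sensors fail when read; skip them
            temps[f"{chip.name}:{chip_name}/{sensor.name}"] = millideg / 1000.0
    return temps
```

Printing the items of `read_hwmon_temps()` from the TrueNAS shell gives one line per sensor; which chips appear depends on the kernel drivers that are loaded.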

Change one thing at a time and give the system time to fail. If it doesn't fail, then make another change. In a perfect world you will be able to repeat the failure. And take notes on what you do so that if this takes 3 weeks to solve, you have good notes to rely upon.
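
One way to make those notes more precise is to have the system record a timestamp every minute, so that after a freeze the last entry shows roughly when it stopped. This is a minimal sketch, not a TrueNAS feature; the function names are made up, and the log location is up to you.

```python
import os
from datetime import datetime, timezone


def append_heartbeat(path, now=None):
    """Append one UTC timestamp line and force it to disk."""
    now = now or datetime.now(timezone.utc)
    with open(path, "a") as f:
        f.write(now.isoformat() + "\n")
        f.flush()
        os.fsync(f.fileno())  # make sure the line survives a hard freeze


def last_heartbeat(path):
    """Return the most recent timestamp in the file, or None if empty."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    return datetime.fromisoformat(lines[-1]) if lines else None
```

Run `append_heartbeat` once a minute from a scheduled job, then call `last_heartbeat` on the same file after a forced reboot. Writing that time next to the change you were testing gives each entry in your notes a failure time to go with it.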

And don't overthink anything. Keep it all simple. If the problem comes back, then reverse what you did and verify the problem clears.

Of course, you could just let it run as-is since it's working, but I personally would not. Some people would. It's your call.
 