Hi all,
First time poster, very new to TrueNAS, and should probably get a couple things out of the way:
- I'm familiar with FreeBSD/*nixes and the command line in general, but I'm not too experienced with more complex system administration.
- My hardware (below) is definitely not a recommended configuration, because a) I was trying to put something together using (mostly) parts I already had handy and b) the silicon shortage seems to be making all the "better" equipment rather expensive.
- TrueNAS-12.0-U2.1 (Core)
- AMD Ryzen 5 1600
- ASRock B450 PRO4
- 16GB Crucial DDR4 2666 (non-ECC)
- Intel EXPI9301CTBLK PCI-E Gigabit Ethernet (mobo NIC is Realtek; not used)
- GIGABYTE HD Experience Series GeForce 210 (no integrated graphics)
- 180GB SATA SSD (boot disk)
- 3TB 7200 RPM HDD x 2
- 4TB 5400 RPM HDD x 2
Everything works great! ...for around one to three days. The web portal is accessible, SSH connections work, my sync and snapshot jobs run correctly, my NFS shares are available, and the MySQL DB I'm running in a jail does its job perfectly.
Then, at some point I can't determine exactly, the system becomes completely unresponsive. The web portal and SSH connections time out, none of the scheduled tasks happen, and the machine won't even respond to the hardware power button (which usually triggers a "graceful" shutdown). My only option is the reboot button on the case.
I've reviewed /var/log/messages each time this happens, and I see no evidence of errors. I see messages from the boot process, a couple of "configuration reload" notices (one for each night the machine was up), and then... another boot message.
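One way to at least bracket when the hang happens is to pull out the last log entry written before each boot message. Below is a minimal sketch in Python. It assumes the FreeBSD kernel's `---<<BOOT>>---` line marks the start of each boot in /var/log/messages; that marker, and the function names `last_lines_before_boot` and `report`, are assumptions for illustration, so adjust the marker to whatever the boot lines in the log actually look like.

```python
from typing import Iterable, List, Tuple

# Assumption: a line containing this text is logged at the start of every boot.
# Change it if the boot lines in your log look different.
BOOT_MARKER = "---<<BOOT>>---"


def last_lines_before_boot(
    lines: Iterable[str], marker: str = BOOT_MARKER
) -> List[Tuple[str, str]]:
    """Return (last entry before the boot, boot line) pairs, in log order."""
    pairs = []
    previous = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines so they never count as the "last entry"
        if marker in line and previous is not None:
            pairs.append((previous, line))
        previous = line
    return pairs


def report(path: str = "/var/log/messages") -> None:
    """Print the last entry logged before each boot found in the file."""
    with open(path, errors="replace") as log:
        for before, boot in last_lines_before_boot(log):
            print("last entry before boot:", before)
            print("boot line:             ", boot)
            print()
```

Calling `report()` on the NAS prints one pair per reboot. The gap between the two timestamps in each pair is the window in which the machine stopped logging; it won't say why, but it shows whether the hangs cluster around a scheduled job or a particular time of day.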
After the last failure, I threw a graphics card in the machine and hooked up a monitor to see if that would provide anything useful, but there seems to be no video signal when the machine is in this state.
So, my question: what are my next steps for debugging? As I said, I'm inexperienced as a sysadmin, and besides looking for log messages, I don't really know where to start.
I'm guessing the response here will be "sounds like a hardware issue", and that's fair, but I'd really like to see if I can pinpoint the problem more exactly before I give up on this setup entirely.
Thank you,
- Alex