benk87 (Cadet, joined Jan 11, 2023, 6 messages)
First, thank you for taking the time to read.
Background:
- Home system for personal documents/photos
- Currently not dependent on this build for any storage due to current issue
- I'm new to FreeBSD, but generally comfortable on the command line (Windows developer)
- Cribbed this build from this video, which, judging by what I've read since visiting these forums, may have been a mistake. Sigh.
- I have not been able to reproduce the problem consistently, which has made it hard to fix myself
- I did the following to log all console output to a file, because the messages below were not showing up anywhere in the /var/log files that I could find. They still were not logged as expected.
I am happy to supply serial numbers upon request, just didn't want to dismantle the box to get to all of them:
- ASUS ROG Strix B550-I Gaming AMD AM4
- AMD Ryzen 3 3100 4-Core
- SilverStone Technology ECS07 (provides 2 more SATA ports; 2 of the WD Gold disks are on this)
- G.Skill RipJaws V Series 16GB (2 x 8GB) 288-Pin SDRAM PC4-28800 DDR4 3600
- Kingston 120GB A400 SATA 3 2.5" Internal SSD SA400S37/120G (OS SSD)
- Western Digital 4TB WD Gold (x5, ZFS pool)
- Intel Optane Memory H10 32GB with SSD Solid State Storage 512GB HBRPEKNX0202AC
- Cooler Master V850 SFX Gold Full Modular, 850W
- JONSBO N1 Mini-ITX NAS Chassis
I am happy to supply other configuration/log information as needed.
- TrueNAS Version: TrueNAS-13.0-U3.1
- Previously had a SyncThing jail installed but removed it; no jails are currently installed
Description:
The problem usually starts within 24 hours of a reboot.
When the problem occurs, TrueNAS becomes unresponsive. I am unable to see it on the network, and the web UI does not connect, so I can't reach the login page. When I attach a monitor to the physical box, I see the following cascade of error messages on the console, and the console does not respond to the keyboard:
uhub_reattach_port: giving up port 13 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 14 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 1 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 2 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 3 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 4 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 5 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 6 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 7 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 8 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 9 reset - device vanished: change 0xfb status 0x7fb
uhub_reattach_port: giving up port 10 reset - device vanished: change 0xfb status 0x7fb
Jan 11 13:04:05 vault 1 2023-01-11T13:04:05.447511-08:00 vault.local collectd 1457 - - plugin_dispatch_values: Low water mark reached. Dropping 100% of metrics.
Please see 'console-image.jpg' attached for more context. I would supply the full text but it's not captured in the /var/log files.
- I've found similar but not identical error messaging here on the forums
- I have an intuition this is likely a hardware issue (or maybe a BIOS setting?), based on the problems people tend to have with hardware that is incompatible with TrueNAS, but I'm interested in a professional's opinion.
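In case it helps anyone compare against similar reports, here is a minimal Python sketch that summarizes the `uhub_reattach_port` lines quoted above: which port numbers appear and which change/status codes they carry. The line format is taken only from the messages in this post, the function name `summarize` is made up for illustration, and because these messages never reached /var/log the input would have to be typed out from the console photo.

```python
import re
from collections import Counter

# Pattern based solely on the console lines quoted in this post, e.g.:
# uhub_reattach_port: giving up port 13 reset - device vanished: change 0xfb status 0x7fb
PATTERN = re.compile(
    r"uhub_reattach_port: giving up port (\d+) reset - device vanished: "
    r"change (0x[0-9a-fA-F]+) status (0x[0-9a-fA-F]+)"
)


def summarize(lines):
    """Return (sorted affected port numbers, Counter of (change, status) pairs).

    Lines that do not match the pattern (for example the collectd message)
    are ignored.
    """
    ports = set()
    codes = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            ports.add(int(match.group(1)))
            codes[(match.group(2), match.group(3))] += 1
    return sorted(ports), codes
```

Passing it any iterable of strings works, for example the lines of a text file transcribed from the photo. If every port on the hub reports the same change/status pair, as in the lines quoted here, that at least suggests the whole hub or controller dropped out at once rather than a single attached device.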
Things I've Tried
- Replacing the M.2 SATA expansion card. I previously used this one, but I believe I was already having the issue at that time (I've been debugging this on and off for a while)
- Moving the Kingston OS drive off the ECS07 expansion card and connecting it directly to the onboard SATA connectors
- Setting up logging of all console messages to try to capture the full log, since PGUP/PGDOWN stops working once this flood of errors starts. The logs did not contain the messages shown in the attached picture
- Checking the hard drives' SMART results (they looked fine to me but, again, amateur hour here; I can provide them)
Resolutions
Currently the only thing that 'resolves' this issue is a hard reboot.