donkmeister
I'm hoping for guidance on how to check that I have actually fixed a problem, rather than just having been lucky enough that it hasn't bitten me since the last reboot.
The web interface for my TrueNAS-SCALE-22.12.3.2 box stopped responding today whilst I was out and about. When I got home I hooked up a monitor, and the console reported that the issue was with the boot-pool, which the system then deactivated. I can't recall the precise wording, I'm afraid.
I rebooted and kept an eye on the console. Sure enough, after 10-15 minutes of faultless operation I started to see timeouts for the boot drive (nvme1 is my boot drive), as follows:
nvme nvme1: I/O 105 QID 4 timeout, aborting
nvme nvme1: I/O 106 QID 4 timeout, aborting
nvme nvme1: I/O 47 QID 1 timeout, aborting
nvme nvme1: I/O 114 QID 1 timeout, aborting
nvme nvme1: I/O 115 QID 1 timeout, aborting
nvme nvme1: I/O 105 QID 4 timeout, reset controller
nvme nvme1: I/O 10 QID 0 timeout, reset controller
Then the console stopped responding so I did a hard power-down.
The Chelsio NIC is a used part. I bought three from a used equipment seller and one was DOA so I wondered if the one installed in the TN box was starting to fail. I powered up, set TN to use the motherboard's onboard ethernet, shut down and removed the NIC. I've now had the TN box running for a little while without the used NIC and it hasn't repeated the failure.
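To make "it hasn't repeated the failure" a bit more checkable, here is a minimal sketch of how the kernel log could be scanned for the same messages. It assumes the log text has been saved from something like `dmesg` and that the lines look like the ones quoted above; the function name `count_nvme_timeouts` and the `SAMPLE` text are just illustrative, not anything TrueNAS provides.

```python
import re
from collections import Counter

# Matches kernel lines of the form quoted above, e.g.:
#   nvme nvme1: I/O 105 QID 4 timeout, aborting
TIMEOUT_RE = re.compile(
    r"nvme (?P<dev>nvme\d+): I/O (?P<tag>\d+) QID (?P<qid>\d+) "
    r"timeout, (?P<action>.+)$"
)


def count_nvme_timeouts(log_text):
    """Return a Counter keyed by (device, action) for NVMe timeout lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line.strip())
        if match:
            counts[(match["dev"], match["action"])] += 1
    return counts


# Sample input: the console lines from this post (assumed format).
SAMPLE = """\
nvme nvme1: I/O 105 QID 4 timeout, aborting
nvme nvme1: I/O 106 QID 4 timeout, aborting
nvme nvme1: I/O 47 QID 1 timeout, aborting
nvme nvme1: I/O 114 QID 1 timeout, aborting
nvme nvme1: I/O 115 QID 1 timeout, aborting
nvme nvme1: I/O 105 QID 4 timeout, reset controller
nvme nvme1: I/O 10 QID 0 timeout, reset controller
"""

if __name__ == "__main__":
    for (dev, action), n in sorted(count_nvme_timeouts(SAMPLE).items()):
        print(f"{dev}: {n} x {action}")
```

An empty result after a day or two of uptime wouldn't prove the NIC was the culprit, but a non-empty one would show straight away that the problem is still there.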
Any guidance appreciated, thank you.