russellkwr
Cadet
- Joined
- Mar 5, 2024
- Messages
- 1
I am getting a strange error related to "disktemp.py" that results in "pool 'boot-pool' has encountered an uncorrectable I/O failure and has been suspended". At that point the server locks up. When I force a restart, everything boots normally, as if there were no issues.
All hardware is brand new. I don't suspect this is a hardware issue. Boot pool is a single Intel Optane SSD.
I have seen threads on this being SNMP related. I do not run SNMP and the service is disabled.
CPU usage is minimal - almost always close to 0%.
I have one storage pool with SMB access. I have one Ubuntu VM running Pi-hole.
Any help is appreciated.
Thanks!
Mar 5 06:18:11 maxwell 1 2024-03-05T06:18:07.755119-08:00 maxwell.local collectd 2085 - - Traceback (most recent call last):
  File "/usr/local/lib/collectd_pyplugins/disktemp.py", line 62, in read
    with Client() as c:
  File "/usr/local/lib/python3.9/site-packages/middlewared/client/client.py", line 286, in __init__
    self._ws.connect()
  File "/usr/local/lib/python3.9/site-packages/middlewared/client/client.py", line 124, in connect
    rv = super(WSClient, self).connect()
  File "/usr/local/lib/python3.9/site-packages/ws4py/client/__init__.py", line 223, in connect
    bytes = self.sock.recv(128)
socket.timeout: timed out
Mar 5 06:19:11 maxwell ahcich4: Timeout on slot 20 port 0
Mar 5 06:19:11 maxwell ahcich4: is 00000000 cs 00300000 ss 00000000 rs 00300000 tfd c0 serr 00000000 cmd 0004d417
Mar 5 06:19:11 maxwell (ada0:ahcich4:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 0040 00 00 00 00 00 00
Mar 5 06:19:11 maxwell (ada0:ahcich4:0:0:0): CAM status: Command timeout
Mar 5 06:19:11 maxwell (ada0:ahcich4:0:0:0): Retrying command, 0 more tries remain
Mar 5 06:19:11 maxwell xhci0: Resetting controller
Mar 5 06:19:11 maxwell ahcich5: Timeout on slot 9 port 0
Mar 5 06:19:11 maxwell ahcich5: is 00000000 cs 00000600 ss 00000000 rs 00000600 tfd c0 serr 00000000 cmd 0004c917
Mar 5 06:19:11 maxwell (ada1:ahcich5:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 0040 00 00 00 00 00 00
Mar 5 06:19:11 maxwell (ada1:ahcich5:0:0:0): CAM status: Command timeout
Mar 5 06:19:11 maxwell (ada1:ahcich5:0:0:0): Retrying command, 0 more tries remain
Mar 5 06:20:18 maxwell uhub0: at usbus0, port 1, addr 1 (disconnected)
Mar 5 06:20:18 maxwell ugen0.2: <vendor 0x05e3 USB2.0 Hub> at usbus0 (disconnected)
Mar 5 06:20:18 maxwell uhub1: at uhub0, port 9, addr 1 (disconnected)
Mar 5 06:20:18 maxwell uhub1: detached
Mar 5 06:20:18 maxwell uhub0: detached
Mar 5 06:20:18 maxwell uhub0 on usbus0
Mar 5 06:20:18 maxwell uhub0: <Intel XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
Mar 5 06:21:13 maxwell ahcich6: Timeout on slot 1 port 0
Mar 5 06:21:13 maxwell ahcich6: is 00000000 cs 00000000 ss 00000006 rs 00000006 tfd 40
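For anyone triaging a log like this, here is a rough sketch of how the AHCI timeouts could be tallied per channel, to see whether one port or several are affected. The sample lines are copied from the log above. The function name `count_ahci_timeouts` and the regular expression are my own assumptions for illustration and are not part of TrueNAS or FreeBSD.

```python
import re
from collections import Counter

# Sample lines copied from the log above. On a live system you would read
# the real syslog file instead (its location is not shown in this post).
LOG = """\
Mar 5 06:19:11 maxwell ahcich4: Timeout on slot 20 port 0
Mar 5 06:19:11 maxwell (ada0:ahcich4:0:0:0): CAM status: Command timeout
Mar 5 06:19:11 maxwell xhci0: Resetting controller
Mar 5 06:19:11 maxwell ahcich5: Timeout on slot 9 port 0
Mar 5 06:19:11 maxwell (ada1:ahcich5:0:0:0): CAM status: Command timeout
Mar 5 06:21:13 maxwell ahcich6: Timeout on slot 1 port 0
"""

# Hypothetical pattern: matches only the "ahcichN: Timeout on slot X port Y"
# lines, not the follow-up CAM status lines for the same event.
TIMEOUT_RE = re.compile(r"\b(ahcich\d+): Timeout on slot (\d+) port (\d+)")


def count_ahci_timeouts(text):
    """Return a Counter mapping AHCI channel name to number of timeouts."""
    counts = Counter()
    for line in text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for channel, n in sorted(count_ahci_timeouts(LOG).items()):
        print(channel, n)
```

In this excerpt the timeouts are spread over three separate channels (ahcich4, ahcich5, ahcich6) rather than concentrated on one disk.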