Hi all!
I am going crazy with a new, clean TrueNAS deployment.
CPU: AMD 5950X
Mainboard: X570-F
RAM: 3200 MHz ECC (the mainboard supports ECC RAM)
1 x 3.2 TB Samsung 1725b NVMe U.2
1 SSD used as the boot disk
I have a single volume on the Samsung 1725b.
The system can't stay online for more than 24 hours. The log shows:
Mar 21 00:04:31 truenas kernel: nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
Mar 21 00:04:31 truenas kernel: block nvme0n1: no usable path - requeuing I/O
Mar 21 00:04:31 truenas kernel: block nvme0n1: no usable path - requeuing I/O
Mar 21 00:04:33 truenas kernel: nvme nvme0: Shutdown timeout set to 10 seconds
Mar 21 00:04:34 truenas kernel: nvme nvme0: 32/0/0 default/read/poll queues
Mar 21 00:04:42 truenas kernel: block nvme0n1: no usable path - requeuing I/O
Mar 21 00:05:04 truenas kernel: block nvme0n1: no usable path - requeuing I/O
Mar 21 00:05:06 truenas kernel: nvme nvme0: I/O 917 QID 11 timeout, disable controller
Mar 21 00:05:06 truenas kernel: nvme nvme0: failed to mark controller live state
Mar 21 00:05:06 truenas kernel: nvme nvme0: Removing after probe failure status: -19
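To make the first log line easier to read, here is a rough sketch that decodes its two register values. The CSTS bit layout follows the NVMe base specification and the PCI status bits follow the PCI specification; the function names (`parse_reset_line`, `decode_csts`, `pci_status_has_errors`) are my own and purely illustrative, not part of any tool.

```python
import re

# NVMe Controller Status (CSTS) single-bit flags, per the NVMe base specification.
CSTS_FLAGS = {
    0: "RDY (controller ready)",
    1: "CFS (controller fatal status)",
    4: "NSSRO (NVM subsystem reset occurred)",
    5: "PP (processing paused)",
}

# SHST is a 2-bit field in CSTS bits 3:2.
SHST_STATES = {
    0: "normal operation",
    1: "shutdown in progress",
    2: "shutdown complete",
    3: "reserved",
}

# PCI Status register error bits: 8 (master data parity error),
# 11-13 (target/master aborts), 14 (signaled system error),
# 15 (detected parity error). Bit 4 (0x10) is only "capabilities list".
PCI_STATUS_ERROR_MASK = 0xF900


def parse_reset_line(line):
    """Return (csts, pci_status) as ints, or None if the line has neither."""
    match = re.search(r"CSTS=(0x[0-9a-fA-F]+), PCI_STATUS=(0x[0-9a-fA-F]+)", line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def decode_csts(value):
    """List the CSTS flags that are set, followed by the shutdown state."""
    flags = [name for bit, name in CSTS_FLAGS.items() if value & (1 << bit)]
    flags.append("SHST: " + SHST_STATES[(value >> 2) & 0b11])
    return flags


def pci_status_has_errors(value):
    """True if any PCI Status error bit is set."""
    return bool(value & PCI_STATUS_ERROR_MASK)


if __name__ == "__main__":
    line = ("Mar 21 00:04:31 truenas kernel: nvme nvme0: controller is down; "
            "will reset: CSTS=0x3, PCI_STATUS=0x10")
    csts, pci_status = parse_reset_line(line)
    print(decode_csts(csts))
    print(pci_status_has_errors(pci_status))
```

Read this way, CSTS=0x3 means the controller reports both ready and fatal status, while PCI_STATUS=0x10 is just the capabilities-list bit with no PCI error bits set.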
I suspect an issue with some Samsung controllers, as described in this Red Hat article:

NVMe controller reset and getting IO errors in RHEL8 - Red Hat Customer Portal (access.redhat.com)
Below messages are logged at the time of issue:
kernel: nvme nvme1: I/O 423 QID 29 timeout, reset controller
kernel: nvme nvme1: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
kernel: nvme nvme1: Device not ready; aborting reset, CSTS=0x3
kernel: blk_update_request: I/O error, dev...
The same disk works without issues under Clear Linux.
The system is almost empty and runs only a few Docker containers to test stability (Shinobi CCTV, Uptime Kuma, Traefik).