NVME failing

rofe

Cadet
Joined
Aug 1, 2022
Messages
5
Hi,

I've made a clean install of TrueNAS Core 13.

I have two NVME drives in the server. The first one seems to work without problems - it's the system drive.

The second fails during boot or after some time.

When the server boots without error I can see the drive in the admin panel. If it fails during boot or after some time, the drive disappears.

I get this error in dmesg:

nvme1: RECOVERY_START 143132164413 vs 142608551277
nvme1: Controller in fatal status, resetting
nvme1: Resetting controller due to a timeout and possible hot unplug.
nvme1: RECOVERY_WAITING
nvme1: resetting controller
nvme1: failing outstanding i/o
nvme1: READ sqid:2 cid:127 nsid:1 lba:32 len:224
nvme1: ABORTED - BY REQUEST (00/07) sqid:2 cid:127 cdw0:0
nvme1: failing outstanding i/o
nvme1: READ sqid:2 cid:126 nsid:1 lba:544 len:224
nvme1: ABORTED - BY REQUEST (00/07) sqid:2 cid:126 cdw0:0
nvme1: failing outstanding i/o
nvme1: READ sqid:2 cid:125 nsid:1 lba:500117024 len:224
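
In case it helps anyone triaging similar output, here is a minimal Python sketch that tallies controller resets, aborted commands and the commands printed as failed, per controller. It assumes the messages look exactly like the lines quoted above; the names `summarize` and `SAMPLE` are made up for illustration, and other nvme messages are simply ignored.

```python
import re
from collections import defaultdict

# Patterns cover only the message shapes quoted in this post (an assumption);
# anything else in the log is skipped.
CMD_RE = re.compile(
    r"(nvme\d+): (READ|WRITE) sqid:(\d+) cid:(\d+) nsid:(\d+) lba:(\d+) len:(\d+)"
)
ABORT_RE = re.compile(r"(nvme\d+): ABORTED - BY REQUEST")
RESET_RE = re.compile(r"(nvme\d+): resetting controller$")


def summarize(log_text):
    """Return per-controller counts of resets, aborts and failed commands."""
    summary = defaultdict(lambda: {"resets": 0, "aborted": 0, "failed_io": []})
    for raw in log_text.splitlines():
        line = raw.strip()
        reset = RESET_RE.match(line)
        abort = ABORT_RE.match(line)
        cmd = CMD_RE.match(line)
        if reset:
            summary[reset.group(1)]["resets"] += 1
        elif abort:
            summary[abort.group(1)]["aborted"] += 1
        elif cmd:
            # Keep the opcode, starting LBA and length of each failed command.
            summary[cmd.group(1)]["failed_io"].append(
                (cmd.group(2), int(cmd.group(6)), int(cmd.group(7)))
            )
    return dict(summary)


# A few lines copied from the dmesg excerpt above.
SAMPLE = """\
nvme1: Controller in fatal status, resetting
nvme1: resetting controller
nvme1: failing outstanding i/o
nvme1: READ sqid:2 cid:127 nsid:1 lba:32 len:224
nvme1: ABORTED - BY REQUEST (00/07) sqid:2 cid:127 cdw0:0
nvme1: failing outstanding i/o
nvme1: READ sqid:2 cid:126 nsid:1 lba:544 len:224
nvme1: ABORTED - BY REQUEST (00/07) sqid:2 cid:126 cdw0:0
"""

if __name__ == "__main__":
    for ctrl, info in summarize(SAMPLE).items():
        print(ctrl, info["resets"], "reset(s),", info["aborted"], "aborted,",
              len(info["failed_io"]), "failed command(s)")
```

Feeding it the saved output of dmesg would show whether only nvme1 is affected and how often it resets.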

I don't know how to move on from this - any advice?

Regards
Ronni
 

Benni.blanko

Dabbler
Joined
Dec 18, 2021
Messages
31
I have a very similar problem with two NVMe drives in a boot-pool mirror.
One drive or the other fails after anywhere between 9 and 40 days.
Check out this thread here in the forum:
https://www.truenas.com/community/t...pool-controller-resets-due-to-timeouts.101519
The last thing I tried was setting the tunable hw.nvme.per_cpu_io_queues=0.
But this may hurt your drive's performance (no problem on a boot device, but it might hurt on a data pool).
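
For anyone unsure where such a tunable goes: on plain FreeBSD it is a single line in /boot/loader.conf, as sketched below. On TrueNAS Core I would assume the safer route is adding it in the web UI under System > Tunables with type LOADER rather than editing the file by hand; that part is my assumption, not something confirmed in this thread. Either way it should only take effect after a reboot.

```
# /boot/loader.conf fragment (sketch, not a verified fix)
# Use one shared I/O queue pair instead of one per CPU.
hw.nvme.per_cpu_io_queues="0"
```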

14 days uptime so far ...

Regards
Benni
 

rofe

Hi Benni,

I will go through the thread and see if it can help - I have not done anything yet since I'm new to both TrueNAS and FreeBSD.

Thank you for helping out.

Regards
Ronni
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Since you are just starting with TrueNAS, I'm curious whether you would try TrueNAS Scale. It's built on Debian (Linux) rather than FreeBSD (BSD), which might eliminate the issue without any tweaking. Again, I'm just offering a possible solution; for all I know the problem will still exist. Also, do a clean install if you decide to try Scale out.
 

rofe

I use TrueNAS Core (FreeBSD). I have some experience with Linux, mostly with Fedora, and my experience with Debian has not been so good.
 

joeschmuck

If you are not programming then the operation of Scale should not matter. But it was only a suggestion.
 