NVME disk - Error "number of Error Log entries increased"

oregonspain

Cadet
Joined
Feb 23, 2022
Messages
4
Hello.


I have a Dell server with TrueNAS installed:

FreeBSD 12.2-RELEASE-p9 2ee62d665f0(HEAD) TRUENAS

We have a pool of three NVMe disks, and two of them report:

Device: /dev/nvme0, number of Error Log entries increased from 1 to 2

Device: /dev/nvme2, number of Error Log entries increased from 3 to 4

 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Okay. And...?

Your flash devices experienced errors. Is there a specific question here? Do you panic? No. Do you inspect the SMART stats? Yes. Do you post detailed hardware information, per the Forum Rules, so that you're more likely to get a useful answer? Highly recommended.
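For anyone unsure what "inspect the SMART stats" looks like in practice, here is a minimal Python sketch, not an official tool. It assumes the plain-text layout that `smartctl -a` prints for NVMe devices and pulls out a handful of health fields. The function names `nvme_health_fields` and `media_looks_healthy`, the choice of fields, and the heuristic itself are assumptions made for illustration; the sample values are copied from smartctl output posted in this thread.

```python
import re

# Fields worth a first look when smartd reports new error-log entries.
# This selection is an assumption for illustration, not an official list.
FIELDS = (
    "Critical Warning",
    "Available Spare",
    "Available Spare Threshold",
    "Percentage Used",
    "Media and Data Integrity Errors",
    "Error Information Log Entries",
)


def nvme_health_fields(smartctl_text):
    """Return {field: int} for the FIELDS found in `smartctl -a` text."""
    found = {}
    for line in smartctl_text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if not sep or name not in FIELDS:
            continue
        # Values look like "0x00", "100%", "132" or "1,234";
        # keep only the leading number.
        match = re.match(r"\s*(0x[0-9a-fA-F]+|[\d,]+)", value)
        if match:
            token = match.group(1).replace(",", "")
            found[name] = int(token, 16) if token.startswith("0x") else int(token)
    return found


def media_looks_healthy(fields):
    """Rough heuristic: no critical warning, no media errors, spare above threshold."""
    return (
        fields.get("Critical Warning", 0) == 0
        and fields.get("Media and Data Integrity Errors", 0) == 0
        and fields.get("Available Spare", 100) > fields.get("Available Spare Threshold", 0)
    )


sample = """\
Critical Warning:                   0x00
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    0%
Media and Data Integrity Errors:    0
Error Information Log Entries:      132
"""

fields = nvme_health_fields(sample)
print(fields)
print("media looks healthy:", media_looks_healthy(fields))
```

A rising "Error Information Log Entries" count alongside zero media errors and a clean critical warning suggests commands the drive rejected rather than failing flash, but treat that as a reading of the numbers, not a guarantee.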
 

SirNomad49

Cadet
Joined
Jul 10, 2021
Messages
9
EDIT: added additional sources

Hello,
I just installed a new NVMe drive, which reports the same errors:



SMART State:

Code:
root@truenas[/var/log]# smartctl -a /dev/nvme0
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.10.109+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       CT250P2SSD8
Serial Number:                      2212E61EDB67
Firmware Version:                   P2CR048
PCI Vendor/Subsystem ID:            0xc0a9
IEEE OUI Identifier:                0x00a075
Total NVM Capacity:                 250,059,350,016 [250 GB]
Unallocated NVM Capacity:           0
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          250,059,350,016 [250 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            00a075 6000000143
Local Time is:                      Mon Jun  6 17:11:59 2022 CEST
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.50W       -        -    0  0  0  0        0       0
 1 +     1.90W       -        -    1  1  1  1        0       0
 2 +     1.50W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3     5000    1900
 4 -   0.0020W       -        -    4  4  4  4    13000  100000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        34 Celsius
Available Spare:                    100%
Available Spare Threshold:          5%
Percentage Used:                    0%
Data Units Read:                    22 [11.2 MB]
Data Units Written:                 152 [77.8 MB]
Host Read Commands:                 307
Host Write Commands:                792
Controller Busy Time:               0
Power Cycles:                       1
Power On Hours:                     1
Unsafe Shutdowns:                   0
Media and Data Integrity Errors:    0
Error Information Log Entries:      132
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               47 Celsius

Error Information (NVMe Log 0x01, 16 of 16 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0        132     0  0xd007  0x4005      -            0     0     -
  1        131     0  0xd006  0x4005      -            0     0     -
  2        130     0  0x000a  0x4005      -            0     0     -
  3        129     0  0xb00d  0x4005      -            0     0     -
  4        128     0  0xf00a  0x4004      -            0     0     -
  5        127     0  0xf009  0x4004      -            0     0     -
  6        126     0  0xa00c  0x4004      -            0     0     -
  7        125     0  0x900f  0x4004      -            0     0     -
  8        124     0  0xa006  0x4004      -            0     0     -
  9        123     0  0xa005  0x4004      -            0     0     -
 10        122     0  0xe017  0x4004      -            0     0     -
 11        121     0  0xd00b  0x4004      -            0     0     -
 12        120     0  0x800d  0x4005      -            0     0     -
 13        119     0  0xe014  0x4005      -            0     0     -
 14        118     0  0xc00b  0x4005      -            0     0     -
 15        117     0  0xc00a  0x4005      -            0     0     -
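
The two Status values in the table above, 0x4004 and 0x4005, can be decoded by hand. The following is a minimal Python sketch, assuming this smartctl version prints the raw 16-bit status word from the error log entry, where bit 0 is the phase tag and bits 15:1 hold the status field as laid out in the NVMe base specification. The function name `decode_status` and the deliberately short lookup table are assumptions for illustration.

```python
# A few generic command status codes (status code type 0) from the NVMe
# base specification. The table is intentionally incomplete.
GENERIC_STATUS = {
    0x00: "Successful Completion",
    0x01: "Invalid Command Opcode",
    0x02: "Invalid Field in Command",
}


def decode_status(raw):
    """Split a raw 16-bit error-log status word into its parts.

    Assumes bit 0 is the phase tag and bits 15:1 are the status field.
    """
    phase = raw & 0x1              # bit 0: phase tag
    status = raw >> 1              # bits 15:1: 15-bit status field
    sc = status & 0xFF             # status code
    sct = (status >> 8) & 0x7      # status code type (0 = generic)
    more = bool(status & 0x2000)   # "More" bit
    dnr = bool(status & 0x4000)    # "Do Not Retry" bit
    if sct == 0:
        name = GENERIC_STATUS.get(sc, "unknown generic status")
    else:
        name = "non-generic status"
    return {"phase": phase, "sct": sct, "sc": sc,
            "more": more, "dnr": dnr, "name": name}


for raw in (0x4004, 0x4005):
    print(hex(raw), decode_status(raw))
```

Under these assumptions both values decode to status code 0x02 (Invalid Field in Command) and differ only in the phase tag, which would be consistent with nvme-cli labelling both as INVALID_FIELD.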





nvme error-log:

It looks like the same issue as:

https://github.com/linux-nvme/nvme-cli/issues/1130 & https://github.com/linux-nvme/nvme-cli/issues/411
https://www.smartmontools.org/ticket/1222#no1 & https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900244 & https://bugs.launchpad.net/ubuntu/+source/smartmontools/+bug/1878264


Code:
root@truenas[/var/log]# nvme error-log /dev/nvme0
Error Log Entries for device:nvme0 entries:16
.................
 Entry[ 0]
.................
error_count     : 136
sqid            : 0
cmdid           : 0xc002
status_field    : 0x4004(INVALID_FIELD: A reserved coded value or an unsupported value in a defined field)
parm_err_loc    : 0xffff
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 1]
.................
error_count     : 135
sqid            : 0
cmdid           : 0xc00d
status_field    : 0x4004(INVALID_FIELD: A reserved coded value or an unsupported value in a defined field)
parm_err_loc    : 0xffff
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 2]
.................
error_count     : 134
sqid            : 0
cmdid           : 0xf006
status_field    : 0x4005(INVALID_FIELD: A reserved coded value or an unsupported value in a defined field)
parm_err_loc    : 0xffff
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 3]
.................
error_count     : 133
sqid            : 0
cmdid           : 0xf005
status_field    : 0x4005(INVALID_FIELD: A reserved coded value or an unsupported value in a defined field)
parm_err_loc    : 0xffff
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 4]
.................



root@truenas[/var/log]# nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     2212E61EDB67         CT250P2SSD8                              1         250.06 GB / 250.06 GB      512 B + 0 B      P2CR048
/dev/nvme1n1     50026B7684C76DB6     KINGSTON SA2000M8250G                    1         250.05 GB / 250.06 GB      512 B + 0 B      S5Z42105
 