TrueNAS installation killed Optane drive

hixie

Cadet
Joined
Nov 29, 2023
Messages
7
Hi everyone, first post here. I hope everyone is doing well.

I've been a long-time user of TrueNAS and have quite a few production systems running. Today I tried installing TrueNAS CORE on a 16GB Optane drive on one of my test systems, via a USB enclosure.
Test System:
Sun X4-2L
Dual Intel E5-2660v2
128GB RAM
Lexar E300 NVMe to USB enclosure

After selecting the drive to install to, I get this error:

Code:
diskinfo: da0: ioctl(DIOCGMEDIASIZE) failed, probably not a disk.
[: -lt: unexpected operator
gmirror: No such device: swap.
dd: /dev/da0: Device not configured
1+0 records in
0+0 records out
0 bytes transferred in 0.000652 secs (0 bytes/sec)
The TrueNAS installation on da0 failed. Press enter to continue.


When I reboot, the drive is no longer recognized by the installer.
I removed the drive and installed it in a Win10 machine to see if Win10 would recognize it. The drive is recognized and given a drive letter, but there are no partitions and no available space on the drive. Diskpart doesn't recognize the drive.
I tried an Ubuntu machine: gparted doesn't see the drive, and lsblk doesn't see the drive. I thought to myself, unlucky, must be a bad drive.

I've gone through 4 Optane drives now all with the same problem.

Now my question is: is there a reason this failed every single time? And is there a way to revive the Optane drive?
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
Did you use the same USB to NVMe adapter for each of the 4 tries?

Have you tried NVMe, (or SMART), tools to see what they say about one of the drives?

You could try to "dd" "/dev/zero" over the drive and see if that clears the space.

Please list the specific version of TrueNAS you tried to install.
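
As a rough sketch of the dd suggestion above: the helper below overwrites the first few MiB of a target with zeros. The function name `zero_start` and the 16 MiB default are illustrative assumptions, not something from this thread. It is destructive, so the target needs to be checked carefully, and it can only work if the OS still exposes the drive as a disk device (for example the da0 named in the installer error).

```shell
# Hypothetical helper: overwrite the first N MiB of a target with zeros.
# DESTRUCTIVE: double-check the target before pointing it at a real disk.
zero_start() {
    target=$1          # e.g. /dev/da0 on FreeBSD, /dev/sdX on Linux
    mib=${2:-16}       # number of MiB to zero (the default is an arbitrary choice)
    # bs is given in bytes so the same line works with both BSD and GNU dd
    dd if=/dev/zero of="$target" bs=1048576 count="$mib" conv=notrunc 2>/dev/null
}
```

It would be run as root, for example `zero_start /dev/da0 16`. If dd reports "Device not configured" again, the problem is below the partition table and zeroing will not help.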
 

hixie

Cadet
Joined
Nov 29, 2023
Messages
7
Yes, the exact same drives used directly on a motherboard or via a PCI-e adapter worked perfectly before using a USB enclosure for the install. The failure rate at this moment for using a USB enclosure for the install is 100%. I've tried 4 different enclosures, and all have failed in exactly the same way now.

NVMe List:
Code:
Node                  SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------



lsblk does not show the NVMe drive

lspci shows 2:00.0 Non-Volatile memory controller: Intel Corporation NVMe Optane Memory Series

For smartctl, I just had to guess that the device was /dev/nvme0, which worked:
Code:
=== START OF INFORMATION SECTION ===
Model Number:                       INTEL MEMPEK1W016GA
Serial Number:                      **REDACTED**
Firmware Version:                   K3110310
PCI Vendor/Subsystem ID:            0x8086
IEEE OUI Identifier:                0x5cd2e4
Controller ID:                      0
NVMe Version:                       <1.2
Number of Namespaces:               0
Local Time is:                      Thu Nov 30 20:18:13 2023 HKT
Firmware Updates (0x02):            1 Slot
Optional Admin Commands (0x0006):   Format Frmw_DL
Optional NVM Commands (0x0046):     Wr_Unc DS_Mngmt Timestmp
Log Page Attributes (0x02):         Cmd_Eff_Lg
Maximum Data Transfer Size:         32 Pages

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     4.50W       -        -    0  0  0  0        0       0

=== START OF SMART DATA SECTION ===
Read NVMe SMART/Health Information failed: NVMe Status 0x4006


Googling the status 0x4006 hasn't given me any usable info yet
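
For what it's worth, the status value can be split into its fields by hand. The sketch below assumes the tools print the status in the layout the Linux NVMe driver uses: bits 0-7 are the status code, bits 8-10 the status code type, bit 13 "More" and bit 14 "Do Not Retry". The function name is made up for illustration.

```shell
# Hypothetical helper: split an NVMe status value into its fields.
# Assumed layout (as used by the Linux NVMe driver):
#   bits 0-7   status code (SC)
#   bits 8-10  status code type (SCT), 0 = generic command status
#   bit  13    More
#   bit  14    Do Not Retry (DNR)
decode_nvme_status() {
    status=$(( $1 ))
    sc=$(( status & 0xff ))
    sct=$(( (status >> 8) & 0x7 ))
    more=$(( (status >> 13) & 0x1 ))
    dnr=$(( (status >> 14) & 0x1 ))
    printf 'sct=%d sc=0x%02x more=%d dnr=%d\n' "$sct" "$sc" "$more" "$dnr"
}

decode_nvme_status 0x4006   # sct=0 sc=0x06 more=0 dnr=1
```

Under that assumption, 0x4006 is generic status code 06h (Internal Error in the NVMe specification) with the Do Not Retry bit set, which suggests the controller itself is reporting a fault rather than rejecting a malformed request.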

Installation is TrueNAS CORE 13.0-U5.3
 

hixie

Cadet
Joined
Nov 29, 2023
Messages
7
I've installed Intel MAS Tool to see if a firmware update would help
Code:
Bootloader : JFCB0289
Capacity : 0.00 MB (512 bytes)
DevicePath : /dev/nvme0
DeviceStatus : *BAD_CONTEXT_6001
Firmware : K3110310
FirmwareUpdateAvailable : Please contact Intel Customer Support for further assistance at the following website: http://www.intel.com/go/ssdsupport.
Index : 0
ModelNumber : INTEL MEMPEK1W016GA
NamespaceId : 4294967295
PercentOverProvisioned : 100.00
ProductFamily : Intel Optane(TM) Memory Series
SMARTEnabled : True
SectorDataSize : 512
SerialNumber : **REDACTED**


DeviceStatus "BAD_CONTEXT_6001" might be a clue

Trying to update the drive results in
Code:
Checking for firmware update...

- Intel Optane(TM) Memory Series **REDACTED** -

Status : Selected drive is in a disable logical state.
 

hixie

Cadet
Joined
Nov 29, 2023
Messages
7
After some Googling, I've found what seems like a pretty definitive answer:
"but the drive can be considered as completely defective"
I've tried firmware updates and a low-level format, but everything fails. I'll continue tinkering with the drives in my free time, but at this moment it's pretty clear that installing TrueNAS on an Optane NVMe drive via USB WILL kill the drive. I don't know if it's an Intel thing or a TrueNAS thing; I hope someone can look into it.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
I don't have anything else to suggest, except that I would also investigate the Lexar E300 NVMe to USB enclosure. Though I do find it strange that it appears to only happen with the TrueNAS CORE installer.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
We have several users using Optane M10 for various purposes, including some who have suggested it be used as a boot device, and I use a 32GB M10 myself in a self-built machine.

I did find a reference on a bulletin board suggesting that this problem might be specific to the first-generation Optane devices:

https://forum.radxa.com/t/guide-use-intel-optane-memory-h10-with-rock5b-pcie-splitting/11598/29

The previous owner assured me the drive was working well. Which means that first generation Optane MEMPEK1W016GA can probably only be used for caching, while the second generation (Intel Optane M10) could also work as a normal storage device. I ordered a MEMPEK1J016GA to test this.

(later post from the same user)

Intel Optane Memory M10 confirmed to work - access times are excellent.

I can't say for certain that this is correct, but it is a possible explanation. It would seem extremely odd that attempting to install an operating system/partition the device is sufficient to render it permanently inoperable, but if this is isolated to the specific device model with or without the USB enclosure, that bears investigation.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Yeah, no disk should self-immolate just because the host did something it did not like. I'd be very wary of that bridge you're using.
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
And no drive should convert itself to an 8MB block device if power is pulled at the wrong time, but yet the Intel 320 series exists. :wink:

I'm not ruling out either the USB device or the Optane itself being at fault. If I can find a spare one, I'll sacrifice it in the name of attempting to reproduce this fault.
 

hixie

Cadet
Joined
Nov 29, 2023
Messages
7
I don't have anything else to suggest, except that I would also investigate the Lexar E300 NVMe to USB enclosure. Though I do find it strange that it appears to only happen with the TrueNAS CORE installer.

I was curious to see if it's only the Lexar E300 NVMe to USB-C enclosure I was using, so I grabbed another 2 no-name enclosures to test. Another 2 16GB Optanes are dead. Both enclosures do use the RTL9210 chipset. I don't know if that may be of use to anyone.

We have several users using Optane M10 for various purposes, including some who have suggested it be used as a boot device, and I use a 32GB M10 myself in a self-built machine.

I did find a reference on a bulletin board suggesting that this problem might be specific to the first-generation Optane devices:

https://forum.radxa.com/t/guide-use-intel-optane-memory-h10-with-rock5b-pcie-splitting/11598/29



I can't say for certain that this is correct, but it is a possible explanation. It would seem extremely odd that attempting to install an operating system/partition the device is sufficient to render it permanently inoperable, but if this is isolated to the specific device model with or without the USB enclosure, that bears investigation.

I have (well, now had) quite a few 16GB Optanes, so I was willing to sacrifice a few to test. My M10s, although still cheap, are 10x the price of my 16GB Optanes. Maybe I will try one next week.

And no drive should convert itself to an 8MB block device if power is pulled at the wrong time, but yet the Intel 320 series exists. :wink:

I'm not ruling out either the USB device or the Optane itself being at fault. If I can find a spare one, I'll sacrifice it in the name of attempting to reproduce this fault.

Replicated this 6 times in total now. I am a little interested to test out an M10.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
The controller is still responding. Get the nvme-cli utils and pull the logs. You probably just need to create/attach a namespace.

NVMe-CLI
 

hixie

Cadet
Joined
Nov 29, 2023
Messages
7
The controller is still responding. Get the nvme-cli utils and pull the logs. You probably just need to create/attach a namespace.

NVMe-CLI

sudo nvme id-ns /dev/nvme0 -H
Code:
get-namespace-id: Inappropriate ioctl for device


sudo nvme smart-log /dev/nvme0 -H
Code:
NVMe status: INTERNAL: The command was not completed successfully due to an internal error(0x4006)


sudo nvme error-log /dev/nvme0
Code:
NVMe status: INTERNAL: The command was not completed successfully due to an internal error(0x4006)


sudo nvme sanitize /dev/nvme0
Code:
Invalid Sanitize Action


sudo nvme id-ctrl -H /dev/nvme0
Code:
NVME Identify Controller:
vid       : 0x8086
ssvid     : 0x8086
sn        : **REDACTED**   
mn        : INTEL MEMPEK1W016GA                     
fr        : K3110310
rab       : 0
ieee      : 5cd2e4
cmic      : 0
  [3:3] : 0     ANA not supported
  [2:2] : 0     PCI
  [1:1] : 0     Single Controller
  [0:0] : 0     Single Port

mdts      : 5
cntlid    : 0
ver       : 0
rtd3r     : 0
rtd3e     : 0
oaes      : 0
  [31:31] : 0   Discovery Log Change Notice Not Supported
  [27:27] : 0   Zone Descriptor Changed Notices Not Supported
  [15:15] : 0   Normal NSS Shutdown Event Not Supported
  [14:14] : 0   Endurance Group Event Aggregate Log Page Change Notice Not Supported
  [13:13] : 0   LBA Status Information Notices Not Supported
  [12:12] : 0   Predictable Latency Event Aggregate Log Change Notices Not Supported
  [11:11] : 0   Asymmetric Namespace Access Change Notices Not Supported
  [9:9] : 0     Firmware Activation Notices Not Supported
  [8:8] : 0     Namespace Attribute Changed Event Not Supported

ctratt    : 0
  [15:15] : 0   Extended LBA Formats Not Supported
  [14:14] : 0   Delete NVM Set Not Supported
  [13:13] : 0   Delete Endurance Group Not Supported
  [12:12] : 0   Variable Capacity Management Not Supported
  [11:11] : 0   Fixed Capacity Management Not Supported
  [10:10] : 0   Multi Domain Subsystem Not Supported
  [9:9] : 0     UUID List Not Supported
  [7:7] : 0     Namespace Granularity Not Supported
  [5:5] : 0     Predictable Latency Mode Not Supported
  [4:4] : 0     Endurance Groups Not Supported
  [3:3] : 0     Read Recovery Levels Not Supported
  [2:2] : 0     NVM Sets Not Supported
  [1:1] : 0     Non-Operational Power State Permissive Not Supported
  [0:0] : 0     128-bit Host Identifier Not Supported

rrls      : 0
cntrltype : 0
  [7:2] : 0     Reserved
  [1:0] : 0     Controller type not reported
fguid     :
crdt1     : 0
crdt2     : 0
crdt3     : 0
nvmsr     : 0
  [1:1] : 0     NVM subsystem Not part of an Enclosure
  [0:0] : 0     NVM subsystem Not part of an Storage Device

vwci      : 0
  [7:7] : 0     VPD Write Cycles Remaining field is Not valid.
  [6:0] : 0     VPD Write Cycles Remaining

mec       : 0
  [1:1] : 0     NVM subsystem Not contains a Management Endpoint on a PCIe port
  [0:0] : 0     NVM subsystem Not contains a Management Endpoint on an SMBus/I2C port

oacs      : 0x6
  [10:10] : 0   Lockdown Command and Feature Not Supported
  [9:9] : 0     Get LBA Status Capability Not Supported
  [8:8] : 0     Doorbell Buffer Config Not Supported
  [7:7] : 0     Virtualization Management Not Supported
  [6:6] : 0     NVMe-MI Send and Receive Not Supported
  [5:5] : 0     Directives Not Supported
  [4:4] : 0     Device Self-test Not Supported
  [3:3] : 0     NS Management and Attachment Not Supported
  [2:2] : 0x1   FW Commit and Download Supported
  [1:1] : 0x1   Format NVM Supported
  [0:0] : 0     Security Send and Receive Not Supported

acl       : 3
aerl      : 3
frmw      : 0x2
  [5:5] : 0     Multiple FW or Boot Update Detection Not Supported
  [4:4] : 0     Firmware Activate Without Reset Not Supported
  [3:1] : 0x1   Number of Firmware Slots
  [0:0] : 0     Firmware Slot 1 Read/Write

lpa       : 0x2
  [6:6] : 0     Telemetry Log Data Area 4 Not Supported
  [5:5] : 0     LID 0x0, Scope of each command in LID 0x5, 0x12, 0x13 Not Supported
  [4:4] : 0     Persistent Event log Not Supported
  [3:3] : 0     Telemetry host/controller initiated log page Not Supported
  [2:2] : 0     Extended data for Get Log Page Not Supported
  [1:1] : 0x1   Command Effects Log Page Supported
  [0:0] : 0     SMART/Health Log Page per NS Not Supported

elpe      : 63
npss      : 0
avscc     : 0
  [0:0] : 0     Admin Vendor Specific Commands uses Vendor Specific Format

apsta     : 0
  [0:0] : 0     Autonomous Power State Transitions Not Supported

wctemp    : 0
 [16:0] : -273 C (0 Kelvin)     Warning temperature (WCTEMP)

cctemp    : 0
 [16:0] : -273 C (0 Kelvin)     Critical temperature (CCTEMP)

mtfa      : 0
hmpre     : 0
hmmin     : 0
tnvmcap   : 0
unvmcap   : 0
rpmbs     : 0
 [31:24]: 0     Access Size
 [23:16]: 0     Total Size
  [5:3] : 0     Authentication Method
  [2:0] : 0     Number of RPMB Units

edstt     : 0
dsto      : 0
fwug      : 0
kas       : 0
hctma     : 0
  [0:0] : 0     Host Controlled Thermal Management Not Supported

mntmt     : 0
mxtmt     : 0
sanicap   : 0
  [31:30] : 0   Additional media modification after sanitize operation completes successfully is not defined
  [29:29] : 0   No-Deallocate After Sanitize bit in Sanitize command Supported
    [2:2] : 0   Overwrite Sanitize Operation Not Supported
    [1:1] : 0   Block Erase Sanitize Operation Not Supported
    [0:0] : 0   Crypto Erase Sanitize Operation Not Supported

hmminds   : 0
hmmaxd    : 0
nsetidmax : 0
endgidmax : 0
anatt     : 0
anacap    : 0
  [7:7] : 0     Non-zero group ID Not Supported
  [6:6] : 0     Group ID does not change
  [4:4] : 0     ANA Change state Not Supported
  [3:3] : 0     ANA Persistent Loss state Not Supported
  [2:2] : 0     ANA Inaccessible state Not Supported
  [1:1] : 0     ANA Non-optimized state Not Supported
  [0:0] : 0     ANA Optimized state Not Supported

anagrpmax : 0
nanagrpid : 0
pels      : 0
domainid  : 0
megcap    : 0
sqes      : 0x66
  [7:4] : 0x6   Max SQ Entry Size (64)
  [3:0] : 0x6   Min SQ Entry Size (64)

cqes      : 0x44
  [7:4] : 0x4   Max CQ Entry Size (16)
  [3:0] : 0x4   Min CQ Entry Size (16)

maxcmd    : 0
nn        : 0
oncs      : 0x46
  [8:8] : 0     Copy Not Supported
  [7:7] : 0     Verify Not Supported
  [6:6] : 0x1   Timestamp Supported
  [5:5] : 0     Reservations Not Supported
  [4:4] : 0     Save and Select Not Supported
  [3:3] : 0     Write Zeroes Not Supported
  [2:2] : 0x1   Data Set Management Supported
  [1:1] : 0x1   Write Uncorrectable Supported
  [0:0] : 0     Compare Not Supported

fuses     : 0
  [0:0] : 0     Fused Compare and Write Not Supported

fna       : 0x4
  [3:3] : 0     FormatNVM Broadcast NSID (FFFFFFFFh) Supported
  [2:2] : 0x1   Crypto Erase Supported as part of Secure Erase
  [1:1] : 0     Crypto Erase Applies to Single Namespace(s)
  [0:0] : 0     Format Applies to Single Namespace(s)

vwc       : 0
  [2:1] : 0     Support for the NSID field set to FFFFFFFFh is not indicated
  [0:0] : 0     Volatile Write Cache Not Present

awun      : 0
awupf     : 0
icsvscc     : 0
  [0:0] : 0     NVM Vendor Specific Commands uses Vendor Specific Format

nwpc      : 0
  [2:2] : 0     Permanent Write Protect Not Supported
  [1:1] : 0     Write Protect Until Power Supply Not Supported
  [0:0] : 0     No Write Protect and Write Protect Namespace Not Supported

acwu      : 0
ocfs      : 0
  [1:1] : 0     Controller Copy Format 1h Not Supported
  [0:0] : 0     Controller Copy Format 0h Not Supported

sgls      : 0
 [15:8] : 0     SGL Descriptor Threshold
 [1:0]  : 0     Scatter-Gather Lists Not Supported

mnan      : 0
maxdna    : 0
maxcna    : 0
subnqn    :
ioccsz    : 0
iorcsz    : 0
icdoff    : 0
fcatt     : 0
  [0:0] : 0     Dynamic Controller Model

msdbd     : 0
ofcs      : 0
  [0:0] : 0     Disconnect command Not Supported

ps    0 : mp:4.50W operational enlat:0 exlat:0 rrt:0 rrl:0
          rwt:0 rwl:0 idle_power:- active_power:-


sudo nvme sanitize-log -H /dev/nvme0
Code:
NVMe status: INVALID_NS: The namespace or the format of that namespace is invalid(0x400b)


I can't seem to get nvme-cli to do anything useful.
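
One thing the id-ctrl output above does show: oacs is 0x6 and bit 3 (NS Management and Attachment) is clear, so creating or attaching a namespace is not something this controller advertises. A small sketch of that bit check, with a made-up function name:

```shell
# Hypothetical helper: test bit 3 of the OACS field from `nvme id-ctrl`.
# Bit 3 set means the controller advertises Namespace Management/Attachment.
ns_mgmt_supported() {
    oacs=$(( $1 ))
    [ $(( (oacs >> 3) & 1 )) -eq 1 ]
}

if ns_mgmt_supported 0x6; then
    echo "namespace management supported"
else
    echo "namespace management not supported"   # this branch runs for 0x6
fi
```

The same check on bits 1 and 2 matches the "Format NVM Supported" and "FW Commit and Download Supported" lines, which are the only optional admin commands the drive reports.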

Sorry if I'm adding too many commands in the post; I'm just hoping it may help the next person who comes along with a similar problem.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
I'm using a 16 GB M10 Optane as boot drive without issues.
 

hixie

Cadet
Joined
Nov 29, 2023
Messages
7
I'm using a 16 GB M10 Optane as boot drive without issues.
These were Gen1 16GB Optane Memory modules. I do have a couple of M10s, and I'm debating whether I want to potentially sacrifice them for testing.
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
I would definitely kick that USB enclosure to the curb.
 

1326

Cadet
Joined
Dec 13, 2023
Messages
4
Just registered to share my data point: I too got a dead M10 16GB inside an RTL9210 enclosure (ORICO M.2 NVMe SSD Enclosure, B08G14NBCS) while running a SQLite database on it.

I moved it out of the enclosure and into my PC's M.2 slot. It was sometimes recognized by the BIOS, sometimes not.

I bought this M10 back in Sept 2019 and was using it as a cache for the game drive in my main PC until a few months ago, when I got a 4TB SSD. So this M10 worked flawlessly for more than 4 years as a cache, but died a few hours after it started working inside the enclosure.

Following are the dmesg messages I got, from plugging it in until it died:

Code:
[304834.818732] usb 4-1: new SuperSpeed USB device number 2 using xhci_hcd
[304834.846748] usb 4-1: New USB device found, idVendor=0bda, idProduct=9210, bcdDevice=20.01
[304834.846752] usb 4-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[304834.846753] usb 4-1: Product: RTL9210
[304834.846754] usb 4-1: Manufacturer: Realtek
[304834.846755] usb 4-1: SerialNumber: 012345678904
[304834.849146] usb 4-1: Enable of device-initiated U1 failed.
[304834.849395] usb 4-1: Enable of device-initiated U2 failed.
[304834.863170] usbcore: registered new interface driver usb-storage
[304834.873534] usb 4-1: Enable of device-initiated U1 failed.
[304834.873843] usb 4-1: Enable of device-initiated U2 failed.
[304834.875337] scsi host7: uas
[304834.875417] usbcore: registered new interface driver uas
[304834.878973] scsi 7:0:0:0: Direct-Access     Realtek  RTL9210 NVME     1.00 PQ: 0 ANSI: 6
[304834.904185] sd 7:0:0:0: Attached scsi generic sg13 type 0
[304834.909339] sd 7:0:0:0: [sdn] 28131328 512-byte logical blocks: (14.4 GB/13.4 GiB)
[304834.911012] sd 7:0:0:0: [sdn] Write Protect is off
[304834.911026] sd 7:0:0:0: [sdn] Mode Sense: 37 00 00 08
[304834.914394] sd 7:0:0:0: [sdn] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[304834.915572] sd 7:0:0:0: [sdn] Preferred minimum I/O size 512 bytes
[304834.915576] sd 7:0:0:0: [sdn] Optimal transfer size 33553920 bytes
[304834.928189] sd 7:0:0:0: [sdn] Attached SCSI disk
[305490.896299] EXT4-fs (sdn): mounted filesystem 7bba2db9-6722-4cb7-b652-4c57d5bcb0b7 with ordered data mode. Quota mode: none.
[319764.350271] sd 7:0:0:0: [sdn] tag#19 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=6s
[319764.350276] sd 7:0:0:0: [sdn] tag#19 Sense Key : Medium Error [current]
[319764.350277] sd 7:0:0:0: [sdn] tag#19 Add. Sense: Unrecovered read error
[319764.350279] sd 7:0:0:0: [sdn] tag#19 CDB: Write(10) 2a 00 00 70 c5 a8 00 00 08 00
[319764.350281] critical medium error, dev sdn, sector 7390632 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 2
[319764.350285] EXT4-fs warning (device sdn): ext4_end_bio:343: I/O error 7 writing to inode 13 starting block 923829)
[319764.350288] Buffer I/O error on device sdn, logical block 923829
[319789.453895] sd 7:0:0:0: [sdn] tag#21 uas_eh_abort_handler 0 uas-tag 6 inflight: CMD OUT
[319789.453902] sd 7:0:0:0: [sdn] tag#21 CDB: Write(10) 2a 00 00 70 c5 d0 00 00 18 00
[319789.453955] sd 7:0:0:0: [sdn] tag#20 uas_eh_abort_handler 0 uas-tag 5 inflight: CMD OUT
[319789.453958] sd 7:0:0:0: [sdn] tag#20 CDB: Write(10) 2a 00 00 70 c5 b8 00 00 08 00
[319789.453990] sd 7:0:0:0: [sdn] tag#16 uas_eh_abort_handler 0 uas-tag 8 inflight: CMD IN
[319789.453993] sd 7:0:0:0: [sdn] tag#16 CDB: Read(10) 28 00 00 7e 6f 18 00 00 08 00
[319789.454033] sd 7:0:0:0: [sdn] tag#15 uas_eh_abort_handler 0 uas-tag 7 inflight: CMD IN
[319789.454035] sd 7:0:0:0: [sdn] tag#15 CDB: Read(10) 28 00 00 7e 6e d8 00 00 08 00
[319789.454070] sd 7:0:0:0: [sdn] tag#14 uas_eh_abort_handler 0 uas-tag 3 inflight: CMD IN
[319789.454072] sd 7:0:0:0: [sdn] tag#14 CDB: Read(10) 28 00 00 7e 6e c0 00 00 08 00
[319789.454109] sd 7:0:0:0: [sdn] tag#13 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
[319789.454111] sd 7:0:0:0: [sdn] tag#13 CDB: Read(10) 28 00 00 7e 6e 98 00 00 08 00
[319789.454148] sd 7:0:0:0: [sdn] tag#12 uas_eh_abort_handler 0 uas-tag 2 inflight: CMD IN
[319789.454150] sd 7:0:0:0: [sdn] tag#12 CDB: Read(10) 28 00 00 7e 6e b0 00 00 08 00
[319789.473907] scsi host7: uas_eh_device_reset_handler start
[319789.602401] usb 4-1: reset SuperSpeed USB device number 2 using xhci_hcd
[319790.047744] usb 4-1: Enable of device-initiated U1 failed.
[319790.048028] usb 4-1: Enable of device-initiated U2 failed.
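
Since both reports in this thread involve an RTL9210 bridge, here is a small sketch for checking whether an enclosure uses one, based on the idVendor=0bda / idProduct=9210 pair in the log above. The function name is hypothetical, and other revisions of the bridge may report a different product ID.

```shell
# Hypothetical helper: read `lsusb` output on stdin and succeed if a device
# with the Realtek RTL9210 IDs seen in the log above (0bda:9210) is listed.
has_rtl9210() {
    grep -qi 'ID 0bda:9210'
}

# Typical use on Linux (lsusb comes from the usbutils package):
#   lsusb | has_rtl9210 && echo "RTL9210 bridge attached"
```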
 

rvassar

Guru
Joined
May 2, 2018
Messages
972
Failure is on write. You may be at the media wear limit. Can you pull any diags when it recognizes it? Realtek bridge might not allow it.
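
The "failure is on write" reading can be cross-checked against the first failing command in the log. Decoding the CDB bytes gives the same sector the kernel reports (7390632), and dividing that by 8 gives the ext4 logical block 923829, assuming 4 KiB filesystem blocks on 512-byte sectors. The helper below is an illustrative sketch, not a tool from this thread.

```shell
# Hypothetical helper: decode a 10-byte SCSI READ(10)/WRITE(10) CDB as
# printed by the kernel. Bytes 2-5 hold the big-endian LBA, bytes 7-8 the
# transfer length in logical blocks.
decode_rw10() {
    case $1 in
        28)    op=READ ;;
        2a|2A) op=WRITE ;;
        *)     op=UNKNOWN ;;
    esac
    lba=$(( (0x$3 << 24) | (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))
    blocks=$(( (0x$8 << 8) | 0x$9 ))
    printf 'op=%s lba=%d blocks=%d\n' "$op" "$lba" "$blocks"
}

decode_rw10 2a 00 00 70 c5 a8 00 00 08 00   # op=WRITE lba=7390632 blocks=8
```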
 

1326

Cadet
Joined
Dec 13, 2023
Messages
4
I made some attempts to connect to the drive. Strangely enough, if it was detected by the BIOS, Ubuntu won't detect it. However, if it was NOT detected by the BIOS, Ubuntu will at least detect it as a PCIe device, and Win10 also detects it (but still not working right).

I turned M.2/Optane Genie on, and it automatically turned on RAID mode. After a reboot, the Optane drive was detected by the BIOS, but Intel RST saw no disks, and Ubuntu didn't detect it.

IMG_8921.jpeg IMG_8923.jpeg

After I turned that Genie off and changed it back to AHCI mode, the BIOS no longer saw it, but Ubuntu did.

IMG_8925.jpeg

Code:
# Relevant dmesg when it was NOT detected by the BIOS
[    0.297789] pci 0000:02:00.0: [8086:2522] type 00 class 0x010802
[    0.297811] pci 0000:02:00.0: reg 0x10: [mem 0xa3210000-0xa3213fff 64bit]
[    0.297827] pci 0000:02:00.0: reg 0x20: [mem 0xa3200000-0xa320ffff 64bit]
[    0.297837] pci 0000:02:00.0: enabling Extended Tags
...
[    5.184316] nvme nvme0: pci function 0000:02:00.0
[    5.184327] nvme 0000:02:00.0: enabling device (0000 -> 0002)
...
[    7.775553] nvme nvme0: Device not ready; aborting initialisation
[    7.775570] nvme nvme0: Removing after probe failure status: -19

# lspci when it was NOT detected by the BIOS
root@ubuntu:/home/ubuntu# lspci
00:00.0 Host bridge: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers (rev 08)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 08)
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10)
00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port 9 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Device a305 (rev 10)
00:1f.3 Audio device: Intel Corporation Cannon Lake PCH cAVS (rev 10)
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
00:1f.5 Serial bus controller [0c80]: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10)
01:00.0 VGA compatible controller: NVIDIA Corporation Device 2504 (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 228e (rev a1)
02:00.0 Non-Volatile memory controller: Intel Corporation Device 2522

root@ubuntu:/home/ubuntu# lspci -s 02:00.0 -vvv
02:00.0 Non-Volatile memory controller: Intel Corporation Device 2522 (prog-if 02 [NVM Express])
    Subsystem: Intel Corporation Device 3802
    Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Interrupt: pin A routed to IRQ 16
    NUMA node: 0
    Region 0: Memory at a3210000 (64-bit, non-prefetchable) [size=16K]
    Region 4: Memory at a3200000 (64-bit, non-prefetchable) [size=64K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI-X: Enable- Count=9 Masked-
        Vector table: BAR=0 offset=00002000
        PBA: BAR=0 offset=00003000
    Capabilities: [60] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl:    Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta:    CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 8GT/s, Width x2, ASPM L1, Exit Latency L0s unlimited, L1 unlimited
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl:    ASPM Disabled; RCB 64 bytes Disabled- CommClk+
            ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed 8GT/s, Width x2, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
             EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
    Capabilities: [a0] MSI: Enable- Count=1/16 Maskable+ 64bit+
        Address: 0000000000000000  Data: 0000
        Masking: 00000000  Pending: 00000000
    Capabilities: [100 v1] Advanced Error Reporting
        UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap:    First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [150 v1] Virtual Channel
        Caps:    LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb:    Fixed- WRR32- WRR64- WRR128-
        Ctrl:    ArbSelect=Fixed
        Status:    InProgress-
        VC0:    Caps:    PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl:    Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
            Status:    NegoPending- InProgress-
    Capabilities: [180 v1] Power Budgeting <?>
    Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap:    MFVC- ACS-, Next Function: 0
        ARICtl:    MFVC- ACS-, Function Group: 0
    Capabilities: [2a0 v1] #19
    Capabilities: [2d0 v1] Latency Tolerance Reporting
        Max snoop latency: 3145728ns
        Max no snoop latency: 3145728ns
    Capabilities: [310 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
              PortCommonModeRestoreTime=100us PortTPowerOnTime=3100us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
               T_CommonMode=0us LTR1.2_Threshold=81920ns
        L1SubCtl2: T_PwrOn=3100us
    Kernel modules: nvme

# But Intel MAS did not detect it
root@ubuntu:/home/ubuntu# intelmas show -intelssd

No results



There was no `/dev/nvme*`, so `smartctl` didn't work.

I installed my Win10 drive in the other M.2 slot, with the setting under which the Optane drive was NOT detected by the BIOS.

Intel MAS still does not see it; only my system drive is in the pulldown list.
Screenshot 2023-12-14 214328.png


Intel(R) Rapid Storage Technology also didn't detect it, but I forgot to grab a screenshot.

But Device Manager saw it:

Screenshot 2023-12-14 213630.png

Screenshot 2023-12-14 213852.png Screenshot 2023-12-14 213905.png

Maybe it IS dead?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Just a few comments...

CORE (FreeBSD) uses the command `nvmecontrol`, where SCALE (Debian) uses the command `nvme`. While they do basically the same thing, the command-line structure is completely different.
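
To illustrate the difference, here is a minimal sketch that prints the SMART/health log command for each platform. The function name and the nvme0 controller name are assumptions for the example; the `nvmecontrol logpage -p 2` and `nvme smart-log` subcommands themselves are the standard ones on FreeBSD and Linux respectively.

```shell
# Hypothetical helper: print the SMART/health log command for a controller,
# given the OS name (as reported by `uname -s`) and a controller name.
smart_log_cmd() {
    os=$1      # "FreeBSD" (CORE) or "Linux" (SCALE)
    ctrl=$2    # e.g. nvme0
    case $os in
        FreeBSD) echo "nvmecontrol logpage -p 2 $ctrl" ;;   # log page 2 = SMART/health
        Linux)   echo "nvme smart-log /dev/$ctrl" ;;
        *)       echo "unsupported OS: $os" >&2; return 1 ;;
    esac
}
```

Calling `smart_log_cmd "$(uname -s)" nvme0` prints the command to run on the current system. Note that nvmecontrol takes the bare controller name, while nvme-cli takes the /dev path.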

I have just woken up and am rolling into work, so I have not done any investigation, but the questions I'd ask are:
1) Is the M.2 module PCIe or SATA?
2) Have you tried a USB 2.0 connection?

My initial thoughts are that the module is dead or the interface is in question when the BIOS does not recognize it, hence the USB 2.0 question.
Any smartmontools version below 7.4 has very minimal support for NVMe; however, the correct nvmecontrol/nvme command can do the same thing, but the BIOS needs to recognize the drive.

Maybe someone can come up with something more to look at and I will revisit this thread and maybe I can think of something.

And, we have two people with a similar problem BUT! We need to be careful here to make sure the advice given is properly directed.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
smartctl may want to address the device as /dev/nvd0 rather than /dev/nvme0. But the inconsistent detection does not bode well.
 