DenisInternet
Dabbler
- Joined
- Jun 14, 2022
- Messages
- 28
Hey folks, I have a failed NVMe drive and am waiting for the replacement to arrive. The pool is currently running degraded (slow, but it seems stable). Everything is backed up on a secondary RAID box.
While I wait for the replacement NVMe (lesson learned about keeping a spare on hand), I have been trying to figure out what went wrong with the drive by running smartctl -a /dev/nvme3n1. Could someone please help me understand the output below, and whether any of it points to the cause of the failure?
Thanks!
Code:
admin@truenas[~]$ sudo smartctl -a /dev/nvme3n1
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.1.74-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       WUS3BA176C7P3E3
Serial Number:                      A068DE2E
Firmware Version:                   R0112100
PCI Vendor/Subsystem ID:            0x1b96
IEEE OUI Identifier:                0x0014ee
Total NVM Capacity:                 7,681,501,126,656 [7.68 TB]
Unallocated NVM Capacity:           0
Controller ID:                      0
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          7,681,501,126,656 [7.68 TB]
Namespace 1 Formatted LBA Size:     4096
Namespace 1 IEEE EUI-64:            0014ee 81000aee24
Local Time is:                      Thu Mar 7 02:31:05 2024 UTC
Firmware Updates (0x19):            4 Slots, Slot 1 R/O, no Reset required
Optional Admin Commands (0x001f):   Security Format Frmw_DL NS_Mngmt Self_Test
Optional NVM Commands (0x005a):     Wr_Unc Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x03):         S/H_per_NS Cmd_Eff_Lg
Warning Comp. Temp. Threshold:      70 Celsius
Critical Comp. Temp. Threshold:     80 Celsius
Namespace 1 Features (0x02):        NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +   12.00W        -        -    0  0  0  0        0       0
 1 +   10.00W        -        -    0  0  0  0        0       0
 2 +    8.00W        -        -    0  0  0  0        0       0

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +    4096       0         0
 1 -     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
- media has been placed in read only mode
- volatile memory backup device has failed

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x18
Temperature:                        36 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    136,159,627 [69.7 TB]
Data Units Written:                 78,984,419 [40.4 TB]
Host Read Commands:                 738,510,235
Host Write Commands:                549,669,094
Controller Busy Time:               105,931
Power Cycles:                       3,813
Power On Hours:                     8,793
Unsafe Shutdowns:                   221
Media and Data Integrity Errors:    0
Error Information Log Entries:      28
Warning Comp. Temperature Time:     39399
Critical Comp. Temperature Time:    57
Temperature Sensor 1:               39 Celsius

Error Information (NVMe Log 0x01, 16 of 256 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message
  0         28     0  0x201c  0xc004  0x029            0     0     -  Invalid Field in Command

Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
Num  Test_Description  Status                       Power_on_Hours  Failing_LBA  NSID Seg SCT Code
 0   Extended          Completed without error                6572            -     -   -   -    -

admin@truenas[~]$ print
admin@truenas[~]$
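For anyone reading the output above, the Critical Warning value (0x18) is a bit field, and the two "FAILED!" reasons smartctl prints appear to be its decoded form. Below is a rough Python sketch of that decoding, along with the Data Units to TB conversion. The bit meanings follow my reading of the NVMe SMART/Health log definition; the function names and message strings are illustrative and not part of smartctl.

```python
# Sketch: decode the NVMe "Critical Warning" byte and convert Data Units to TB.
# Bit meanings follow the NVMe SMART/Health log page; the names and wording
# here are illustrative, not taken from smartctl itself.

CRITICAL_WARNING_BITS = {
    0: "available spare below threshold",
    1: "temperature outside threshold",
    2: "NVM subsystem reliability degraded",
    3: "media placed in read-only mode",
    4: "volatile memory backup device failed",
    5: "persistent memory region read-only or unreliable",
}


def decode_critical_warning(value: int) -> list[str]:
    """Return the description of every bit set in the Critical Warning byte."""
    return [
        msg for bit, msg in CRITICAL_WARNING_BITS.items() if value & (1 << bit)
    ]


def data_units_to_tb(units: int) -> float:
    """One NVMe data unit is 1000 blocks of 512 bytes (512,000 bytes)."""
    return units * 512_000 / 1e12


if __name__ == "__main__":
    # Values copied from the smartctl output in this post.
    for reason in decode_critical_warning(0x18):
        print("-", reason)
    print(f"read:    {data_units_to_tb(136_159_627):.1f} TB")
    print(f"written: {data_units_to_tb(78_984_419):.1f} TB")
```

With 0x18, bits 3 and 4 are set, which lines up with the read-only mode and volatile memory backup messages in the report, while Available Spare at 100% and Percentage Used at 0% suggest the flash itself is not worn out.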