I think my boot SSD is dying

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
WTF. This ain't right. It's an SSD! It's not allowed to die! I'll run some smaaahts and see what they say. A power-cycle got it back, for now.

[Attached screenshot: 1594731542338.png]
 
Joined
Jul 2, 2019
Messages
648
For giggles, did you check the power and data cable connections? Could be a wonky connection due to heat, gravity, vibration... (Or bad planet alignment, stray neutrino, forgot to pray to the silicon gods, etc. :cool: )
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
"Did you check the power and data cable connections?"

Not yet. It's an M.2, so I really don't expect that anything could have jiggled loose. But sure, why not: I can pull it and reseat it, just because.
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
SMART isn't being super helpful. It shows "Error Information Log Entries", but those could be literally anything. No spares used; manufacturer drive life is at 1% used, 99% remaining. The drive runs cool and didn't log heat issues at any point.
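
As a rough sanity check on that 1% figure, here is a minimal Python sketch that converts the NVMe "Data Units" counters into bytes and full-drive writes. It assumes the NVMe convention that one data unit is 1000 blocks of 512 bytes; the numbers are copied from the smartctl output in this post, and the function names are made up for illustration.

```python
# Figures copied from the smartctl output in this post.
DATA_UNITS_READ = 748_279
DATA_UNITS_WRITTEN = 567_665
CAPACITY_BYTES = 128_035_676_160
POWER_ON_HOURS = 12_816

# Assumption: NVMe reports data units in thousands of 512-byte blocks.
BYTES_PER_DATA_UNIT = 1000 * 512


def data_units_to_bytes(units: int) -> int:
    """Convert an NVMe data-unit counter to bytes."""
    return units * BYTES_PER_DATA_UNIT


def full_drive_writes(units_written: int, capacity_bytes: int) -> float:
    """How many times the whole capacity has been written, host-side."""
    return data_units_to_bytes(units_written) / capacity_bytes


if __name__ == "__main__":
    read_gb = data_units_to_bytes(DATA_UNITS_READ) / 1e9
    written_gb = data_units_to_bytes(DATA_UNITS_WRITTEN) / 1e9
    print(f"read:    {read_gb:.1f} GB")
    print(f"written: {written_gb:.1f} GB")
    print(f"full-drive writes: "
          f"{full_drive_writes(DATA_UNITS_WRITTEN, CAPACITY_BYTES):.2f}")
    print(f"powered on: {POWER_ON_HOURS / 24:.0f} days")
```

A bit over two full-drive writes in roughly a year and a half of power-on time is light use, which fits the low wear figure. It says nothing about controller or firmware problems, though.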

I used https://media.kingston.com/support/downloads/MKP_521.6_SMART-DCP1000_attribute.pdf to interpret these SMART data fields.

Code:
freenas# smartctl -a /dev/nvme0
smartctl 7.1 2019-12-30 r5022 [FreeBSD 12.1-STABLE amd64] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Patriot Scorch M2
Serial Number:                      0xdeadbeef-don't-register-my-drive-bruh
Firmware Version:                   E8FM11.5
PCI Vendor/Subsystem ID:            0x1987
IEEE OUI Identifier:                0x6479a7
Total NVM Capacity:                 128,035,676,160 [128 GB]
Unallocated NVM Capacity:           0
Controller ID:                      0
Number of Namespaces:               1
Namespace 1 Size/Capacity:          128,035,676,160 [128 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            6479a7 115327fde8
Local Time is:                      Tue Jul 14 09:25:08 2020 EDT
Firmware Updates (0x02):            1 Slot
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x001e):     Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     84 Celsius
Critical Comp. Temp. Threshold:     88 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.00W       -        -    0  0  0  0        0       0
 1 +     2.00W       -        -    1  1  1  1        0       0
 2 +     2.00W       -        -    2  2  2  2        0       0
 3 -   0.1000W       -        -    3  3  3  3     1000    1000
 4 -   0.0050W       -        -    4  4  4  4   400000   90000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         1
 1 -    4096       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        22 Celsius
Available Spare:                    100%
Available Spare Threshold:          50%
Percentage Used:                    1%
Data Units Read:                    748,279 [383 GB]
Data Units Written:                 567,665 [290 GB]
Host Read Commands:                 22,466,116
Host Write Commands:                15,176,756
Controller Busy Time:               331
Power Cycles:                       41
Power On Hours:                     12,816
Unsafe Shutdowns:                   28
Media and Data Integrity Errors:    0
Error Information Log Entries:      462
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 2:               22 Celsius

Error Information (NVMe Log 0x01, max 16 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0        462     0  0x0005  0x0005      -   4009754624     0     -
  1        461     0  0x0004  0x0005      -   4009755904     0     -
  2        460     0  0x0005  0x0005      -   4009754624     0     -
  3        459     0  0x0004  0x0005      -   4009755904     0     -
  4        458     0  0x0005  0x0005      -   4009754624     0     -
  5        457     0  0x0004  0x0005      -   4009755904     0     -
  6        456     0  0x0005  0x0005      -   4009754624     0     -
  7        455     0  0x0004  0x0005      -   4009755904     0     -
  8        454     0  0x0005  0x0005      -   4009754624     0     -
  9        453     0  0x0004  0x0005      -   4009755904     0     -
 10        452     0  0x0005  0x0005      -   4009754624     0     -
 11        451     0  0x0004  0x0005      -   4009755904     0     -
 12        450     0  0x0005  0x0005      -   4009754624     0     -
 13        449     0  0x0004  0x0005      -   4009755904     0     -
 14        448     0  0x0005  0x0005      -   4009754624     0     -
 15        447     0  0x0004  0x0005      -   4009755904     0     -
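
For anyone who wants to poke at that error table, here is a minimal Python sketch that groups the entries by LBA and checks whether each logged LBA even fits inside the namespace. Only the first four rows are pasted in to keep it short. The function names are made up for illustration, and the sector count assumes the 512-byte LBA format shown above.

```python
from collections import Counter

# First four rows of the error log above, pasted verbatim.
ERROR_LOG = """\
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0        462     0  0x0005  0x0005      -   4009754624     0     -
  1        461     0  0x0004  0x0005      -   4009755904     0     -
  2        460     0  0x0005  0x0005      -   4009754624     0     -
  3        459     0  0x0004  0x0005      -   4009755904     0     -
"""

# Namespace size from the output above, in 512-byte sectors.
NAMESPACE_SECTORS = 128_035_676_160 // 512


def parse_error_log(text: str) -> list[dict]:
    """Turn the nine-column table into a list of dicts, skipping the header."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 9 or not fields[0].isdigit():
            continue
        rows.append({
            "num": int(fields[0]),
            "err_count": int(fields[1]),
            "sq_id": int(fields[2]),
            "cmd_id": fields[3],
            "status": fields[4],
            "lba": int(fields[6]),
        })
    return rows


def lba_counts(rows: list[dict]) -> Counter:
    """Count how many log entries reference each LBA."""
    return Counter(row["lba"] for row in rows)


def lbas_out_of_range(rows: list[dict], sectors: int) -> set[int]:
    """Return logged LBAs that lie beyond the end of the namespace."""
    return {row["lba"] for row in rows if row["lba"] >= sectors}


if __name__ == "__main__":
    rows = parse_error_log(ERROR_LOG)
    for lba, count in lba_counts(rows).items():
        print(f"LBA {lba}: {count} entries")
    print("beyond namespace end:", sorted(lbas_out_of_range(rows, NAMESPACE_SECTORS)))
    # In NVMe, submission queue 0 is the admin queue.
    print("all on SQId 0:", all(row["sq_id"] == 0 for row in rows))
```

On the pasted rows, both LBAs are larger than the namespace's sector count and every entry is on SQId 0, the admin queue. So these entries may not point at real media locations at all. That is only a hint, not a diagnosis.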
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
zpool status is clean, and the scrub doesn't find any issues either. Hmm. I may just watch it for now and see. I do wish I knew what to make of the increasing error count on those two LBAs.


Code:
  pool: freenas-boot
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 0 days 00:00:07 with 0 errors on Wed Jul  8 03:45:07 2020
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          nvd0p2      ONLINE       0     0     0

errors: No known data errors
 

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
Patriot didn't even want to troubleshoot and just issued an RMA. I guess that answers that: It's going bad.

I'm going for an Optane 16GB on the theory that I can boot from it. Wish me success :).
 
Joined
Jul 2, 2019
Messages
648

Attachments

  • praying-to-the-computer-gods.jpg

Yorick

Wizard
Joined
Nov 4, 2018
Messages
1,912
I sacrificed a SAS-to-SAS breakout cable, and the computer gods were kind. Intel Optane 16GB works just fine as a boot drive.
 
Joined
Jul 2, 2019
Messages
648
@Yorick - You did complete this by sacrificing something to the computer gods, right? If not, you need to do this quickly! :tongue:
 