Running SMART tests on Samsung 970 EVO

Status
Not open for further replies.
Joined
Oct 18, 2018
Messages
969
Hey folks,

Apologies if there is already a thread on this; I was unable to find one in my searches. My basic issue is that I seem unable to run the short and long SMART tests on my Samsung 970 EVO drives, even though Samsung's website lists these drives as supporting SMART.

This is a test system I put together out of parts I had lying around plus a few purchases I plan to use for other things. I intend to get more familiar with FreeNAS on this system prior to using a more appropriate MOBO and adding more drives and vdevs.

Build FreeNAS-11.1-U6
Platform Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz
Memory 16196MB
Motherboard Asus ROG STRIX H370-I GAMING
Drives
2x Samsung SSD 970 EVO 250GB
1x Western Digital SSD WDC WDS120G2G0A-00JH30 120GB
2x Seagate Constellation DHH ST3000NM0033-9ZM178 3TB

Pools
Boot Pool
vdev: Mirrored 2x Samsung 970s
Boot Pool
vdev: WD SSD
Storage Pool
vdev: Mirrored 2x Seagate Drives

I have two boot pools because my first boot device, the WD SSD, reported the following error in the UI:
Code:
The boot volume state is DEGRADED: 
One or more devices has experienced an error resulting in data corruption. Applications may be affected.


I then reinstalled the same version of FreeNAS on the mirrored 2x Samsung 970s. This was a great bit of practice reimporting encrypted storage pools and importing configuration.

One of the drives in the Samsung pool lists the same error at boot time.

As part of my learning process I wanted to go through system burn-in, and this was a great opportunity. I rebooted the system and loaded the OS off the single WD SSD. This is where things got a bit strange. Put simply, I cannot get the short or long SMART tests to work on the Samsung drives. Samsung's site lists them as having SMART support, but the output of smartctl -i <device> does not mention it.

Here is how I tried to run the tests:

Code:
$ geom disk list
Geom name: nvd0
Providers:
1. Name: nvd0
   Mediasize: 250059350016 (233G)
<truncated>
Geom name: nvd1
Providers:
1. Name: nvd1
   Mediasize: 250059350016 (233G)
<truncated>
Geom name: ada0
Providers:
1. Name: ada0
   Mediasize: 120040980480 (112G)
<truncated>
Geom name: ada1
Providers:
1. Name: ada1
   Mediasize: 3000592982016 (2.7T)
<truncated>
Geom name: ada2
Providers:
1. Name: ada2
   Mediasize: 3000592982016 (2.7T)
<truncated>


$ sudo smartctl -t short /dev/nvd0
Password:
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/nvd0: Unable to detect device type
Please specify device type with the -d option.


Use smartctl -h to get a usage summary

smartctl does not seem to support nvd devices. If I use -d nvme, though, I get a hint:

Code:
$ sudo smartctl -t short -d nvme /dev/nvd0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

Smartctl open device: /dev/nvd0 failed: NVMe controller controller/namespace ids must begin with '/dev/nvme'


I looked and, sure enough, there are /dev/nvme0 and /dev/nvme1 devices, even though geom lists these drives as nvd. I used smartctl -i /dev/nvme0 to confirm that this is the device I expect, and I do see Model Number: Samsung SSD 970 EVO 250GB

Okay, great, so I tried the tests.

Code:
$ sudo smartctl -t short /dev/nvme0
NVMe device successfully opened

Use 'smartctl -a' (or '-x') to print SMART (and more) information


This output is markedly different from what I see if I run the same command on, say, one of the SATA SSDs or HDDs.

Code:
$ sudo smartctl -t short /dev/ada0
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 2 minutes for test to complete.
Test will complete after Wed Nov  7 20:54:13 2018

Use smartctl -X to abort test.


Further, I cannot find any test results for either the short or long tests on either NVMe drive.

So, how do I run SMART tests on these drives? And why does geom report them as nvd0 and nvd1 while smartctl can only display information about nvme0 and nvme1?

I truncated some of the output for readability; I am happy to provide fuller logs if they would be useful.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
You don't run SMART tests on nvd devices. nvd devices are NVMe namespaces, of which an NVMe drive has one or more. You run SMART on nvme devices.

How do you figure out which nvd device belongs to which nvme device? Good question, be sure to tell me if you figure that one out.
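
One possible way to answer that question, offered as an untested sketch rather than a confirmed method: match serial numbers. It assumes that the full (untruncated) `geom disk list` output carries an `ident:` line per disk holding the drive's serial number, and that `smartctl -i /dev/nvmeN` prints a `Serial Number:` line, as it does in the output later in this thread. The function names are hypothetical, and the functions only parse text you have already captured.

```python
import re


def disk_idents(geom_output):
    """Map GEOM disk name -> ident string from `geom disk list` text."""
    idents = {}
    current = None
    for raw in geom_output.splitlines():
        line = raw.strip()
        name = re.match(r"Geom name:\s*(\S+)", line)
        if name:
            current = name.group(1)
            continue
        ident = re.match(r"ident:\s*(\S+)", line)
        if ident and current:
            idents[current] = ident.group(1)
    return idents


def controller_serial(smartctl_info):
    """Extract the serial number from `smartctl -i` text, or None."""
    found = re.search(r"^Serial Number:\s*(\S+)", smartctl_info, re.MULTILINE)
    return found.group(1) if found else None


def match_namespaces(geom_output, controllers):
    """controllers maps a controller name (e.g. "nvme0") to its
    `smartctl -i` text. Returns {disk name: controller name} for every
    disk whose ident equals a controller's serial number."""
    by_serial = {controller_serial(text): name
                 for name, text in controllers.items()}
    return {disk: by_serial[ident]
            for disk, ident in disk_idents(geom_output).items()
            if ident in by_serial}
```

Feed it the saved output of `geom disk list` and of `smartctl -i` for each /dev/nvmeN. Disks with no matching serial (the ada devices, for instance) are simply left out of the result.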
 
Joined
Oct 18, 2018
Messages
969
Thanks for the info. I did try to run the SMART tests on /dev/nvme0 and /dev/nvme1 as shown above, and it appears that no tests ran.

Code:
$ sudo smartctl -t short /dev/nvme0
NVMe device successfully opened

Use 'smartctl -a' (or '-x') to print SMART (and more) information
 
Joined
May 10, 2017
Messages
838
AFAIK NVMe devices don't run SMART tests. If you list all the SMART info, there's no place for the test results, unlike with normal ATA devices.
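
As a point of reference, NVMe 1.3 defines an optional Device Self-test command, and a controller advertises it through bit 4 of its Optional Admin Command Support (OACS) field. smartctl prints that field as the hex value on the "Optional Admin Commands" line. The sketch below only decodes that bit from captured smartctl text; the function name is hypothetical. It says nothing about whether a given smartctl build can start such a test (the 6.6 output in this thread suggests it cannot).

```python
import re

# OACS bit 4: Device Self-test command supported (NVMe 1.3 and later).
DEVICE_SELF_TEST_BIT = 0x0010


def supports_device_self_test(smartctl_output):
    """Return True or False from the OACS bitmask printed by smartctl,
    or None if the "Optional Admin Commands" line is not present."""
    found = re.search(r"Optional Admin Commands \((0x[0-9a-fA-F]+)\)",
                      smartctl_output)
    if found is None:
        return None
    return bool(int(found.group(1), 16) & DEVICE_SELF_TEST_BIT)
```

A value of 0x0007 means the controller does not advertise the self-test command at all, so no tool could start one on that drive.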
 
Joined
Oct 18, 2018
Messages
969
Interesting. So when Samsung's site lists the devices as supporting SMART, perhaps they simply mean SMART reporting and not SMART tests?
 
Joined
May 10, 2017
Messages
838
Yes, SMART isn't just the tests. SMART for NVMe devices is completely different from SMART for ATA devices. For example, this is the SMART output of one of my devices; your 970 EVO should look similar.

Code:
smartctl -x /dev/nvme0
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.15-unRAID] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       TOSHIBA-RD400
Serial Number:                      664S107XTPGV
Firmware Version:                   57CZ4102
PCI Vendor/Subsystem ID:            0x1b85
IEEE OUI Identifier:                0xe83a97
Controller ID:                      0
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512,110,190,592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            e83a97 02000018f5
Local Time is:                      Fri Nov  9 17:06:14 2018 GMT
Firmware Updates (0x02):            1 Slot
Optional Admin Commands (0x0007):   Security Format Frmw_DL
Optional NVM Commands (0x000e):     Wr_Unc DS_Mngmt Wr_Zero
Warning  Comp. Temp. Threshold:     78 Celsius
Critical Comp. Temp. Threshold:     82 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     6.00W       -        -    0  0  0  0        0       0
 1 +     2.40W       -        -    1  1  1  1        0       0
 2 +     1.90W       -        -    2  2  2  2        0       0
 3 -   0.1600W       -        -    3  3  3  3     1000    1000
 4 -   0.0120W       -        -    4  4  4  4     5000   35000
 5 -   0.0060W       -        -    5  5  5  5   100000  110000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        34 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    30%
Data Units Read:                    76,603,472 [39.2 TB]
Data Units Written:                 244,232,117 [125 TB]
Host Read Commands:                 1,766,703,957
Host Write Commands:                3,175,814,170
Controller Busy Time:               13,282
Power Cycles:                       105
Power On Hours:                     4,665
Unsafe Shutdowns:                   25
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               34 Celsius

Error Information (NVMe Log 0x01, max 128 entries)
No Errors Logged
 
Joined
Oct 18, 2018
Messages
969
That is very similar to what I'm seeing, thank you. I suppose I'll now do some research into what exactly is degrading my boot pool of two NVMe drives. I was hoping SMART tests would reveal some information.
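
A natural starting point for that research is `zpool status -v`, which reports per-device state and read/write/checksum error counters. As a rough sketch, the helper below picks out the rows of that table that are not ONLINE or that have a non-zero counter. It assumes the usual NAME / STATE / READ / WRITE / CKSUM column layout; the function name is hypothetical, and no pool output from this system was posted, so none is reproduced here.

```python
def devices_with_errors(zpool_status):
    """Return (name, state, read, write, cksum) tuples for every row of
    the config table that is not ONLINE or has a non-zero counter."""
    flagged = []
    in_table = False
    for line in zpool_status.splitlines():
        parts = line.split()
        if not in_table:
            # The table starts right after the NAME/STATE header row.
            in_table = parts[:2] == ["NAME", "STATE"]
            continue
        if not parts:
            break  # a blank line ends the config table
        if len(parts) < 5:
            continue  # rows without counters, e.g. section labels
        name, state, read, write, cksum = parts[:5]
        if state != "ONLINE" or any(c != "0" for c in (read, write, cksum)):
            flagged.append((name, state, read, write, cksum))
    return flagged
```

Counters are compared as strings because zpool abbreviates large values (for example with a K suffix). Any row the helper returns is worth a closer look.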
 