Can TrueNAS report health of 2.5" SSD and NVMe SSD reliably?

Joined
Jan 14, 2023
Messages
38
Hello, I read somewhere that a Synology NAS can report the health of 2.5" SATA SSDs and HDDs via S.M.A.R.T. but cannot do so for NVMe SSDs, because this type of SSD implements S.M.A.R.T. differently (something like that; I don't recall exactly). Can TrueNAS report the health of all these types of storage reliably? I am considering consumer-grade SSDs such as the Samsung 970 EVO Plus, the 870 EVO 2.5" SATA, and the WD Black.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Code:
root@freenas[~]# smartctl -a /dev/nvme1
smartctl 7.2 2021-09-14 r5236 [FreeBSD 13.1-RELEASE-p2 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 970 EVO Plus 1TB
Serial Number:                      XXXXXXXXXXXXX
Firmware Version:                   2B2QEXM7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 1,000,204,886,016 [1.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      4
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,000,204,886,016 [1.00 TB]
Namespace 1 Utilization:            562,452,213,760 [562 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 57019d11cf
Local Time is:                      Wed Feb  8 07:22:13 2023 CET
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x03):         S/H_per_NS Cmd_Eff_Lg
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     85 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     7.80W       -        -    0  0  0  0        0       0
 1 +     6.00W       -        -    1  1  1  1        0       0
 2 +     3.40W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3      210    1200
 4 -   0.0100W       -        -    4  4  4  4     2000    8000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        43 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    13%
Data Units Read:                    24,289,521 [12.4 TB]
Data Units Written:                 379,864,201 [194 TB]
Host Read Commands:                 399,950,229
Host Write Commands:                7,319,218,764
Controller Busy Time:               140,941
Power Cycles:                       16
Power On Hours:                     21,655
Unsafe Shutdowns:                   9
Media and Data Integrity Errors:    0
Error Information Log Entries:      7
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               43 Celsius
Temperature Sensor 2:               53 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
No Errors Logged
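The `Critical Warning` value in the output above is a bit field rather than a plain number. As a rough sketch, assuming the bit meanings defined for the NVMe SMART / Health Information log page (the function and dictionary names here are made up for illustration), it could be decoded like this:

```python
# Bit meanings as defined for the NVMe SMART / Health Information log (Log 0x02).
# The names below are illustrative, not part of smartctl or TrueNAS.
CRITICAL_WARNING_BITS = {
    0: "available spare below threshold",
    1: "temperature outside threshold",
    2: "NVM subsystem reliability degraded",
    3: "media placed in read-only mode",
    4: "volatile memory backup device failed",
    5: "persistent memory region read-only or unreliable",
}


def decode_critical_warning(value):
    """Return the list of warning descriptions whose bit is set in value."""
    return [
        text
        for bit, text in CRITICAL_WARNING_BITS.items()
        if value & (1 << bit)
    ]


# 0x00 is the value reported for the drive above: no warning bits set.
print(decode_critical_warning(0x00))
# A hypothetical value with bits 0 and 2 set.
print(decode_critical_warning(0x05))
```

A value of `0x00`, as reported here, means none of these conditions is flagged.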
 

ChrisRJ

Wizard
Joined
Oct 23, 2020
Messages
1,919
Hello, I read somewhere that a Synology NAS can report the health of 2.5" SATA SSDs and HDDs via S.M.A.R.T. but cannot do so for NVMe SSDs, because this type of SSD implements S.M.A.R.T. differently (something like that; I don't recall exactly). Can TrueNAS report the health of all these types of storage reliably?
There is a wonderful script, maintained by @joeschmuck (link below), which I assume can do this. Joe may be able to comment on your specific question.


I am considering consumer-grade SSDs such as the Samsung 970 EVO Plus, the 870 EVO 2.5" SATA, and the WD Black.
Those are TLC and not QLC drives, right? The write endurance of QLC drives can be an issue, depending on the workload. What do you want to use those drives for?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Can TrueNAS report the health of all these types of storage reliably? I am considering consumer-grade SSDs such as the Samsung 970 EVO Plus, the 870 EVO 2.5" SATA, and the WD Black.
Unfortunately, NVMe drives provide very little data compared to a HDD or SATA SSD; this is not a TrueNAS thing. The typical predictors we are used to seeing with a hard drive are gone; the technology has changed. Below is a screenshot of a few NVMe drives from the script that Chris mentioned. I have no reason to think TrueNAS would not support the basic critical values (listed below).

Code:
"smart_status": {
    "passed": true
},
"nvme_smart_health_information_log": {
    "temperature": 42,
    "available_spare": 100
}

Which are (in order):
SMART Status
Temperature
Wear Level (I'm not positive TrueNAS reports this, but it should)

There is also a value called "Critical Warning" but I'm not sure if TrueNAS implements that, but the script does.

[Attached image: InkedScreenshot 2023-02-08 054933 (002).jpg]

(NOTE: Wear Level of 100 is like new)
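The handful of values listed above can be pulled out of smartctl's JSON output with a few lines of code. This is only a minimal sketch, not how TrueNAS or the script does it: the function name and the sample data are made up for illustration, and the `percentage_used` and `critical_warning` key names are an assumption, since they do not appear in the fragment quoted here.

```python
import json


def summarize_nvme_health(report):
    """Pick the few health fields discussed above out of a smartctl JSON report."""
    log = report.get("nvme_smart_health_information_log", {})
    return {
        "passed": report.get("smart_status", {}).get("passed"),
        "temperature_c": log.get("temperature"),
        "available_spare_pct": log.get("available_spare"),
        # Assumed key names, not shown in the fragment quoted in this post.
        "percentage_used": log.get("percentage_used"),
        "critical_warning": log.get("critical_warning"),
    }


# Hypothetical sample built only from the values quoted in this post.
SAMPLE = json.loads("""
{
  "smart_status": {"passed": true},
  "nvme_smart_health_information_log": {"temperature": 42, "available_spare": 100}
}
""")

summary = summarize_nvme_health(SAMPLE)
for name, value in summary.items():
    print(f"{name}: {value}")
```

On a live system the report would presumably come from something like `smartctl -a -j /dev/nvme0`, parsed with `json.loads`. Fields missing from the report come back as `None` instead of raising an error.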
I am considering consumer-grade SSDs such as the Samsung 970 EVO Plus, the 870 EVO 2.5" SATA, and the WD Black.
I would not recommend this type of device for a ZFS file system. @ChrisRJ is correct, endurance is an issue.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Depends on the usage. The 1 TB 970 EVO Plus is rated for 600 TBW or 5 years, and mine are performing great here as single mirror pools for VMs and jails.

As you can see from the output I posted above I have written slightly less than 200 TB in about 2.5 years.
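For anyone who wants to redo that arithmetic, here is a back-of-the-envelope sketch using the figures from the smartctl output earlier in the thread. It assumes that one NVMe data unit is 1000 blocks of 512 bytes, that the write rate stays constant, and that power-on hours cover the whole period; the variable names are just for illustration.

```python
# One NVMe "data unit" is reported in thousands of 512-byte blocks.
DATA_UNIT_BYTES = 512 * 1000

data_units_written = 379_864_201  # "Data Units Written" from the output above
power_on_hours = 21_655           # "Power On Hours" from the output above
rated_tbw = 600                   # rated endurance of the 1 TB model, in TB

tb_written = data_units_written * DATA_UNIT_BYTES / 1e12
years_powered = power_on_hours / (24 * 365)
tb_per_year = tb_written / years_powered

# Naive projection: assumes the same write rate continues unchanged.
years_left = (rated_tbw - tb_written) / tb_per_year

print(f"Written so far:   {tb_written:.1f} TB")
print(f"Powered on for:   {years_powered:.2f} years")
print(f"Write rate:       {tb_per_year:.1f} TB/year")
print(f"Years to 600 TBW: {years_left:.1f}")
```

At this rate the drive would reach its rated 600 TBW roughly five more years from now, so the rated figure is not an immediate concern for this workload.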
 
Joined
Jan 14, 2023
Messages
38
I don't plan to have the system on 24/7. Just turn it on whenever I need to use it.

Actually in practice, how useful are these data besides warning us that a drive is going to fail and needs to be replaced?

So, in terms of the availability of S.M.A.R.T data, NAS HDD offers the most, followed by 2.5" SSD and then NVMe SSD?
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Actually in practice, how useful are these data besides warning us that a drive is going to fail and needs to be replaced?
Well, warning that a device is about to fail seems pretty useful to me…
You'd do right to heed such warnings and promptly replace any affected drive. A failed SMART test is definitely grounds for an RMA.

So, in terms of the availability of S.M.A.R.T data, NAS HDD offers the most, followed by 2.5" SSD and then NVMe SSD?
Actually SATA HDDs, followed by SAS HDDs and SSDs. The issue with SSDs is that electronics tend to "go poof" while magnetic storage platters tend to grow defects progressively.
 
Joined
Jan 14, 2023
Messages
38
Well, warning that a device is about to fail seems pretty useful to me…
You'd do right to heed such warnings and promptly replace any affected drive. A failed SMART test is definitely grounds for an RMA.


Actually SATA HDDs, followed by SAS HDDs and SSDs. The issue with SSDs is that electronics tend to "go poof" while magnetic storage platters tend to grow defects progressively.

Thanks for the warning. I will not put HDD and SSD near each other. For my current desktop PC, I use only NVMe SSD. Working fine for 5+ years.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Actually in practice, how useful are these data besides warning us that a drive is going to fail and needs to be replaced?
Here's the thing (I sound like a politician): SMART is an "attempt" to warn the user that a failure will happen within a 24-hour period. But not all faults can be predicted. The drive motor electronics could fail at any instant, for example. As for SSD/NVMe, as @Etorix said, "poof", but we generally use Wear Level as that is the only reliable predictor. SMART is better than not having SMART. For HDDs there are a lot of indicators, but one could also go "poof" without warning.
I will not put HDD and SSD near each other.
That was funny, made me laugh.
 

Etorix

Wizard
Joined
Dec 30, 2020
Messages
2,134
Thanks for the warning. I will not put HDD and SSD near each other.
I meant that SSDs tend to die suddenly; I did not mean to imply that they explode or burst into flames and damage their surroundings.
My apologies for any confusion I may have caused.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
Joined
Jan 14, 2023
Messages
38
I meant that SSDs tend to die suddenly; I did not mean to imply that they explode or burst into flames and damage their surroundings.
My apologies for any confusion I may have caused.
Sorry, I misread your statement. I thought the magnet of the HDD could mess up the data stored in the SSD.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
Sorry, I misread your statement. I thought the magnet of the HDD could mess up the data stored in the SSD.
And I thought you were joking. I had no idea it was a serious response. Nope, I've never seen a HDD, SSD, or even NVMe go up in flames. I have seen a HDD fry the electrical connector of a "Hot Swap Drive Tray" when the power was on. That hard drive never spun up again. Melted connector. That is why I do not Hot Swap drives. And it was a Supermicro chassis.
 

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
And I thought you were joking. I had no idea it was a serious response. Nope, I've never seen a HDD, SSD, or even NVMe go up in flames. I have seen a HDD fry the electrical connector of a "Hot Swap Drive Tray" when the power was on. That hard drive never spun up again. Melted connector. That is why I do not Hot Swap drives. And it was a Supermicro chassis.
Was it a SAS or SATA drive?
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,994
SATA and surprised the hell out of me.

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194

Whattteva

Wizard
Joined
Mar 5, 2013
Messages
1,824
SATA and surprised the hell out of me.
Dang, sure as heck would surprise me too. I'm glad I've always chosen to power off the server before I swap drives around.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
@Ericloewe Looks like the regular updates generate way more write requests than the read requests I generate just by using the applications. And the pool has a 99-100% ARC hit rate in regular operation.
 
Joined
Jan 14, 2023
Messages
38
Hi, some Synology users mentioned that it is not good to use SSDs in a NAS because continuous writes can kill an SSD quickly. It also seems that Samsung 2.5" EVO SSDs do not play well with Synology NAS units, as people are complaining about the sudden death of their SSDs. I'm not sure about the cause. How is the situation with SSDs (2.5" SATA III, NVMe) on TrueNAS systems? If I don't run the NAS (TrueNAS) 24/7, is it fine to use SATA III 2.5" SSDs and NVMe SSDs?

I have been using NVMe SSDs for the past few years without issue. 20 years ago I had several bad experiences with HDDs suddenly making clicking sounds and dying.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
General-purpose NAS hasn't really been optimized for NVMe yet, and is still pretty much geared towards SATA interfaces. That being said, it's not the drive interface you should be concerned about, but the underlying SSD NAND technology. The reason EVO SSDs are dying is that they use TLC or QLC NAND, which has much, much poorer endurance than SLC or MLC. Running one on a desktop isn't a good basis for a reliability comparison, because that workload is vastly different from a NAS workload, which is much more write-heavy.
 