Device: /dev/da0 [SAT], 1 Currently unreadable (pending) sectors

poldas

Contributor
Joined
Sep 18, 2012
Messages
104
Hi

FreeNAS FreeNAS-11.2-U7

DELL R720xd
H310 Mini flashed to 9211-8i IT
8x Intel SSD D3-S4510 1.92TB
Intel SSD DC P3700 - SLOG

smartd reports disk issue



Code:
Jan 16 19:16:08 freenas smartd[5716]: Device: /dev/da5 [SAT], 6 Currently unreadable (pending) sectors
Jan 16 19:16:08 freenas smartd[5716]: Device: /dev/da3 [SAT], 20 Currently unreadable (pending) sectors
Jan 16 19:16:08 freenas smartd[5716]: Device: /dev/da1 [SAT], 2 Currently unreadable (pending) sectors
Jan 16 19:16:08 freenas smartd[5716]: Device: /dev/da0 [SAT], 1 Currently unreadable (pending) sectors
Jan 16 19:46:08 freenas smartd[5716]: Device: /dev/da5 [SAT], 6 Currently unreadable (pending) sectors
Jan 16 19:46:08 freenas smartd[5716]: Device: /dev/da3 [SAT], 20 Currently unreadable (pending) sectors
Jan 16 19:46:08 freenas smartd[5716]: Device: /dev/da1 [SAT], 2 Currently unreadable (pending) sectors
Jan 16 19:46:08 freenas smartd[5716]: Device: /dev/da0 [SAT], 1 Currently unreadable (pending) sectors
Jan 16 20:16:09 freenas smartd[5716]: Device: /dev/da5 [SAT], 6 Currently unreadable (pending) sectors
Jan 16 20:16:09 freenas smartd[5716]: Device: /dev/da3 [SAT], 20 Currently unreadable (pending) sectors
Jan 16 20:16:09 freenas smartd[5716]: Device: /dev/da1 [SAT], 2 Currently unreadable (pending) sectors
Jan 16 20:16:09 freenas smartd[5716]: Device: /dev/da0 [SAT], 1 Currently unreadable (pending) sectors
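
One quick way to see whether these counts are growing or just being repeated every 30 minutes is to reduce the log to a first and last value per device. The following is only a rough sketch: the function name `pending_counts` is made up for illustration, and the regular expression assumes the exact smartd message wording shown in the log above.

```python
import re

# Assumption: smartd lines look exactly like the ones quoted in this post.
PENDING_RE = re.compile(
    r"Device: (?P<dev>/dev/\S+) \[SAT\], (?P<count>\d+) "
    r"Currently unreadable \(pending\) sectors"
)


def pending_counts(log_text):
    """Return {device: (first_count, last_count)} for every device in the log."""
    seen = {}
    for line in log_text.splitlines():
        match = PENDING_RE.search(line)
        if match is None:
            continue  # not a pending-sector message
        dev = match.group("dev")
        count = int(match.group("count"))
        first = seen[dev][0] if dev in seen else count
        seen[dev] = (first, count)
    return seen


# First and last batch of messages copied from the log above.
SAMPLE_LOG = """\
Jan 16 19:16:08 freenas smartd[5716]: Device: /dev/da5 [SAT], 6 Currently unreadable (pending) sectors
Jan 16 19:16:08 freenas smartd[5716]: Device: /dev/da3 [SAT], 20 Currently unreadable (pending) sectors
Jan 16 19:16:08 freenas smartd[5716]: Device: /dev/da1 [SAT], 2 Currently unreadable (pending) sectors
Jan 16 19:16:08 freenas smartd[5716]: Device: /dev/da0 [SAT], 1 Currently unreadable (pending) sectors
Jan 16 20:16:09 freenas smartd[5716]: Device: /dev/da5 [SAT], 6 Currently unreadable (pending) sectors
Jan 16 20:16:09 freenas smartd[5716]: Device: /dev/da3 [SAT], 20 Currently unreadable (pending) sectors
Jan 16 20:16:09 freenas smartd[5716]: Device: /dev/da1 [SAT], 2 Currently unreadable (pending) sectors
Jan 16 20:16:09 freenas smartd[5716]: Device: /dev/da0 [SAT], 1 Currently unreadable (pending) sectors
"""

if __name__ == "__main__":
    for dev, (first, last) in sorted(pending_counts(SAMPLE_LOG).items()):
        trend = "unchanged" if first == last else "changed"
        print(f"{dev}: first={first} last={last} ({trend})")
```

In the excerpt above, every device reports the same count in each 30-minute check, so within this window the values are stable rather than growing.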


I ran long SMART tests (smartctl -t long /dev/daX) on each disk, but they reported no errors (test results attached).

Can you check the SMART logs and confirm whether the disks are OK or not? If the disks are OK, why does smartd report a problem with them?

Thank you for your help
 

Attachments

  • smart_ada0.txt
    5.6 KB · Views: 305
  • smart_ada1.txt
    5.5 KB · Views: 285
  • smart_ada3.txt
    5.5 KB · Views: 317
  • smart_ada5.txt
    5.5 KB · Views: 285

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
something is very wrong. ensure you have a backup.
according to SMART, your drives are all almost exactly the same age, ~160 days old
your drives have pending sectors, that's, afaik, a huge red flag for impending drive failure.
my guess is your SSDs have reached their write limit and need to be replaced ASAP, or you have something bad with the controller/cables/backplane.
 

poldas

Contributor
Joined
Sep 18, 2012
Messages
104
I didn't use the storage a lot, only for some speed tests, so I'm pretty sure the disks haven't reached their write limit. Cables, backplane, controller: hard to say... Maybe the disks come from a faulty production batch. I asked Intel, and they said:

"send us the SMART log from the Intel® SSD Data Center Tool (Intel® SSD DCT). Our drives differ from other manufacturers and some values are not shown or misread"
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
blad produkcyjny [Polish: "production defect"]
uh, what?
didn't use the storage a lot
ya, that's why I said it was a guess. it'd be nice if the TBW stat could be IN the smart details. the spec for those drives seems to say 7PB, which is... loads, but since I don't know your workload I can't really tell.
I will reiterate the part about backups though, since I don't see any mention of them.
 

poldas

Contributor
Joined
Sep 18, 2012
Messages
104
The storage isn't used in production, only for testing, so a backup isn't necessary
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
The storage isn't used in production, only for testing, so a backup isn't necessary
ah, that's a whole different context then. let's burn it all down!
since it's so many drives, i would suspect common items, like the controller/backplane/cables, before the drives. can you make a temporary ad hoc setup that tests different parts of your storage setup? also, can you tell if the 4 drives happen to be connected to the same miniSAS port on the HBA, or are going through the same miniSAS cable?
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
You may want to read this - https://www.intel.com/content/www/u...7708/memory-and-storage/data-center-ssds.html

Current Pending Sector Count is attribute C5h.

The impact of the change is customers will see an increased value for SMART attributes 05h and C5h on their Intel SSD DC S4X00 Series products compared to the Intel SSD DC S3610 Series products. However, the increased SMART attribute C5h raw value is only advisory and isn't tied to a warranty. An increased C5h and/or 05h raw value doesn't mean the drive is bad or is going to fail, nor does it affect the endurance, performance or capacity of the drive. Seeing non-0 values in SMART C5h/05h even on an almost brand-new SSD is both normal and common.

This seems to be a feature of these SSDs, not an issue per se. Alerting systems need to adjust, and some day they probably will.
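
To illustrate what "adjusting" an alerting system might look like, here is a rough sketch that reads the raw values of attributes 05h (5) and C5h (197) from `smartctl -A` text and downgrades a non-zero C5h to an informational notice for models on an advisory list. Everything here is an assumption made for illustration: the function names, the `"S4510"` model tag, the three alert levels, and the sample table (its values are made up, apart from the count of 20 reported for /dev/da3). It is not how smartd or FreeNAS actually decide.

```python
def parse_raw_values(smartctl_output, wanted_ids=(5, 197)):
    """Extract raw values for the given attribute IDs from `smartctl -A` text."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the tenth column is RAW_VALUE.
        if len(fields) >= 10 and fields[0].isdigit():
            attr_id = int(fields[0])
            if attr_id in wanted_ids and fields[9].isdigit():
                values[attr_id] = int(fields[9])
    return values


def pending_alert_level(model, pending_raw, advisory_models):
    """Hypothetical policy: treat non-zero C5h as advisory on listed models."""
    if pending_raw == 0:
        return "ok"
    if any(tag in model for tag in advisory_models):
        return "info"  # advisory only, per the vendor note quoted above
    return "warning"


# Illustrative attribute table in the usual `smartctl -A` column layout.
SAMPLE_OUTPUT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       20
"""

if __name__ == "__main__":
    raw = parse_raw_values(SAMPLE_OUTPUT)
    level = pending_alert_level("Intel SSD D3-S4510 1.92TB", raw.get(197, 0), ("S4510",))
    print(raw, level)
```

A real policy would probably also want to compare the value against the previous reading, since a count that keeps climbing is a different situation from one that sits still.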
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
it seems weird that only 1/2 of the drives would do it though, if it's a part of the design.
 

AlexGG

Contributor
Joined
Dec 13, 2018
Messages
171
Yes, the question still stands, if there is any commonality between these four. Like a power rail/cabling, or controller port, or data cabling, or something else.
 
Joined
May 10, 2017
Messages
838
Intel said: "updated the firmware the problem has been fixed" and it is

Good for Intel. Crucial has a similar issue with the MX500, where it's constantly reporting 1 pending sector, and according to their support it's normal, not a firmware bug.
 