Shell (awk) script to show SMART attribute overview

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
I was going to post this in the "useful scripts" section, but I don't have permission to post there. Maybe it can be moved if it's deemed worthy.

I figured I'd start a new thread as this deviates a bit from the original thread. I shamelessly used this thread to get a starting point:

http://forums.freenas.org/index.php...-hdd-mobo-gpu-temperatures-on-freenas-8.2994/

I didn't really care about system load or CPU temperature, but I wanted to expand what's shown about SMART info. The 'view disks' section is nice, but it doesn't show any SMART detail. I wanted something to give me a brief overview of my disks, including certain SMART attributes I'm interested in. It's a bit of a kludge, as my awk skills are minimal, but it works.

Code:
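#!/bin/sh
# Walk the disks the kernel knows about, in reverse of the order sysctl lists them.
for i in $(sysctl -n kern.disks | awk '{for (i=NF; i!=0 ; i--) print $i }' )
do
  # Prepend the device name so awk sees it along with the smartctl output.
  # -v 7,hex48 prints the raw value of attribute 7 (Seek_Error_Rate) as 48-bit hex.
  (echo "DeviceLabel: $i" ; smartctl -v 7,hex48 -a "/dev/$i") | awk '
  /DeviceLabel:/{DevLabel=$2}
  /Temperature_Celsius/{DevTemp=$10}
  /Serial Number:/{DevSerNum=$3}
  # First 4 hex digits = seek errors, remaining 8 = total seeks.
  # The "0x" prefix relies on the awk shipped with FreeNAS reading hex strings as numbers.
  /Seek_Error_Rate/{SER1=("0x" substr($10,3,4));SER2=("0x" substr($10,7))}
  /Device Model:/{DevName=$3}
  /Power_On_Hours/{POH=$10}
  /Firmware Version:/{FIRMWARE=$3}
  /Reallocated_Sector/{ReAlloc=$10}
  /Reported_Uncorrect/{RepUnc=$10}
  /Current_Pending_Sector/{Pending=$10}
  /Offline_Uncorrectable/{OffUnc=$10}
  # Only print a line for devices that actually reported a temperature.
  END { if (DevTemp!="") printf "%s %s %s %s FW:%s ReAlloc:%d RepUnc:%d CurPend:%d OffUnc:%d SeekErrs:%d/%d POH:%s\n", DevLabel,DevTemp,DevSerNum,DevName,FIRMWARE,ReAlloc,RepUnc,Pending,OffUnc,SER1,SER2,POH; }'
done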
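In case the $10 looks like magic: in the smartctl attribute table the raw value is the tenth whitespace-separated column, which is what most of the patterns above grab. Here's a small sketch of just that step. The sample row is made up for illustration (not from one of my drives), and raw_value is only a name I picked:

```shell
#!/bin/sh
# Hypothetical helper: print the RAW_VALUE (10th field) of one named
# attribute from smartctl-style attribute text on stdin.
raw_value() {
  awk -v name="$1" '$2 == name { print $10 }'
}

# Made-up sample row in the usual column layout:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
sample="  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       8"

echo "$sample" | raw_value Reallocated_Sector_Ct
```

Matching on the exact name in column 2 is a bit stricter than the regex match the script uses, so it won't fire on header lines that happen to contain the same word.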


It's slightly Seagate-specific, as I wanted it to do the math for me on the seek error rate, as per this post:

http://forums.freenas.org/index.php...ard-drive-with-smart-errors.12396/#post-57928
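The script leans on awk turning a "0x..." string into a number, which some awk implementations may not do. A rough sketch of the same split with an explicit hex conversion is below. It assumes the same layout the script uses (first 4 hex digits are seek errors, the remaining 8 are total seeks); split_seek_raw and hex2dec are names I made up:

```shell
#!/bin/sh
# Hypothetical helper: split a 48-bit Seek_Error_Rate raw value, as printed
# by "smartctl -v 7,hex48", into "errors/total seeks".
split_seek_raw() {
  echo "$1" | awk '
    # Convert a hex string (without the 0x prefix) to a number, digit by digit.
    function hex2dec(h,    i, n, d) {
      h = tolower(h)
      n = 0
      for (i = 1; i <= length(h); i++) {
        d = index("0123456789abcdef", substr(h, i, 1)) - 1
        n = n * 16 + d
      }
      return n
    }
    {
      errs  = hex2dec(substr($1, 3, 4))   # upper 16 bits: seek errors
      seeks = hex2dec(substr($1, 7))      # lower 32 bits: total seeks
      printf "%d/%d\n", errs, seeks
    }'
}

split_seek_raw 0x000403d6a8ac   # prints 4/64399532
```

It's longer than the one-liner in the script, but it doesn't depend on how a particular awk parses hex strings.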

Here's the output merged from both of my NAS boxes:

Code:
ada3 30 xxxxxxxx ST3000DM001-1CH166 FW:CC27 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/11889515 POH:2197
ada4 30 xxxxxxxx ST3000DM001-1CH166 FW:CC29 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:1/54988652 POH:11319
ada5 30 xxxxxxxx ST3000DM001-1ER166 FW:CC43 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/13159 POH:49
ada6 30 xxxxxxxx ST3000DM001-1CH166 FW:CC29 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/111064233 POH:16948
da0 32 xxxxxxxx ST3000DM001-9YN166 FW:CC4H ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:1/129685748 POH:16937
da1 32 xxxxxxxx ST3000DM001-9YN166 FW:CC4H ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/144969052 POH:16937
da2 30 xxxxxxxx ST3000DM001-9YN166 FW:CC4H ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/142269575 POH:19996
da3 33 xxxxxxxx ST3000DM001-9YN166 FW:CC4H ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/149345261 POH:19997
da4 32 xxxxxxxx ST3000DM001-9YN166 FW:CC4H ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:2/122724724 POH:17649
da5 29 xxxxxxxx ST3000DM001-1CH166 FW:CC27 ReAlloc:8 RepUnc:2 CurPend:0 OffUnc:0 SeekErrs:3/76035425 POH:7680
da6 31 xxxxxxxx ST3000DM001-9YN166 FW:CC4H ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/135202798 POH:17643
da7 30 xxxxxxxx ST3000DM001-1CH166 FW:CC26 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/2155838 POH:185
ada0 25 xxxxxxxx ST3000DM001-1CH166 FW:CC27 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/15730584 POH:1595
ada1 25 xxxxxxxx ST3000DM001-1CH166 FW:CC46 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:4/64399532 POH:7867
ada2 26 xxxxxxxx ST3000DM001-1CH166 FW:CC47 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:1/66341677 POH:7759
ada3 26 xxxxxxxx ST3000DM001-1CH166 FW:CC47 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:11/68530628 POH:7764
ada4 26 xxxxxxxx ST3000DM001-1CH166 FW:CC27 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:1028/58718858 POH:8501
da1 25 xxxxxxxx ST3000DM001-1CH166 FW:CC24 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:1/171890156 POH:11120
da2 26 xxxxxxxx ST3000DM001-1CH166 FW:CC27 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:3/72767627 POH:8596
da3 25 xxxxxxxx ST3000DM001-1CH166 FW:CC27 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:63/63743669 POH:7684
da4 27 xxxxxxxx ST3000DM001-1CH166 FW:CC24 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:1/169728385 POH:11118
da5 25 xxxxxxxx ST3000DM001-1CH166 FW:CC47 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:5/75429183 POH:7736
da6 26 xxxxxxxx ST3000DM001-1CH166 FW:CC27 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:2/72385994 POH:8596
da7 27 xxxxxxxx ST3000DM001-1CH166 FW:CC27 ReAlloc:0 RepUnc:0 CurPend:0 OffUnc:0 SeekErrs:0/66934831 POH:8575

Bad drive I've replaced:

da0 29 xxxxxxxx ST3000DM001-9YN166 FW:CC4H ReAlloc:0 RepUnc:384 CurPend:0 OffUnc:0 SeekErrs:1/131736618 POH:16920


And yes, I have SMART monitoring set up and get emails when any of these attributes change, but sometimes I like an overview all at once in addition to the SMART warnings.

You can see one of the da5 drives has reallocated and reported-uncorrectable sectors. I'm keeping my eye on it. It's still got warranty left, so it might be going back.

You can also see one of the ada4 drives has a fair number of seek errors. Something happened when it was a fairly new drive, because I remember it being at 1023 seek errors out of only 500,000 or so. Now it's only got 1028 seek errors out of 58 million total seeks. Seems fine, as the seek errors are not going up significantly.

It would be nice if it would align to tab stops or something so it's nicer to look at, but I haven't figured out how to do that in awk.

Anyway, it's just something I run once in a while when I want a brief overview of my disks.
 

RobertT

Explorer
Joined
Sep 28, 2014
Messages
54
What I would probably do is change your printf so that the fields are fixed width.
So this:
Code:
printf "%s %s %s %s FW:%s ReAlloc:%d RepUnc:%d CurPend:%d OffUnc:%d SeekErrs:%d/%d POH:%s\n", DevLabel,DevTemp,DevSerNum,DevName,FIRMWARE,ReAlloc,RepUnc,Pending,OffUnc,SER1,SER2,POH; }'

would become something like:
Code:
printf "%4s %3s %10s %20s FW:%5s ReAlloc:%3d RepUnc:%3d CurPend:%3d OffUnc:%3d SeekErrs:%-5d/%10d POH:%6s\n", DevLabel,DevTemp,DevSerNum,DevName,FIRMWARE,ReAlloc,RepUnc,Pending,OffUnc,SER1,SER2,POH; }'
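A quick, self-contained way to see what the width specifiers do, without needing smartctl. The three-column input here is made up for the demo, and align_rows is just a name I picked:

```shell
#!/bin/sh
# Hypothetical demo: read "label temp realloc" lines on stdin and print
# them in fixed-width columns.
#   %-5s  label, left-justified and padded to 5 characters
#   %3s   temperature, right-justified in 3 characters
#   %4d   reallocated count, right-justified in 4 characters
align_rows() {
  awk '{ printf "%-5s %3s ReAlloc:%4d\n", $1, $2, $3 }'
}

printf "ada3 30 0\nda5 29 8\n" | align_rows
```

A plain number in the specifier right-justifies the field, and a leading minus left-justifies it. Values longer than the width are not truncated, so pick widths that fit the longest value you expect.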
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
Thanks, I'll give that a try. In the meantime I've been 'beating on' that bad drive with badblocks. Is it sad I find this kind of humorous?

Code:
da0 27 xxxxxxxx ST3000DM001-9YN166 FW:CC4H ReAlloc:8392 RepUnc:2471 CurPend:2664 OffUnc:2664 SeekErrs:1/131965842 POH:16943
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Even though this is an older post, I thought I would mention that when I pull a bad drive from my system, I run DBAN (Darik's Boot and Nuke) on it: the 3-pass DoD wipe with a verify between passes. This is my best effort to make sure any data has been made unrecoverable, especially since it was encrypted to begin with. After that, I usually retest it with smartctl in my Linux workstation to see how many bad or reallocated sectors there are. I find it interesting too, and I recently had one drive go from only 8 reallocated sectors when removed to 52,880 bad sectors when DBAN was done.
 

SMnasMAN

Contributor
Joined
Dec 2, 2018
Messages
177
This is a really cool and helpful script! Thanks!

I'm working on making some mods to it so that it will show the (few) SAS "smart" data points that are relevant to SAS drive health. Will post here when I'm done.

(IMO, most of the SAS data points really need to be graphed out to see if there are any bad-looking trends, e.g. using a tool like the Windows app HD Sentinel, the best HDD tool anywhere, btw.)
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
You might want to take a look at this script that was designed to look at SAS drives:

The output: http://pastebin.com/56LuTdKe

The script: http://pastebin.com/veDv2FfZ
 