smartctl does not report any SMART attributes

Status
Not open for further replies.

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Hello!

I'm posting in Offtopic, because I'm not sure if this is actually a FreeNAS/FreeBSD related issue.

While replacing some of my disks I noticed that no SMART information seems to be available for one of my drives, a Samsung SpinPoint F4 HD204UI. Its manufacturing date is 2011.02, so it should not be affected by the firmware bug for which the warning is issued. It also definitely has SMART capabilities.

Right now my pool is resilvering, but when it's done I could try again with FreeNAS-9.1-RC1 or some Linux OS.

I'm currently running FreeNAS-8.3.1-p2-x64.

Some output:
Code:
# smartctl -a -q noserial /dev/ada0
smartctl 5.43 2012-06-30 r3573 [FreeBSD 8.3-RELEASE-p7 amd64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 
=== START OF INFORMATION SECTION ===
Model Family:     SAMSUNG SpinPoint F4 EG (AFT)
Device Model:     SAMSUNG HD204UI
Firmware Version: 09570115
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Jul 23 18:55:41 2013 CEST
 
==> WARNING: Using smartmontools or hdparm with this
drive may result in data loss due to a firmware bug.
****** THIS DRIVE MAY OR MAY NOT BE AFFECTED! ******
Buggy and fixed firmware report same version number!
See the following web pages for details:
http://knowledge.seagate.com/articles/en_US/FAQ/223571en
http://sourceforge.net/apps/trac/smartmontools/wiki/SamsungF4EGBadBlocks
 
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
 
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
 
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x00)         Offline data collection not supported.
SMART capabilities:            (0x0000) Automatic saving of SMART data
                                        is not implemented.
Error logging capability:        (0x00) Error logging NOT supported.
                                        No General Purpose Logging support.
 
SMART Error Log not supported
SMART Self-test Log not supported
Device does not support Selective Self Tests/Logging


Code:
# smartctl -t short /dev/ada0
smartctl 5.43 2012-06-30 r3573 [FreeBSD 8.3-RELEASE-p7 amd64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
 
=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Warning: device does not support Self-Test functions.
 
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
(pass0:ahcich0:0:0:0): SMART. ACB: b0 d4 01 4f c2 40 00 00 00 00 00 00
(pass0:ahcich0:0:0:0): CAM status: ATA Status Error
(pass0:ahcich0:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
(pass0:ahcich0:0:0:0): RES: 51 04 01 4f c2 40 00 00 00 00 00
Command "Execute SMART Short self-test routine immediately in off-line mode" failed: No error: 0
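For anyone trying to read the CAM lines above, here is a minimal Python sketch that decodes the ATA status and error register bytes into the abbreviations shown in that output. The bit tables reflect the standard ATA register layout as I understand it, and the names `STATUS_BITS`, `ERROR_BITS` and `decode_bits` are made up for this illustration.

```python
# Bit masks for the ATA status and error registers.
# Assumption: labels follow the abbreviations FreeBSD prints in the CAM
# output above (DRDY, SERV, ERR, ABRT); the other names are the usual ones.
STATUS_BITS = [(0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "SERV"),
               (0x08, "DRQ"), (0x04, "CORR"), (0x02, "INDEX"), (0x01, "ERR")]
ERROR_BITS = [(0x80, "ICRC"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
              (0x08, "MCR"), (0x04, "ABRT"), (0x02, "NM"), (0x01, "ILI")]


def decode_bits(value, table):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in table if value & mask]


if __name__ == "__main__":
    # Values taken from the "ATA status: 51 ... error: 04" line above.
    print(decode_bits(0x51, STATUS_BITS))
    print(decode_bits(0x04, ERROR_BITS))
```

Status 0x51 with error 0x04 decodes to an ERR flag plus ABRT, meaning the SMART command was aborted rather than executed, which fits the "does not support Self-Test functions" warning.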


Code:
# camcontrol identify ada0
pass0: <SAMSUNG HD204UI 09570115> ATA-7 SATA 2.x device
pass0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 512bytes)
 
protocol              ATA/ATAPI-7 SATA 2.x
device model          SAMSUNG HD204UI
firmware revision     09570115
serial number         <omitted>
cylinders             16383
heads                 16
sectors/track         63
sector size           logical 512, physical 512, offset 0
LBA supported         268435455 sectors
LBA48 supported       3907029168 sectors
PIO supported         PIO4
DMA supported         WDMA2 UDMA6
 
Feature                      Support  Enabled   Value           Vendor
read ahead                     yes      yes
write cache                    yes      yes
flush cache                    yes      yes
overlap                        no
Tagged Command Queuing (TCQ)   no       no
Native Command Queuing (NCQ)   no
SMART                          yes      yes
microcode download             no       no
security                       yes      no
power management               yes      yes
advanced power management      yes      no      0/0x00
automatic acoustic management  no       no
media status notification      no       no
power-up in Standby            no       no
write-read-verify              no       no
unload                         no       no
free-fall                      no       no
data set management (TRIM)     no


The other hardware is listed in my signature.

Has anybody already run into the same problem?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I'm confused too.

Are you using the on-board SATA controller? Is it in AHCI? Is it an Intel controller or a no-name?

You've probably thought of doing these things, but I'll mention them anyway.

1. I'd try a Linux live CD and see if you get SMART attributes from it.
2. I'd also try a different computer. Preferably one that has an Intel controller as I've never had an Intel controller disappoint. But any controller that you know works with SMART will also work.

Personally, I'm not a fan of Zotac. I've known 2 people that have bought them because they are often small but feature packed. But both of my friends had weird weird issues with them. Ironically, both were SATA issues. Both said that some SATA ports wouldn't boot from the disk no matter what. Changed the SATA port the hard drive was attached to and the computer booted flawlessly.

Just some food for thought...
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
Thanks for your reply! I'm using the on-board controller, but I think 4 ports run natively with an Intel controller and 2 extra ports are using some other controller (JMicron?) added by Zotac. Everything is running in AHCI.

After switching the SATA ports, I can now read the SMART values of the Samsung. It seems that only ada0 can't access any SMART information. The drive now also correctly reports its ATA version as 8.

I just checked, it's an Intel SATA + JMicron RAID controller:
Code:
ahci1@pci0:0:31:2:      class=0x010601 card=0x174b174b chip=0x27c18086 rev=0x02 hdr=0x00
    vendor    = 'Intel Corporation'
    device    = '82801GB I/O Controller Hub SATA cc=AHCI'
    class      = mass storage
    subclass  = SATA
atapci0@pci0:1:0:0:    class=0x010400 card=0x2363197b chip=0x2363197b rev=0x03 hdr=0x00
    vendor    = 'JMicron Technology Corp.'
    device    = 'JMicron JMB362/JMB363 AHCI Controller (JMB36X)'
    class      = mass storage
    subclass  = RAID
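The hex fields in the listing above can be unpacked by hand. A minimal Python sketch, assuming the usual `pciconf -l` layout where `chip` holds the device ID in the upper 16 bits and the vendor ID in the lower 16 bits, and `class` is the 24-bit PCI class code; the function names are hypothetical.

```python
def split_chip(chip):
    """Split a pciconf chip value into (vendor_id, device_id)."""
    return chip & 0xFFFF, chip >> 16


def split_class(code):
    """Split a 24-bit PCI class code into (base class, subclass, prog-if)."""
    return (code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF


if __name__ == "__main__":
    # Values copied from the two controllers listed above.
    for chip, code in ((0x27C18086, 0x010601), (0x2363197B, 0x010400)):
        vendor, device = split_chip(chip)
        base, sub, progif = split_class(code)
        print(f"vendor=0x{vendor:04x} device=0x{device:04x} "
              f"class=0x{base:02x} subclass=0x{sub:02x} prog-if=0x{progif:02x}")
```

Vendor 0x8086 is Intel and 0x197b is JMicron. Subclass 0x06 is SATA and 0x04 is RAID, matching the `subclass` lines in the listing.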


When it's time for the next upgrade, I'll definitely go for a proper server board like Supermicro and a bigger case with better cooling (my HDDs run quite hot...).
So I obviously missed all SMART reports from that device over the last few years. Next time: better equipment and better testing! If the device had failed, only a scrub would have shown me the errors.
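A check like the following could catch a disk that silently stops reporting attributes. It is only a sketch: it assumes that `smartctl -a` output containing an attribute table includes a header row starting with `ID#`, and the helper names and the `HEALTHY` sample rows are made up for illustration, not taken from a real drive.

```python
import subprocess


def has_attribute_table(smartctl_output):
    """True if the text contains the header row of the SMART attribute table."""
    return any(line.lstrip().startswith("ID#")
               for line in smartctl_output.splitlines())


def check_device(device):
    """Run smartctl -a on a device and report whether attributes are listed.

    smartctl uses its exit status as a bit mask, so a non-zero status is not
    treated as a failure here; only the text output is inspected.
    """
    result = subprocess.run(["smartctl", "-a", device],
                            capture_output=True, text=True)
    return has_attribute_table(result.stdout)


# Excerpt of the ada0 output posted earlier in this thread: no attribute table.
BROKEN = """\
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART Error Log not supported
SMART Self-test Log not supported
"""

# Made-up sample of what a working port would typically print.
HEALTHY = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
194 Temperature_Celsius     0x0022   064   058   000    Old_age   Always       -       36
"""

if __name__ == "__main__":
    print(has_attribute_table(BROKEN))
    print(has_attribute_table(HEALTHY))
```

Calling `check_device("/dev/ada0")` for each disk after a hardware change would flag a port that returns PASSED but no attributes, as happened here.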
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Can you define "quite hot"? Anything above 40C is bad for hard drives per the Google white paper on hard drives in large populations.
 

warri

Guru
Joined
Jun 6, 2011
Messages
1,193
While scrubbing, the temperatures peaked at up to 55°C for short periods this summer, which triggered a Temperature Airflow SMART warning (which promptly arrived via email). Since then I have avoided scrubs and heavy usage on hot days. Nowadays they usually range between 40 and 50°C.

Consequently, I have opened the case and placed it in a better-ventilated spot, and I'm seeing around 40-42°C now.

I'm aware that the temperatures are less than optimal for my drive's life expectancy, but I can't really change anything about it at the moment besides not putting too much load on it on hot days. I can't fit in more fans, since the case is packed already. Additionally the CPU is passive-cooled and tends to heat up the interior when the lid is closed.
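To keep an eye on this between email warnings, the temperature can be pulled out of the attribute table. A rough Python sketch, assuming the usual ten-column `smartctl -A` layout with the temperature in attributes 190 and 194 and the current value as the first number of RAW_VALUE. The 40°C limit is the figure mentioned earlier in this thread, the function names are hypothetical, and the `SAMPLE` rows are made up.

```python
# Attribute IDs commonly used for temperature (assumption: 190 is
# Airflow_Temperature_Cel and 194 is Temperature_Celsius on this drive).
TEMP_ATTRIBUTE_IDS = {"190", "194"}


def drive_temperatures(attribute_table):
    """Map temperature attribute names to the leading integer of RAW_VALUE."""
    temps = {}
    for line in attribute_table.splitlines():
        fields = line.split()
        if (len(fields) >= 10 and fields[0] in TEMP_ATTRIBUTE_IDS
                and fields[9].isdigit()):
            temps[fields[1]] = int(fields[9])
    return temps


def too_hot(temps, limit_c=40):
    """Return the names of attributes reporting more than limit_c degrees."""
    return sorted(name for name, value in temps.items() if value > limit_c)


# Made-up rows for illustration only.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       14500
190 Airflow_Temperature_Cel 0x0022   058   045   000    Old_age   Always       -       42 (Min/Max 25/55)
194 Temperature_Celsius     0x0022   058   045   000    Old_age   Always       -       42 (Min/Max 25/55)
"""

if __name__ == "__main__":
    temps = drive_temperatures(SAMPLE)
    print(temps, too_hot(temps))
```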

[edit]
To visualize the problem:
IMG_20130721_125428.jpg
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Holy smokes! Sounds like one server I worked on months ago. I reached in to pull a hard drive out of its slot and the hard drive was so hot to the touch I let go (fortunately it was on the table already so it didn't fall to the floor). It had just come out of the system seconds before. I did a smartctl check and found out that all of his hard drives were at 55-57C and many of them were throwing errors all over the place. As soon as we upgraded his fans to higher RPM and MUCH higher CFM the random disk errors stopped. The drives have been error-free since (except 2 that have since failed).

One thing I've definitely noticed from helping many people one-on-one with Skype and Teamviewer is that Fractal Design cases seem to be really, really rough for hard drive temps. One system was averaging almost 60C for its hard drives.

I'm convinced that smallish systems that aren't using the "Green" or "low power" hard drives are guaranteed to kill hard drives in short order. This realization is quite disappointing because so many people refused to build a standard size server (mid-ATX or bigger) and there's actually a whole bunch of FreeNAS servers out there cooking their hard drives. When those things start failing it's going to suck for many users because I'm betting many people will see multiple drive failures in a very short period of time.

Even I had my own temperature issues this year and I thought I had myself set up for success. I thought I had put adequate thought into my cooling strategy for my server (and I'm using WD Green drives) but I still ended up with temps over 40C.

I've often wondered if many of us have been unintentionally cooking our hard drives for the last 10+ years and blaming the manufacturers for the high failure rates. My server has had 24 Green drives, 18 of them for more than 3 years, and 6 of them for more than 2 years. I have had 1 that died suddenly in February, and I had 4 that died within 14 days a few months ago when the drives were cooking in my case. Other than that, I've had excellent results from my hard drives and many people are shocked at my low failure rates in my server. I'm clearly far better than the typical failure rates for hard drives so I must be doing something right. Based on my multiple drive failures during my high temperatures and the disks temps over their lifespan in my servers, I'm convinced that the drive temps may be a bigger killer of hard drives than most people will ever admit to.

At this point I never install hard drives in a computer that doesn't have a fan that directly blows air on a hard drive.
 