SMART daemon crashes when trying to read my SSD

Status
Not open for further replies.

Thomymaster

Contributor
Joined
Apr 26, 2013
Messages
142
Hi guys

I have FreeNAS running with 4 HDDs and one SSD as a ZIL device. When I configure a SMART test and include my SSD (ada0), the SMART daemon stops working (see the attached screenshot); when I remove the SSD from the test, it works again. When I manually run "smartctl -a /dev/ada0", I get my SMART data.

What is the problem here?

Cheers

Thomy
 

Attachments

  • SMARTD Fehler.jpg

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,996
Bet that when you run the test manually you don't have a TEMPERATURE value. This might actually be something worth looking into. However, while I realize you're new here, you need to provide the required minimum data so we can give you proper advice.

In order for us to help you with your question, we will need the following information to the best of your ability. We do not like to make assumptions here, as we might give wrong advice, and the next thing you know, your data is gone forever. Please follow the forum rules so that we can offer proper assistance. Below is rule #3: provide as much information as possible and don't assume we know exactly how your system is configured.

Including the following information in your thread will increase the chance you will get an answer:


  1. FreeNAS version and platform (32 or 64 bit).
  2. General hardware information (CPU, RAM, Motherboard model, etc.).
  3. Specific hardware information (Network card chipset, Raid controller chipset, etc.).
  4. DMESG output or copy of specific error message.
  5. IFCONFIG output if you are asking about a NIC or networking problem.
  6. PCICONF -lv output if you are asking about MotherBoard and / or PCI card problems.
  7. Code snippets, logs, config files and quotes should be enclosed in the appropriate bbcode tags.
  8. To provide you with accurate answers we need this information and it will also help others to find your posts when they are searching for similar information.

[THIS IS A CANNED RESPONSE :p]
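
The command-line items in the list above (DMESG, IFCONFIG, PCICONF -lv) can be gathered in one pass. This is only a minimal sketch, assuming shell access on the FreeNAS box; the function name `collect_forum_info` and the output path are made up for illustration and are not part of FreeNAS.

```shell
#!/bin/sh
# Sketch: gather the details the forum rules ask for into one text file.
# collect_forum_info and the default path are illustrative names only.
collect_forum_info() {
    out=${1:-/tmp/forum-info.txt}
    : > "$out"                      # start with an empty file
    for cmd in "uname -a" "dmesg" "ifconfig" "pciconf -lv"; do
        printf '===== %s =====\n' "$cmd" >> "$out"
        # Skip tools that are missing (pciconf exists only on FreeBSD).
        if command -v "${cmd%% *}" >/dev/null 2>&1; then
            $cmd >> "$out" 2>&1 || echo "(command failed)" >> "$out"
        else
            echo "(not available on this system)" >> "$out"
        fi
    done
    echo "$out"                     # print the path of the report
}
```

Calling `collect_forum_info /tmp/mybox.txt` writes the report to that file, which can then be pasted into a post inside code tags.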
 

Thomymaster

Build FreeNAS-8.3.1-RELEASE-p2-x64 (r12686+b770da6_dirty)
Platform Intel(R) Xeon(R) CPU 3040 @ 1.86GHz
Memory 2031MB

HDDs: 4x Seagate Barracuda 7200.11 (ST31500341AS) configured as a RaidZ2 with "force 4K sector"
ZIL: none yet
L2ARC: 64GB SSD (Samsung PM800)
Mainboard: SuperMicro Z7DVL-E (all HDDs connected to internal S-ATA2 ports)

The error message is the one I posted in the earlier post.
 

joeschmuck

Thanks for the quick response. First, I'd tell you to remove the ZIL from your pool and then from your computer case. I have no idea what you are using this system for, but your RAM is really too low and you need to bump it up to at least 6GB. If you want great performance, go up to 16GB of RAM, as this lets high-speed RAM act as the cache (ARC); a ZIL just isn't really needed for most applications. The same goes for an L2ARC. I've done some testing and here is the link: http://forums.freenas.org/showthrea...Tek-NIC-Performance-Testing&p=58536#post58536 Don't let the name of the thread fool you; I have specific ZIL and L2ARC testing included.

I'm interested in the fact that the SMART tests fail for the SSD. I'm sure FreeNAS has a bug that causes the failure, but I'd like to know more about how you are setting up the SMART test when you get the errors. Please assume I know nothing, as that's the best way to give me all the information I might need to look into this.

Could you please post the results of the manual smartctl -a test for the SSD?
 

Thomymaster

Code:
[root@freenas] ~# smartctl -a /dev/ada4
smartctl 5.43 2012-06-30 r3573 [FreeBSD 8.3-RELEASE-p7 amd64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG SSD PM830 mSATA 64GB
Serial Number:    S0XMNYAC201633
LU WWN Device Id: 5 002538 043584d30
Firmware Version: CXM12D1Q
User Capacity:    64,023,257,088 bytes [64.0 GB]
Sector Size:      512 bytes logical/physical
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4c
Local Time is:    Wed May  8 09:59:58 2013 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  300) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (   5) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       167
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       26
175 Program_Fail_Count_Chip 0x0032   100   100   010    Old_age   Always       -       0
176 Erase_Fail_Count_Chip   0x0032   100   100   010    Old_age   Always       -       0
177 Wear_Leveling_Count     0x0013   099   099   010    Pre-fail  Always       -       3
178 Used_Rsvd_Blk_Cnt_Chip  0x0013   093   093   010    Pre-fail  Always       -       54
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   094   094   010    Pre-fail  Always       -       96
180 Unused_Rsvd_Blk_Cnt_Tot 0x0013   094   094   010    Pre-fail  Always       -       1536
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
195 Hardware_ECC_Recovered  0x001a   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   253   253   000    Old_age   Always       -       0
232 Available_Reservd_Space 0x0013   093   093   000    Pre-fail  Always       -       762
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       151165166
242 Total_LBAs_Read         0x0032   099   099   000    Old_age   Always       -       3859318

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
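
The attribute table above has no temperature entry. A quick way to check that on any drive is to scan the `smartctl -A` table for a temperature attribute. This is a sketch only: treating IDs 190 and 194 as "the temperature attributes" is an assumption (they are the common ones, but vendors differ), and `has_temperature_attr` is a made-up helper name. Whether the missing attribute is what stops smartd is not established here.

```shell
#!/bin/sh
# Exit status 0 if the smartctl attribute table read from stdin contains a
# temperature attribute, 1 otherwise. IDs 190 and 194 are assumed to be
# the temperature attributes; adjust for drives that use other IDs.
has_temperature_attr() {
    # Attribute rows have at least 10 columns and start with the numeric ID.
    awk 'NF >= 10 && ($1 == 190 || $1 == 194) { found = 1 }
         END { exit !found }'
}

# Example with a real drive (device name taken from this thread):
#   smartctl -A /dev/ada4 | has_temperature_attr || echo "no temperature attribute"
```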
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,402
"Bet that when you run the test manually you don't have a TEMPERATURE value."
As you may notice, joeschmuck was correct. I'm not sure how flexible the GUI is with this, but maybe create a separate test for the SSD without checking for temperature, if possible.

Never mind. Looking closer, it's exiting on (no Directive -d removable). What does your smartd.conf look like?
Code:
cat /usr/local/etc/smartd.conf
 

joeschmuck

The FreeNAS system should still let a SMART test be run, so it looks like there is an error there. However, again, the ZIL is likely not going to help you at all. If you leave it in and the ZIL dies, you will know it, and you will not lose any data if it fails with a ZFS v28 formatted pool.
 

Thomymaster

Hi

I have now changed the SSD to be the L2ARC instead of the ZIL, which makes more sense with iSCSI. Here is the output of my smartd.conf:

Code:
/dev/ada4 -n never -W 20,50,60 -m ***@***.de -s S/(01|02|03|04|05|06|07|08|09|10|11|12)/../(3)/(13)
/dev/ada0 -n never -W 20,50,60 -m ***@***.de -s S/(01|02|03|04|05|06|07|08|09|10|11|12)/../(3)/(13)
/dev/ada2 -n never -W 20,50,60 -m ***@***.de -s S/(01|02|03|04|05|06|07|08|09|10|11|12)/../(3)/(13)
/dev/ada1 -n never -W 20,50,60 -m ***@***.de -s S/(01|02|03|04|05|06|07|08|09|10|11|12)/../(3)/(13)
/dev/ada3 -n never -W 20,50,60 -m ***@***.de -s S/(01|02|03|04|05|06|07|08|09|10|11|12)/../(3)/(13)


/dev/ada4 is the SSD
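
To see at a glance which entries in a file like this carry a temperature directive (`-W`) or `-d removable`, the lines can be summarized with a small filter. This is a rough sketch: `summarize_smartd_conf` is a hypothetical helper, and it ignores smartd.conf features such as backslash line continuation and DEVICESCAN.

```shell
#!/bin/sh
# Print one summary line per device entry of a smartd.conf-style file:
# the device, its -W argument ("-" if absent), and whether "-d removable"
# is present. Reads the named files, or stdin when no file is given.
summarize_smartd_conf() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }      # skip comments and blanks
        {
            w = "-"; removable = "no"
            for (i = 2; i <= NF; i++) {
                if ($i == "-W" && i < NF) w = $(i + 1)
                if ($i == "-d" && i < NF && $(i + 1) == "removable")
                    removable = "yes"
            }
            printf "%s W=%s removable=%s\n", $1, w, removable
        }
    ' "$@"
}

# Example (path as used in this thread):
#   summarize_smartd_conf /usr/local/etc/smartd.conf
```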

Cheers

Thomy
 

titan_rw

Guru
Joined
Sep 1, 2012
Messages
586
If it's saying "unable to register device, no directive -d removable", I think that means it couldn't find the ssd at that time.

I'd be wondering exactly why, but you could try the following in "smart extra options" under "view disks - edit".

Code:
-a -W 0 -d removable


The -W 0 will get rid of the temperature warning. -a tells it to enable all the regular tests. The -d removable tells smartd to keep running even if it can't open that particular device. Again, I'd be wondering why it can't open it at that time.
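
One way to try these options outside the GUI is to put them in a throwaway one-line config and let smartd make a single pass over it. This is a sketch under assumptions: the file it writes is a standalone test config, not the one FreeNAS generates, and `write_ssd_conf` and the temp path are illustrative names. `-q onecheck` and `-c` are standard smartd options.

```shell
#!/bin/sh
# Write a one-device smartd config using the options suggested above and
# print the path of the file. This does not touch the FreeNAS-managed
# /usr/local/etc/smartd.conf.
write_ssd_conf() {
    dev=$1
    conf=${2:-/tmp/smartd-test.conf}
    printf '%s -a -W 0 -d removable\n' "$dev" > "$conf"
    echo "$conf"
}

# Run smartd once in the foreground against that file (needs root and a
# real device), so the device registration messages show on the terminal:
#   conf=$(write_ssd_conf /dev/ada4)
#   smartd -q onecheck -c "$conf"
```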

Also, why bother running self tests on an ssd? Do they actually do any good with flash media? Personally, I would disable them under "system - smart tests".
 