Question about SMARTCTL and RAID CARD

Status
Not open for further replies.

BlazeStar

Patron
Joined
Apr 6, 2014
Messages
383
Now that you have it working, you'll want to run some short and long tests on the drives and schedule them to run in the future.

Alright! Thanks for the reminder. I just set up a long test to run every weekend on every drive!

If I run long tests, do I still need to run short tests?

If so, how often?
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
Some users schedule long tests twice a month, run a scrub on the alternate weeks, and run short tests once a week.
 

mjws00

Guru
Joined
Jul 25, 2014
Messages
798
It's not the version of sas2flash.efi that matters; it's the firmware you flashed onto the card. If you used the link in that post, you may have p14; if you used the firmware that comes with sas2flash (p20), you will have p20. Neither of those is acceptable (even though both will show your drives).

I do have that card flashed as a 9211-8, and it performs perfectly, as any LSI SAS2008 should. Do take the time to make sure you have p16, as mentioned earlier. It only takes a few minutes and could save you trouble down the line.
 

BlazeStar

Okay thanks for pointing that out!

That's exactly what I did: I used the firmware from the link (the old one) with a recent installer (P20).

So where should I get the right firmware?
On the LSI website? What should I look for?

If I search for 9211-8, I get various results, including two that look like this:
Zip Compressed File
9211_8i_Package_P20_IR_IT_Firmware_BIOS_for_MSDOS_Windows
9211_4i_Package_P20_IR_IT_Firmware_BIOS_for_MSDOS_Windows

I don't even understand what the 8i and 4i stand for...
 

mjws00

The 9211-8i and 9211-4i are different LSI controller models (the number is the port count, and the "i" means the ports are internal). In this case we are crossflashing the card to a 9211-8i, so we must use that firmware.

Here is the link to the p16 version. It's buried a little. Upgrading the firmware is easy. You won't need to erase the card or re-enter the SAS address. Just flash the new firmware and you're done. If you were downgrading, you'd have to force it.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
But why would you use P16 instead of P20?
Because FreeNAS uses the P16 driver and it is expected (including by LSI themselves) that you use matched drivers and firmware.
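
One way to see what is actually running: on FreeBSD the mps driver logs both versions at boot. What follows is only a minimal sketch. It assumes the kernel message has the usual form `mps0: Firmware: 16.00.00.00, Driver: 16.00.00.00-fbsd`; the function name `mps_versions_match` is hypothetical, and comparing only the leading phase number is a simplification.

```shell
#!/bin/sh
# Reads mps driver boot messages on stdin and reports whether the firmware
# and driver phase numbers (the leading "16" in 16.00.00.00) agree.
# Assumes lines of the form "mps0: Firmware: X.Y.Z.W, Driver: X.Y.Z.W-fbsd".
mps_versions_match() {
    sed -n 's/^\(mps[0-9]*\): Firmware: \([0-9]*\)\.[^,]*, Driver: \([0-9]*\)\..*$/\1 \2 \3/p' |
    while read -r ctrl fw drv; do
        if [ "$fw" = "$drv" ]; then
            echo "$ctrl: firmware P$fw matches driver P$drv"
        else
            echo "$ctrl: MISMATCH, firmware P$fw vs driver P$drv"
        fi
    done
}

# Intended use on the NAS itself (not run here):
#   dmesg | mps_versions_match
```

If the two phase numbers differ, the card needs the firmware that matches the driver.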
 

BlazeStar

OK, so I should get the P16 firmware and the P16 SAS2FLASH.EFI installer.

Then run only these steps:

10. Set controller to 6GB/s mode:
sas2flsh -o -e 6
11. This will then ask for a firmware version, this is the same name as the updated .bin file:
2118it.bin
12. Flash the controller to new firmware with IT-mode:
sas2flsh -o -f 2118it.bin -b mptsas2.rom
13. Program SAS address in IT-mode where 500605b********* is the code on the green sticker on your RAID card without the "-":
sas2flsh -o -sasadd 500605b*********

right?
 

mjws00

Here are my notes from the last one:

Use a USB stick with the sas2flash.efi utility and the p16 firmware.
Select the EFI shell from the boot menu.
Change to >fs0:
>sas2flash -o -f 2118IT.BIN ** we don't need the boot ROM.
>sas2flash -list ** should display info. I didn't need to re-add the SAS address, as this was not a crossflash, just an update.

** marks my comments. You really only need 'sas2flash -o -f 2118IT.BIN' for a forward upgrade.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Starting with FreeNAS 9.3 the OS will give you a stoplight warning if the driver and firmware don't match. ;)
 

BlazeStar

I just noticed this:

For the four drives plugged into my newly flashed RAID card, running smartctl -a on them returns (among other things):

Device is: Not in smartctl database

I'm wondering why that is and how to correct it.

Also, I noticed that the scheduled long tests don't seem to run on those four drives, which is a problem!

Here's my SMART test schedule: http://cl.ly/image/0C1F0f0D0r0B
Normally they should run on the first Saturday and Sunday of each month.
(FYI: before that it was every week, so since the date I flashed the RAID card, the tests should have run twice.)

Output of one of the drives:

Code:
smartctl -a /dev/da0
=== START OF INFORMATION SECTION ===
Device Model:     WDC WD4003FZEX-00Z4SA0
Serial Number:    WD-WCC130832923
LU WWN Device Id: 5 0014ee 25e09ddae
Firmware Version: 01.01A01
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Nov 27 21:23:28 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)    Offline data collection activity
                    was suspended by an interrupting command from host.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (44040) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 476) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x7035)    SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   159   133   021    Pre-fail  Always       -       11041
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       51
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5295
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       51
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       50
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   112   111   000    Old_age   Always       -       40
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 

Ericloewe

I just noticed this:

Code:
Device is:        Not in smartctl database [for details use: -P showall]

Irrelevant. Doesn't affect your well-being in any way. It just means the smartmontools folks hadn't included that drive in the database that's included with FreeNAS' version of smartctl. Everything will still work correctly.
 

BlazeStar

But then why don't the scheduled long SMART tests run, or at least show up, when I run smartctl -a on those drives?
 

Ericloewe

You're right, I missed that.

Have you confirmed that they're properly set up? The drive says it's never run any SMART tests at all. What happens if you manually start one?
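
A quick way to answer that from the command line, as a minimal sketch: start a test by hand with `smartctl -t short /dev/da0`, then count the entries in the self-test log. The helper name `selftest_count` is made up for illustration, and it assumes log rows begin with "# N", as they do in the smartctl output posted in this thread.

```shell
# Counts self-test log entries in `smartctl -a` or `smartctl -l selftest`
# output read from stdin. Log rows start with "# 1", "# 2", ...; the
# "No self-tests have been logged" message yields 0.
selftest_count() {
    awk '/^# *[0-9]/ { n++ } END { print n + 0 }'
}

# Intended use on the NAS (not run here):
#   smartctl -t short /dev/da0          # start a short test by hand
#   smartctl -l selftest /dev/da0 | selftest_count
```

A count of 0 after a scheduled run was due points at the schedule rather than at the drive.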
 

BlazeStar

To my knowledge the drives and the scheduled tests are set up properly...

Starting one manually did work.

I ran

Code:
smartctl --test=long /dev/da0


I came back later and pulled the results:

Code:
smartctl -a /dev/da0
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p15 amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD4003FZEX-00Z4SA0
Serial Number:    WD-WCC130832923
LU WWN Device Id: 5 0014ee 25e09ddae
Firmware Version: 01.01A01
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Dec  2 11:33:13 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)    Offline data collection activity
                    was suspended by an interrupting command from host.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (44040) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 476) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x7035)    SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   159   133   021    Pre-fail  Always       -       11041
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       51
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       5405
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       51
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       50
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   112   111   000    Old_age   Always       -       40
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      5323         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Manually starting a test does work.
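
To tell whether a scheduled run has happened since, one option is to compare the newest log entry's LifeTime(hours) with Power_On_Hours. This is a rough sketch only: it assumes the raw Power_On_Hours value is a plain number (as it is on these WD drives) and that entry 1 is the most recent one. `hours_since_last_test` is a hypothetical helper name.

```shell
# Prints how many power-on hours have passed since the most recent self-test,
# given `smartctl -a` output on stdin. Prints "never" if the log is empty.
hours_since_last_test() {
    awk '
        # Attribute row: the raw value is the last column.
        $2 == "Power_On_Hours" { poh = $NF }
        # Self-test log row "# 1 ...": LifeTime(hours) is the second-to-last column.
        $1 == "#" && $2 == 1 && !seen { last = $(NF - 1); seen = 1 }
        END {
            if (seen) print poh - last
            else      print "never"
        }
    '
}

# Intended use on the NAS (not run here):
#   smartctl -a /dev/da0 | hours_since_last_test
```

With the output above (Power_On_Hours 5405, test logged at 5323) this prints 82, so the manual test was logged about 82 power-on hours earlier and nothing has run since.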

I still have no idea why my scheduled tests won't run :S

Could it have something to do with the "Not in smartctl database" message?
 

danb35

Hall of Famer
Joined
Aug 16, 2011
Messages
15,504
No, it has nothing to do with the "not in smartctl database" message.

In the test setup, have you selected your drives? Up at the top, under Disks:, they should all be highlighted.

Try setting the long test to run weekly. Set "each selected hour" to whatever hour you want it to start, "every N day of month" to 1, check all the months, and then check the one day you want the test to run. See if that works. If it does, then you'll have a known-working baseline to work from.
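
For reference, smartd itself expresses test schedules as a regular expression matched against a string of the form T/MM/DD/d/HH (test type, month, day of month, weekday with 1 = Monday, hour). The sketch below assumes the GUI settings end up as such a pattern; the pattern shown and the function name `smartd_would_run` are illustrative and not taken from this system's configuration.

```shell
# Returns success if a smartd-style schedule pattern matches the given
# "T/MM/DD/d/HH" string (T = L for long or S for short, d = 1..7, Monday = 1).
smartd_would_run() {
    pattern=$1
    stamp=$2
    printf '%s\n' "$stamp" | grep -Eqx "$pattern"
}

# Hypothetical weekly long test: any month, any day of month, Saturday, 03:00.
weekly_long='L/../../6/03'

# Dec 6, 2014 was a Saturday; Dec 2, 2014 was a Tuesday.
smartd_would_run "$weekly_long" 'L/12/06/6/03' && echo "runs on Sat Dec 6 at 03:00"
smartd_would_run "$weekly_long" 'L/12/02/2/03' || echo "skipped on Tue Dec 2"
```

A schedule limited to the first Saturday and Sunday of the month would, under the same assumption, look like `L/../0[1-7]/[67]/..`, which is a much narrower window and therefore easier to miss while testing.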
 

BlazeStar

So I deleted all my tasks and set up new ones, and now it works.

Just something I noticed though:

Code:
smartctl --scan
/dev/da0 -d scsi # /dev/da0, SCSI device
/dev/da1 -d scsi # /dev/da1, SCSI device
/dev/da2 -d scsi # /dev/da2, SCSI device
/dev/da3 -d scsi # /dev/da3, SCSI device


All four of those drives are ATA drives... but it seems the flashed RAID card presents them as SCSI!

Can that cause any problems?
 