Hi,
I need your help as I have a broken disk and want to understand what happened here.
First of all: I run regular extended SMART tests on my disks and then send an email report. The scripts I use are hosted here: https://github.com/Spearfoot/FreeNAS-scripts/blob/master/smart_report.sh
Last night's output (just the first 3 disks) is shown below. I have several questions, and maybe someone can help, as I am really confused.
1) The disk with the error is ada1. Looking at the overall status report, however, everything seems fine. The detailed section for ada1 also looks fine at first glance.
It says no errors logged, but the CRC_Error_Count is more than 6K (6076). So I guess this means the SSD is almost broken, right?
If so, I guess I should replace it immediately. It is a mirrored volume, so I guess I should simply shut down, swap the disk, and replace it in the web GUI. Sorry for the stupid question; I have never had this case before. The good thing is that everything is backed up, so even if something goes wrong it would not be a disaster.
2) Coming to this script: I am not sure, but I think a lot of people here use it. My point is that there should be a clear message in the header when one disk is broken. As I do not fully understand the script, can someone help me understand it better?
Maybe everything is right and I just misunderstood the parameters.
3) What I also see for disk 3 (ada2) is: "Status aborted by host". What does that mean?
Thanks a lot for help and support
S
Code:
########## SMART status report summary for all drives on server NAS ##########

+------+------------------+----+-----+-----+-----+-------+-------+--------+------+----------+------+-------+----+
|Device|Serial            |Temp|Power|Start|Spin |ReAlloc|Current|Offline |Seek  |Total     |High  |Command|Last|
|      |Number            |    |On   |Stop |Retry|Sectors|Pending|Uncorrec|Errors|Seeks     |Fly   |Timeout|Test|
|      |                  |    |Hours|Count|Count|       |Sectors|Sectors |      |          |Writes|Count  |Age |
+------+------------------+----+-----+-----+-----+-------+-------+--------+------+----------+------+-------+----+
|ada0  |S252NXAG820276K   | 34 |24303|     |     |      0|       |        |   N/A|       N/A|   N/A|    N/A|   1|
|ada1  |S252NCAGA00376M   | 34 |23966|     |     |      0|       |        |   N/A|       N/A|   N/A|    N/A|   1|
|ada2  |WD-WCC4E4AH9899   | 29 |13896| 4797|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A|   1|
|ada3  |WD-WCC4E1XUF2Z4   | 29 |13879| 4225|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A|   1|
|ada4  |WD-WCC4E5NF9F5F   | 28 |13892| 4860|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A|   1|
|ada5  |WD-WCC4E4RXR2E0   | 28 |13883| 4178|    0|      0|      0|       0|   N/A|       N/A|   N/A|    N/A|   1|
+------+------------------+----+-----+-----+-----+-------+-------+--------+------+----------+------+-------+----+

########## SMART status report for ada0 drive (Samsung based SSDs: S252NXAG820276K) ##########

SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       24303
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       200
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       14
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   066   045   000    Old_age   Always       -       34
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   099   099   000    Old_age   Always       -       1
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       156
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       13190459927

No Errors Logged

Test_Description    Status                    Remaining  LifeTime(hours)  LBA_of_first_error
Extended offline    Completed without error   00%        24290            -

########## SMART status report for ada1 drive (Samsung based SSDs: S252NCAGA00376M) ##########

SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       23966
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       197
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       13
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   066   046   000    Old_age   Always       -       34
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   093   093   000    Old_age   Always       -       6076
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       152
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       12669998984

No Errors Logged

Test_Description    Status                    Remaining  LifeTime(hours)  LBA_of_first_error
Extended offline    Interrupted (host reset)  00%        23952            -

########## SMART status report for ada2 drive (Western Digital Red: WD-WCC4E4AH9899) ##########

SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   183   179   021    Pre-fail  Always       -       7841
  4 Start_Stop_Count        0x0032   096   096   000    Old_age   Always       -       4797
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   081   081   000    Old_age   Always       -       13896
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       123
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       50
193 Load_Cycle_Count        0x0032   197   197   000    Old_age   Always       -       11205
194 Temperature_Celsius     0x0022   123   099   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

No Errors Logged

Test_Description    Status                    Remaining  LifeTime(hours)  LBA_of_first_error
Extended offline    Aborted by host           90%        13883            -