Update 10.07.2019
For everyone who uses the search and has the same problem, here's a TL;DR.
For detailed information about how to do it, check the user guide.
1: I had problems reading my files and long lags when accessing my drive.
2: Checked FreeNAS: it told me about unreadable sectors on one of my drives (ada4).
- You should set up regular SMART tests with email alerts to catch this early on.
3: Ordered a new HDD (same size, same manufacturer).
4: Replaced the broken HDD.
5: Resilvered (took 13 h for my 3 TB WD RED, 70 % full).
6: Running smooth and fine since then.
I am using a ZFS pool with RAID-Z2.
Hello,
I hope someone can help me with some advice. I set up my FreeNAS with a ZFS RAID-Z2 pool 4 years ago and it was running smoothly until yesterday. I am using 8 WD RED 3 TB HDDs. Today I noticed heavy read delays on my server while accessing some files. After logging in I first checked the console, and there's an error:
Code:
Jul 3 21:30:36 freenas smartd[2778]: Device: /dev/ada4, 31 Currently unreadable (pending) sectors
Jul 3 21:30:36 freenas smartd[2778]: Warning via /usr/local/www/freenasUI/tools/smart_alert.py to mymail@hoster.de produced unexpected output (167 bytes) to STDOUT/STDERR:
Jul 3 21:30:36 freenas smartd[2778]: usage: smart_alert.py [-h] [-d DEV]
Jul 3 21:30:36 freenas smartd[2778]: smart_alert.py: error: unrecognized arguments: -s SMART error (CurrentPendingSector) detected on host: freenas markus@angelmahr.de
Jul 3 21:30:36 freenas smartd[2778]: Warning via /usr/local/www/freenasUI/tools/smart_alert.py to mymail@hoster.de: failed (32-bit/8-bit exit status: 512/2)
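A side note on the last log line: the "usage: ... error: unrecognized arguments" text looks like the output of Python's argparse, which exits with code 2 on a usage error. The pair 512/2 would then be the raw wait status (2 shifted left by 8 bits) and the plain exit code. The sketch below reproduces that with a hypothetical stand-in parser; it is not the real smart_alert.py, and the arguments passed to it are made up to resemble the log.

```python
import argparse

# Hypothetical stand-in for smart_alert.py: it only accepts -h and -d,
# matching the usage line shown in the log. Not the real script.
parser = argparse.ArgumentParser(prog="smart_alert.py")
parser.add_argument("-d", dest="dev")

def exit_code_for(argv):
    """Return the exit code argparse would use for this argument list."""
    try:
        parser.parse_args(argv)
    except SystemExit as exc:
        return exc.code
    return 0

# Arguments in the style the log shows (-s <subject> <address>); values are illustrative.
code = exit_code_for(["-s", "SMART error", "me@example.com"])

# On Unix the exit code sits in the high byte of the 16-bit wait status.
wait_status = code << 8

print(f"exit code {code}, wait status {wait_status}")
```

So the alert mail most likely failed because the script was called with options it does not accept, not because of the disk error itself.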
I also ran smartctl on every drive.
ada1 is a small SSD for FreeNAS and is also used as cache.
ada2
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   181   176   021    Pre-fail  Always       -       5950
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       867
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11338
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       852
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       88
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       832
194 Temperature_Celsius     0x0022   117   112   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
ada3
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   181   177   021    Pre-fail  Always       -       5908
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       789
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11206
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       784
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       52
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       745
194 Temperature_Celsius     0x0022   116   110   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
ada4
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       5868
  3 Spin_Up_Time            0x0027   179   175   021    Pre-fail  Always       -       6041
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       816
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11231
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       811
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       75
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       772
194 Temperature_Celsius     0x0022   116   113   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       31
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
ada5
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   174   168   021    Pre-fail  Always       -       6291
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       815
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11231
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       810
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       74
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       773
194 Temperature_Celsius     0x0022   116   112   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
ada6
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   184   179   021    Pre-fail  Always       -       5783
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       814
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11232
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       811
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       73
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       772
194 Temperature_Celsius     0x0022   116   112   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
ada7
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   176   170   021    Pre-fail  Always       -       6200
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       810
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11232
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       810
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       69
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       773
194 Temperature_Celsius     0x0022   116   111   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
ada8
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   176   170   021    Pre-fail  Always       -       6200
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       810
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11232
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       810
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       69
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       773
194 Temperature_Celsius     0x0022   116   111   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
ada9
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   182   177   021    Pre-fail  Always       -       5866
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       784
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   085   085   000    Old_age   Always       -       11206
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       784
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       47
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       745
194 Temperature_Celsius     0x0022   115   110   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
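Comparing eight of these tables by eye is tedious, so here is a minimal sketch of how the check could be scripted. It assumes the attribute rows look like the smartctl output above; the function names and the choice of watched attributes are mine, and the sample rows are copied from the ada4 output.

```python
import re

# One attribute row: ID, name, flag, value, worst, threshold, type,
# updated, when_failed, raw value (only the leading digits of the raw value).
ROW = re.compile(
    r"(\d+)\s+([\w-]+)\s+(0x[0-9a-f]{4})\s+(\d{3})\s+(\d{3})\s+(\d{3})\s+"
    r"(Pre-fail|Old_age)\s+(Always|Offline)\s+(\S+)\s+(\d+)"
)

# Attributes I would expect to stay at raw value 0 on a healthy disk
# (my own selection, not an official list).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

def parse_attributes(text):
    """Return {attribute_name: raw_value} for every attribute row found."""
    return {m.group(2): int(m.group(10)) for m in ROW.finditer(text)}

def suspicious(text):
    """Return the watched attributes whose raw value is above zero."""
    attrs = parse_attributes(text)
    return {name: attrs[name] for name in WATCHED if attrs.get(name, 0) > 0}

# Four rows taken from the ada4 output above.
ADA4_SAMPLE = """
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       5868
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       31
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
"""

print(suspicious(ADA4_SAMPLE))
```

For the ada4 rows this reports only Current_Pending_Sector with 31, which matches the smartd warning. Also worth knowing when reading the tables: "Pre-fail" in the TYPE column is the category of the attribute, not a verdict on the disk, and WHEN_FAILED is "-" for every attribute on every drive here.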
My ZFS status
My disk/volume setup. All healthy here.
I am a little confused: why are read/write all 0 here?
Can someone please help me with some advice? Should I replace all pre-fail HDDs or only ada4? If I replace ada4 while I have serious read problems accessing my files, will ZFS resilver cleanly or will some data be lost?
Thanks and regards,
Markus