Auto mount on reboot?

tfran1990

Patron
Joined
Oct 18, 2017
Messages
294
Version: FreeNAS 11.1-U7

I have a stripe pool with 2 disks. (The data on this pool is not important; it is the FTP target for my IP cameras.)
For the past week I have been getting:
Code:
  pool: STRIPE
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0 days 00:12:49 with 2 errors on Mon Dec  2 17:16:59 2019
config:

        NAME                                          STATE     READ WRITE CKSUM
        STRIPE                                        ONLINE       4     0     0
          gptid/ffc0194e-9575-11e9-89a2-e0071bfffaff  ONLINE       0     0     0
          gptid/02986c3c-9576-11e9-89a2-e0071bfffaff  ONLINE       4     0     4


Should I destroy the pool, run a full zero pass on the disk to remap the bad sectors, and then rebuild the pool?
Or should I unmount the pool and then zero-wipe the disk to try to keep the pool intact?

If I take option 2 and use the unmount command, will the pool be able to remount on reboot after the disk has been wiped?

Code:
root@freenas:~ # zfs list

STRIPE                                                 157G  1.60T   157G  /mnt/STRIPE


Code:
root@freenas:~ # zfs list -r /mnt/STRIPE
NAME     USED  AVAIL  REFER  MOUNTPOINT
STRIPE   157G  1.60T   157G  /mnt/STRIPE


Is there another way to fix bad sectors on a disk?
Any feedback would be helpful.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,079

tfran1990

The SMART output is:
Code:
root@freenas:~ # smartctl -a /dev/da4
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.11
Device Model:     ST31000333AS
Serial Number:    9TE1CY5D
LU WWN Device Id: 5 000c50 010dd614b
Firmware Version: CC3H
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sat Dec 14 19:35:49 2019 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (  617) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 207) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x103f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   108   099   006    Pre-fail  Always       -       129797493
  3 Spin_Up_Time            0x0003   099   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       122
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x000f   071   060   030    Pre-fail  Always       -       13894759
  9 Power_On_Hours          0x0032   096   096   000    Old_age   Always       -       4277
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   037   020    Old_age   Always       -       63
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   083   083   000    Old_age   Always       -       17
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       5
189 High_Fly_Writes         0x003a   070   070   000    Old_age   Always       -       30
190 Airflow_Temperature_Cel 0x0022   069   050   045    Old_age   Always       -       31 (Min/Max 28/35)
194 Temperature_Celsius     0x0022   031   050   000    Old_age   Always       -       31 (0 18 0 0 0)
195 Hardware_ECC_Recovered  0x001a   035   021   000    Old_age   Always       -       129797493
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       3
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       4276 (173 81 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       1308253546
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1977398892

SMART Error Log Version: 1
ATA Error Count: 13 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 13 occurred at disk power-on lifetime: 3986 hours (166 days + 2 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: WP at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 10 ff ff ff 4f 00  28d+11:58:21.643  WRITE FPDMA QUEUED
  61 00 10 ff ff ff 4f 00  28d+11:58:21.642  WRITE FPDMA QUEUED
  60 00 10 90 02 40 40 00  28d+11:58:21.616  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:21.616  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:21.616  READ FPDMA QUEUED

Error 12 occurred at disk power-on lifetime: 3986 hours (166 days + 2 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00  28d+11:58:18.647  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:18.641  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:18.637  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:18.617  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00  28d+11:58:18.617  READ FPDMA QUEUED

Error 11 occurred at disk power-on lifetime: 3986 hours (166 days + 2 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00  28d+11:58:15.638  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:15.616  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:15.614  READ FPDMA QUEUED
  ef 02 00 00 00 00 00 00  28d+11:58:15.537  SET FEATURES [Enable write cache]
  ef aa 00 00 00 00 00 00  28d+11:58:15.517  SET FEATURES [Enable read look-ahead]

Error 10 occurred at disk power-on lifetime: 3986 hours (166 days + 2 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00  28d+11:58:12.563  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:12.550  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:12.548  READ FPDMA QUEUED
  61 00 10 90 02 40 40 00  28d+11:58:12.540  WRITE FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:12.518  READ FPDMA QUEUED

Error 9 occurred at disk power-on lifetime: 3986 hours (166 days + 2 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00  28d+11:58:09.621  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00  28d+11:58:09.620  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00  28d+11:58:09.620  READ FPDMA QUEUED
  60 00 10 90 02 40 40 00  28d+11:58:09.619  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  28d+11:58:09.619  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      4036         1206178450
# 2  Extended offline    Completed: read failure       50%      3919         1206178450
# 3  Conveyance offline  Completed without error       00%      3916         -
# 4  Extended offline    Completed: read failure       50%      3911         1206178450
# 5  Short offline       Completed without error       00%      3910         -
# 6  Extended offline    Completed without error       00%      3187         -
# 7  Extended offline    Completed without error       00%      3172         -
# 8  Extended offline    Completed: read failure       40%      3138         1308702237
# 9  Extended offline    Completed without error       00%      1906         -
#10  Extended offline    Completed without error       00%       352         -
#11  Short offline       Completed without error       00%       349         -
#12  Short offline       Completed without error       00%       242         -
#13  Extended offline    Completed without error       00%        78         -
1 of 4 failed self-tests are outdated by newer successful extended offline self-test # 6

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
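
The self-test log above is easier to read when the failed runs are grouped by LBA. The following is only a rough Python sketch, not a diagnostic tool: the `LOG` rows are copied from the output above, the 512-byte sector size comes from the "Sector Size" line, and the function name `failing_lbas` is made up for illustration.

```python
import re
from collections import Counter

# Rows copied from the SMART self-test log above (failed and one passing run).
LOG = """\
# 1  Extended offline    Completed: read failure       90%      4036         1206178450
# 2  Extended offline    Completed: read failure       50%      3919         1206178450
# 3  Conveyance offline  Completed without error       00%      3916         -
# 4  Extended offline    Completed: read failure       50%      3911         1206178450
# 8  Extended offline    Completed: read failure       40%      3138         1308702237
"""

SECTOR_BYTES = 512  # from "Sector Size: 512 bytes logical/physical"

# Matches only rows whose status is "read failure" and captures the last column.
FAILED_ROW = re.compile(r"^#\s*\d+\s+.*read failure\s+\d+%\s+\d+\s+(\d+)\s*$")


def failing_lbas(text):
    """Count how many failed self-tests stopped at each LBA."""
    counts = Counter()
    for line in text.splitlines():
        m = FAILED_ROW.match(line)
        if m:
            counts[int(m.group(1))] += 1
    return counts


if __name__ == "__main__":
    for lba, hits in failing_lbas(LOG).most_common():
        # Byte offset is simply LBA times the logical sector size.
        print(f"LBA {lba}: {hits} failed test(s), byte offset {lba * SECTOR_BYTES}")
```

Grouping this way makes it obvious whether the same sector keeps stopping the extended test or whether new locations are appearing over time.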

 

tfran1990

So the raw values for IDs 5, 197, and 198 are each less than 5, and there are no errors on ID 199. This is a failing HDD.
This disk has not totally shit the bed yet; it's not a huge deal to remap 3 sectors, right?
Correct me if I'm wrong.
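
For anyone who wants to pull those attributes out of `smartctl -a` output without reading the whole table, here is a minimal Python sketch. It is illustrative only: the `SAMPLE` rows are copied from the output posted earlier, the set of IDs in `WATCHED` is my own assumption about which attributes are worth watching (I added 187 to the ones mentioned above), and the function name `nonzero_watched` is hypothetical.

```python
import re

# Attribute rows copied from the smartctl output earlier in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
187 Reported_Uncorrect      0x0032   083   083   000    Old_age   Always       -       17
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       3
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

# Assumption: these are the IDs to keep an eye on; adjust to taste.
WATCHED = {5, 187, 197, 198, 199}

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def nonzero_watched(text):
    """Return {id: (name, raw_value)} for watched attributes with raw > 0."""
    hits = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        attr_id, name, raw = int(m.group(1)), m.group(2), int(m.group(3))
        if attr_id in WATCHED and raw > 0:
            hits[attr_id] = (name, raw)
    return hits


if __name__ == "__main__":
    for attr_id, (name, raw) in sorted(nonzero_watched(SAMPLE).items()):
        print(f"{attr_id:>3} {name}: raw value {raw}")
```

Saving the result after each check and comparing it with the previous one would show whether the counts are growing, which matters more than any single reading.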
 

tfran1990

It's been a couple of weeks and the sector count is still the same. Is it possible that the disk will remap the sectors and be fine as a video dump for my FTP?
Any thoughts on the matter?
 

tfran1990

After a reboot I am getting:
Code:
  pool: STRIPE
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0 days 00:12:49 with 2 errors on Mon Dec  2 17:16:59 2019
config:

        NAME                                          STATE     READ WRITE CKSUM
        STRIPE                                        ONLINE       0     0     0
          gptid/ffc0194e-9575-11e9-89a2-e0071bfffaff  ONLINE       0     0     0
          gptid/02986c3c-9576-11e9-89a2-e0071bfffaff  ONLINE       0     0     0


Could it be that the disk has healed?
Also, could I do a replace on the problem disk within the UI?
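
One thing worth separating out in that output: the `scan:` line is identical to the one in the first post (same scrub on Dec 2, still "2 errors"), so no new scrub has run, while the READ/WRITE/CKSUM columns are now all zero. As far as I understand, those per-device counters are not kept across a reboot, so zeros there may not mean the disk recovered. Below is a small Python sketch that pulls the two pieces of information apart; the `STATUS` text is copied from the output above, and the function names are made up for illustration.

```python
import re

# Copied from the zpool status output posted after the reboot.
STATUS = """\
  scan: scrub repaired 0 in 0 days 00:12:49 with 2 errors on Mon Dec  2 17:16:59 2019
config:

        NAME                                          STATE     READ WRITE CKSUM
        STRIPE                                        ONLINE       0     0     0
          gptid/ffc0194e-9575-11e9-89a2-e0071bfffaff  ONLINE       0     0     0
          gptid/02986c3c-9576-11e9-89a2-e0071bfffaff  ONLINE       0     0     0
"""


def scrub_errors(text):
    """Return (error_count, finish_time) from the last scrub line, or None."""
    m = re.search(r"scan: scrub repaired \S+ in .* with (\d+) errors on (.+)", text)
    return (int(m.group(1)), m.group(2).strip()) if m else None


def device_counters(text):
    """Return {name: (read, write, cksum)} for each row of the config table."""
    counters = {}
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True  # rows after this header belong to the table
            continue
        if in_table and len(fields) == 5 and all(f.isdigit() for f in fields[2:]):
            counters[fields[0]] = tuple(int(f) for f in fields[2:])
    return counters


if __name__ == "__main__":
    print("last scrub:", scrub_errors(STATUS))
    for name, (rd, wr, ck) in device_counters(STATUS).items():
        print(f"{name}: read={rd} write={wr} cksum={ck}")
```

If the scrub line still reports errors from an old date, running a fresh scrub and checking again would give a better picture than the zeroed counters alone.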
 