ZPOOL Degraded - S.M.A.R.T is ok

Pliqui · Jan 5, 2019

This morning I received an alert from cron stating that my zpool is in a degraded state.

The only recent change was the upgrade to 11.1U6.

System:

CPU: Intel Xeon E3-1230 V6 Kaby Lake 3.5 GHz - Changed from E3-1225 V6
Mobo: SUPERMICRO MBD-X11SSL-CF Micro ATX - Changed from SUPERMICRO MBD-X11SSH-F-O Micro ATX
RAM: 4 x Crucial CT16G4WFD824A 16GB DDR4 ECC Unbuffered CL17 (64GB, maxed out)
Case: Fractal Design Node 804
PSU: Seasonic FOCUS Plus Series SSR-650FX 650W
Boot: 1 x SSD 128GB
Pool: striped mirrors, 6 disks (4 x 8TB and 2 x 10TB)
SAS Cables: 2 x Supermicro MiniSAS HD to 4x SATA

Code:

[root@freenas ~]# lspci
00:00.0 Host bridge: Intel Corporation Device 5918 (rev 05)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x16) (rev 05)
00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor PCIe Controller (x8) (rev 05)
00:13.0 Non-VGA unclassified device: Intel Corporation Sunrise Point-H Integrated Sensor Hub (rev 31)
00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-H Thermal subsystem (rev 31)
00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
00:17.0 SATA controller: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] (rev 31)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #5 (rev f1)
00:1c.5 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #6 (rev f1)
00:1c.6 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #7 (rev f1)
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #9 (rev f1)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
02:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
03:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
04:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
05:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 03)
06:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)

This is one of the emails I got:

Code:

Device: /dev/da0 [SAT], failed to read SMART Attribute Data
Device: /dev/da0 [SAT], not capable of SMART self-check
Device: /dev/da0 [SAT], Read SMART Self-Test Log Failed
The volume HOME state is DEGRADED: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state.
Device: /dev/da0 [SAT], Read SMART Error Log Failed

ZPOOL email at 3:00 am

Code:

Checking status of zfs pools:
NAME           SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
HOME          23.6T  12.1T  11.5T         -     8%    51%  1.00x  DEGRADED  /mnt
freenas-boot   118G  3.30G   115G         -      -     2%  1.00x  ONLINE  -

  pool: HOME
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
    repaired.
  scan: scrub repaired 0 in 0 days 11:06:22 with 0 errors on Tue Jan  1 13:06:22 2019
config:

    NAME                                            STATE     READ WRITE CKSUM
    HOME                                            DEGRADED     0     0     0
      mirror-0                                      ONLINE       0     0     0
        gptid/df322087-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
        gptid/dfb5acdc-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
      mirror-1                                      ONLINE       0     0     0
        gptid/e03c826e-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
        gptid/e0c5c181-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
      mirror-2                                      DEGRADED     0     0     0
        gptid/d80c60dc-d264-11e8-afaf-ac1f6b83f450  FAULTED      6     4     0  too many errors
        gptid/d88326db-d264-11e8-afaf-ac1f6b83f450  ONLINE       0     0     0

errors: No known data errors

And this is the status now:

Code:

[root@freenas ~]# zpool status -v
  pool: HOME
state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 0 days 11:06:22 with 0 errors on Tue Jan  1 13:06:22 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        HOME                                            DEGRADED     0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/df322087-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
            gptid/dfb5acdc-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/e03c826e-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
            gptid/e0c5c181-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
          mirror-2                                      DEGRADED     0     0     0
            gptid/d80c60dc-d264-11e8-afaf-ac1f6b83f450  DEGRADED     0     0   413  too many errors
            gptid/d88326db-d264-11e8-afaf-ac1f6b83f450  ONLINE       0     0     0

errors: No known data errors
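
A rough sketch of how the config table in the `zpool status` output above could be scanned for members that are not ONLINE or that carry nonzero error counters. The function name `unhealthy_vdevs` and the regular expression are assumptions made for illustration, not part of FreeNAS or ZFS. Rows whose counters are abbreviated (for example 1.2K) are simply skipped by this pattern.

```python
import re

# One config row: NAME STATE READ WRITE CKSUM [optional note]
ROW = re.compile(r"^\s*(\S+)\s+([A-Z]+)\s+(\d+)\s+(\d+)\s+(\d+)")

def unhealthy_vdevs(status_text):
    """Return (name, state, read, write, cksum) for every config row that is
    not ONLINE or has at least one nonzero error counter."""
    found = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # headers, 'state:' lines, prose, blank lines
        name, state = m.group(1), m.group(2)
        read, write, cksum = (int(m.group(i)) for i in (3, 4, 5))
        if state != "ONLINE" or read or write or cksum:
            found.append((name, state, read, write, cksum))
    return found
```

Fed the status shown above, this would report the pool HOME, mirror-2, and the gptid/d80c60dc-d264-11e8-afaf-ac1f6b83f450 member with its 413 checksum errors.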

However, I checked the SMART data and found nothing out of the ordinary for the problem disk, da0 (the usual culprits, attribute IDs 5, 197, and 198, are all zero).

Code:

[root@freenas ~]# smartctl -x /dev/da0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     HGST HDN721010ALE604
Serial Number:    1DGS64JZ
LU WWN Device Id: 5 000cca 26cca8b9b
Firmware Version: 83XN
User Capacity:    10,000,831,348,736 bytes [10.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jan  5 13:48:56 2019 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 249) Self-test routine in progress...
                                        90% of test remaining.
Total time to complete Offline
data collection:                (   93) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (1304) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   016    -    0
  2 Throughput_Performance  --S---   134   134   054    -    96
  3 Spin_Up_Time            POS---   100   100   024    -    0
  4 Start_Stop_Count        -O--C-   100   100   000    -    2
  5 Reallocated_Sector_Ct   PO--CK   100   100   005    -    0
  7 Seek_Error_Rate         -O-R--   100   100   067    -    0
  8 Seek_Time_Performance   --S---   128   128   020    -    18
  9 Power_On_Hours          -O--C-   100   100   000    -    1915
 10 Spin_Retry_Count        -O--C-   100   100   060    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    2
 22 Unknown_Attribute       PO---K   100   100   025    -    100
192 Power-Off_Retract_Count -O--CK   100   100   000    -    79
193 Load_Cycle_Count        -O--C-   100   100   000    -    79
194 Temperature_Celsius     -O----   162   162   000    -    37 (Min/Max 25/40)
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
197 Current_Pending_Sector  -O---K   100   100   000    -    0
198 Offline_Uncorrectable   ---R--   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O-R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      1  Comprehensive SMART error log
0x03       GPL     R/O      1  Ext. Comprehensive SMART error log
0x04       GPL     R/O    256  Device Statistics log
0x04       SL      R/O    255  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x08       GPL     R/O      2  Power Conditions log
0x09           SL  R/W      1  Selective self-test log
0x0c       GPL     R/O   5501  Pending Defects log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x12       GPL     R/O      1  SATA NCQ Non-Data log
0x13       GPL     R/O      1  SATA NCQ Send and Receive log
0x15       GPL     R/W      1  Rebuild Assist log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x24       GPL     R/O    256  Current Device Internal Status Data log
0x25       GPL     R/O    256  Saved Device Internal Status Data log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Aborted by host               90%      1914         -
# 2  Short offline       Completed without error       00%      1855         -
# 3  Short offline       Completed without error       00%      1807         -
# 4  Short offline       Completed without error       00%      1783         -
# 5  Short offline       Completed without error       00%      1735         -
# 6  Short offline       Completed without error       00%      1687         -
# 7  Short offline       Completed without error       00%      1638         -
# 8  Short offline       Completed without error       00%      1590         -
# 9  Extended offline    Completed without error       00%      1588         -
#10  Short offline       Completed without error       00%      1542         -
#11  Short offline       Completed without error       00%      1494         -
#12  Short offline       Completed without error       00%      1446         -
#13  Short offline       Completed without error       00%      1398         -
#14  Short offline       Completed without error       00%      1350         -
#15  Short offline       Completed without error       00%      1302         -
#16  Short offline       Completed without error       00%      1254         -
#17  Extended offline    Completed without error       00%      1252         -
#18  Short offline       Completed without error       00%      1206         -
#19  Short offline       Completed without error       00%      1158         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       256 (0x0100)
SCT Support Level:                   0
Device State:                        DST executing in background (3)
Current Temperature:                    37 Celsius
Power Cycle Min/Max Temperature:     27/38 Celsius
Lifetime    Min/Max Temperature:     25/40 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/60 Celsius
Min/Max Temperature Limit:           -40/70 Celsius
Temperature History Size (Index):    128 (37)

Index    Estimated Time   Temperature Celsius
  38    2019-01-05 11:41    35  ****************
...    ..( 29 skipped).    ..  ****************
  68    2019-01-05 12:11    35  ****************
  69    2019-01-05 12:12    36  *****************
  70    2019-01-05 12:13    35  ****************
...    ..( 27 skipped).    ..  ****************
  98    2019-01-05 12:41    35  ****************
  99    2019-01-05 12:42    36  *****************
...    ..( 14 skipped).    ..  *****************
114    2019-01-05 12:57    36  *****************
115    2019-01-05 12:58    35  ****************
116    2019-01-05 12:59    36  *****************
117    2019-01-05 13:00    35  ****************
118    2019-01-05 13:01    35  ****************
119    2019-01-05 13:02    36  *****************
...    ..( 40 skipped).    ..  *****************
  32    2019-01-05 13:43    36  *****************
  33    2019-01-05 13:44    37  ******************
...    ..(  2 skipped).    ..  ******************
  36    2019-01-05 13:47    37  ******************
  37    2019-01-05 13:48    35  ****************

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) supported [please try: '-l defects']

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2          188  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2          189  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS
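
The check of attribute IDs 5, 197, and 198 mentioned earlier can be scripted instead of read by eye. This is a minimal sketch that assumes the attribute table layout printed by `smartctl -x` above (ID, name, flags, value, worst, threshold, fail, raw value); the names `watched_raw_values` and `nonzero_watched` are hypothetical helpers, not smartmontools functions.

```python
WATCHED_IDS = {5, 197, 198}  # reallocated, pending, offline-uncorrectable

def watched_raw_values(smartctl_text, watched=WATCHED_IDS):
    """Map attribute ID -> (name, raw value string) for the watched IDs."""
    values = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows have at least 8 columns and start with a numeric ID.
        if len(fields) >= 8 and fields[0].isdigit() and int(fields[0]) in watched:
            values[int(fields[0])] = (fields[1], fields[7])
    return values

def nonzero_watched(values):
    """Keep only the watched attributes whose raw value is not '0'."""
    return {attr_id: v for attr_id, v in values.items() if v[1] != "0"}
```

For the da0 output above, `nonzero_watched` would come back empty, which matches the manual reading.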

The only values that differ from the other disks are:

0x0009 2 188 Transition from drive PhyRdy to drive PhyNRdy
0x000a 2 189 Device-to-host register FISes sent due to a COMRESET

I'm running a long test on the drive.

I am also running scheduled tests:

SMART:
Long: at 2:00 AM on the 8th and 22nd of every month, all days of the week
Short: at 1:00 AM every 2 days, every month, all days of the week

SCRUB:
HOME pool: threshold 10 days, at 2:00 AM, on days 1 and 15 of every month, all days of the week

Any input will be appreciated.

Thanks

Pliqui · Jan 5, 2019

Just to compare, here is the SMART output of disk da1:

Code:

[root@freenas /var/log]# smartctl -x /dev/da1
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     HGST HDN721010ALE604
Serial Number:    1DGP7V3Z
LU WWN Device Id: 5 000cca 26cc9a93b
Firmware Version: 83XN
User Capacity:    10,000,831,348,736 bytes [10.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jan  5 12:29:57 2019 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (   93) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (1165) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   016    -    0
  2 Throughput_Performance  --S---   134   134   054    -    96
  3 Spin_Up_Time            POS---   100   100   024    -    0
  4 Start_Stop_Count        -O--C-   100   100   000    -    2
  5 Reallocated_Sector_Ct   PO--CK   100   100   005    -    0
  7 Seek_Error_Rate         -O-R--   100   100   067    -    0
  8 Seek_Time_Performance   --S---   128   128   020    -    18
  9 Power_On_Hours          -O--C-   100   100   000    -    1914
 10 Spin_Retry_Count        -O--C-   100   100   060    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    2
 22 Unknown_Attribute       PO---K   100   100   025    -    100
192 Power-Off_Retract_Count -O--CK   100   100   000    -    81
193 Load_Cycle_Count        -O--C-   100   100   000    -    81
194 Temperature_Celsius     -O----   166   166   000    -    36 (Min/Max 25/41)
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
197 Current_Pending_Sector  -O---K   100   100   000    -    0
198 Offline_Uncorrectable   ---R--   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O-R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      1  Comprehensive SMART error log
0x03       GPL     R/O      1  Ext. Comprehensive SMART error log
0x04       GPL     R/O    256  Device Statistics log
0x04       SL      R/O    255  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x08       GPL     R/O      2  Power Conditions log
0x09           SL  R/W      1  Selective self-test log
0x0c       GPL     R/O   5501  Pending Defects log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x12       GPL     R/O      1  SATA NCQ Non-Data log
0x13       GPL     R/O      1  SATA NCQ Send and Receive log
0x15       GPL     R/W      1  Rebuild Assist log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x24       GPL     R/O    256  Current Device Internal Status Data log
0x25       GPL     R/O    256  Saved Device Internal Status Data log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      1903         -
# 2  Short offline       Completed without error       00%      1855         -
# 3  Short offline       Completed without error       00%      1807         -
# 4  Short offline       Completed without error       00%      1783         -
# 5  Short offline       Completed without error       00%      1735         -
# 6  Short offline       Completed without error       00%      1687         -
# 7  Short offline       Completed without error       00%      1638         -
# 8  Short offline       Completed without error       00%      1590         -
# 9  Extended offline    Completed without error       00%      1586         -
#10  Short offline       Completed without error       00%      1542         -
#11  Short offline       Completed without error       00%      1494         -
#12  Short offline       Completed without error       00%      1446         -
#13  Short offline       Completed without error       00%      1398         -
#14  Short offline       Completed without error       00%      1350         -
#15  Short offline       Completed without error       00%      1302         -
#16  Short offline       Completed without error       00%      1254         -
#17  Extended offline    Completed without error       00%      1250         -
#18  Short offline       Completed without error       00%      1206         -
#19  Short offline       Completed without error       00%      1158         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       256 (0x0100)
SCT Support Level:                   0
Device State:                        Active (0)
Current Temperature:                    36 Celsius
Power Cycle Min/Max Temperature:     28/39 Celsius
Lifetime    Min/Max Temperature:     25/41 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/60 Celsius
Min/Max Temperature Limit:           -40/70 Celsius
Temperature History Size (Index):    128 (82)

Index    Estimated Time   Temperature Celsius
  83    2019-01-05 10:22    36  *****************
 ...    ..(126 skipped).    ..  *****************
  82    2019-01-05 12:29    36  *****************

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) supported [please try: '-l defects']

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2           24  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2           25  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS

And the SMART output of disk da3:

Code:

[root@freenas /var/log]# smartctl -x /dev/da3
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     HGST HDN728080ALE604
Serial Number:    R6GRZE8Y
LU WWN Device Id: 5 000cca 263ca7263
Firmware Version: A4GNW91X
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jan  5 12:51:41 2019 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  101) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (1142) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   016    -    0
  2 Throughput_Performance  P-S---   133   133   054    -    108
  3 Spin_Up_Time            POS---   154   154   024    -    406 (Average 450)
  4 Start_Stop_Count        -O--C-   100   100   000    -    21
  5 Reallocated_Sector_Ct   PO--CK   100   100   005    -    0
  7 Seek_Error_Rate         PO-R--   100   100   067    -    0
  8 Seek_Time_Performance   P-S---   128   128   020    -    18
  9 Power_On_Hours          -O--C-   100   100   000    -    4789
 10 Spin_Retry_Count        PO--C-   100   100   060    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    21
 22 Unknown_Attribute       PO---K   100   100   025    -    100
192 Power-Off_Retract_Count -O--CK   100   100   000    -    216
193 Load_Cycle_Count        -O--C-   100   100   000    -    216
194 Temperature_Celsius     -O----   146   146   000    -    41 (Min/Max 25/49)
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
197 Current_Pending_Sector  -O---K   100   100   000    -    0
198 Offline_Uncorrectable   ---R--   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O-R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      1  Comprehensive SMART error log
0x03       GPL     R/O      1  Ext. Comprehensive SMART error log
0x04       GPL,SL  R/O      8  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x08       GPL     R/O      2  Power Conditions log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x12       GPL     R/O      1  SATA NCQ Non-Data log
0x15       GPL,SL  R/W      1  Rebuild Assist log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x24       GPL     R/O    256  Current Device Internal Status Data log
0x25       GPL     R/O    256  Saved Device Internal Status Data log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      4778         -
# 2  Short offline       Completed without error       00%      4730         -
# 3  Short offline       Completed without error       00%      4682         -
# 4  Short offline       Completed without error       00%      4658         -
# 5  Short offline       Completed without error       00%      4610         -
# 6  Short offline       Completed without error       00%      4562         -
# 7  Short offline       Completed without error       00%      4514         -
# 8  Short offline       Completed without error       00%      4466         -
# 9  Extended offline    Completed without error       00%      4461         -
#10  Short offline       Completed without error       00%      4418         -
#11  Short offline       Completed without error       00%      4370         -
#12  Short offline       Completed without error       00%      4322         -
#13  Short offline       Completed without error       00%      4274         -
#14  Short offline       Completed without error       00%      4226         -
#15  Short offline       Completed without error       00%      4178         -
#16  Short offline       Completed without error       00%      4130         -
#17  Extended offline    Completed without error       00%      4125         -
#18  Short offline       Completed without error       00%      4082         -
#19  Short offline       Completed without error       00%      4034         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       256 (0x0100)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    41 Celsius
Power Cycle Min/Max Temperature:     32/46 Celsius
Lifetime    Min/Max Temperature:     25/49 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/60 Celsius
Min/Max Temperature Limit:           -40/70 Celsius
Temperature History Size (Index):    128 (116)

Index    Estimated Time   Temperature Celsius
 117    2019-01-05 10:44    40  *********************
 ...    ..( 98 skipped).    ..  *********************
  88    2019-01-05 12:23    40  *********************
  89    2019-01-05 12:24    41  **********************
  90    2019-01-05 12:25    40  *********************
  91    2019-01-05 12:26    41  **********************
  92    2019-01-05 12:27    40  *********************
  93    2019-01-05 12:28    41  **********************
 ...    ..(  5 skipped).    ..  **********************
  99    2019-01-05 12:34    41  **********************
 100    2019-01-05 12:35    40  *********************
 101    2019-01-05 12:36    41  **********************
 ...    ..( 13 skipped).    ..  **********************
 115    2019-01-05 12:50    41  **********************
 116    2019-01-05 12:51    40  *********************

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 2) ==
0x01  0x008  4              21  ---  Lifetime Power-On Resets
0x01  0x018  6     51003204676  ---  Logical Sectors Written
0x01  0x020  6       275245429  ---  Number of Write Commands
0x01  0x028  6    153605950605  ---  Logical Sectors Read
0x01  0x030  6       772397534  ---  Number of Read Commands
0x01  0x038  6     17243195800  ---  Date and Time TimeStamp
0x03  =====  =               =  ===  == Rotating Media Statistics (rev 1) ==
0x03  0x008  4            4739  ---  Spindle Motor Power-on Hours
0x03  0x010  4            4739  ---  Head Flying Hours
0x03  0x018  4             216  ---  Head Load Events
0x03  0x020  4               0  ---  Number of Reallocated Logical Sectors
0x03  0x028  4          748796  ---  Read Recovery Attempts
0x03  0x030  4               0  ---  Number of Mechanical Start Failures
0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
0x04  0x010  4               0  ---  Resets Between Cmd Acceptance and Completion
0x05  =====  =               =  ===  == Temperature Statistics (rev 1) ==
0x05  0x008  1              41  ---  Current Temperature
0x05  0x010  1              40  N--  Average Short Term Temperature
0x05  0x018  1              37  N--  Average Long Term Temperature
0x05  0x020  1              49  ---  Highest Temperature
0x05  0x028  1              25  ---  Lowest Temperature
0x05  0x030  1              47  N--  Highest Average Short Term Temperature
0x05  0x038  1              25  N--  Lowest Average Short Term Temperature
0x05  0x040  1              43  N--  Highest Average Long Term Temperature
0x05  0x048  1              25  N--  Lowest Average Long Term Temperature
0x05  0x050  4               0  ---  Time in Over-Temperature
0x05  0x058  1              60  ---  Specified Maximum Operating Temperature
0x05  0x060  4               0  ---  Time in Under-Temperature
0x05  0x068  1               0  ---  Specified Minimum Operating Temperature
0x06  =====  =               =  ===  == Transport Statistics (rev 1) ==
0x06  0x008  4              26  ---  Number of Hardware Resets
0x06  0x010  4             169  ---  Number of ASR Events
0x06  0x018  4               0  ---  Number of Interface CRC Errors
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2           25  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2           26  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS

Chris Moore · Jan 5, 2019

You recently disassembled the system to change out the system board, so I would think this might be caused by a loose connection on the associated drive data cable.
You could try shutting down, re-seating the data cable, powering back up, and running a scrub. It is a little worrisome since you only have mirrors: if the other drive in the mirror set faults while this one is faulted, you lose the entire pool.
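
One way to sanity-check the cable theory is to look at the interface-level CRC counters the drive itself reports, since a marginal data cable often (not always) shows up there. Below is a minimal Python sketch, not a definitive tool: the function name `interface_crc_counts` is made up for illustration, and the sample lines are copied from one of the `smartctl -x` dumps posted above. It only reads text that smartctl has already printed.

```python
import re

# Lines copied from one of the smartctl -x dumps earlier in this thread.
SMART_TEXT = """\
199 UDMA_CRC_Error_Count    -O-R--   200   200   000    -    0
0x06  0x018  4               0  ---  Number of Interface CRC Errors
0x0001  2            0  Command failed due to ICRC error
0x000b  2            0  CRC errors within host-to-device FIS
"""

PATTERNS = [
    # SMART attribute table: ID, name, flags, value, worst, thresh, fail, raw
    re.compile(r"^\s*199\s+(?P<name>UDMA_CRC_Error_Count)\s+\S+\s+\d+\s+\d+\s+\d+"
               r"\s+\S+\s+(?P<count>\d+)"),
    # Device Statistics: page, offset, size, value, flags, description
    re.compile(r"^0x[0-9a-f]+\s+0x[0-9a-f]+\s+\d+\s+(?P<count>\d+)\s+\S+"
               r"\s+(?P<name>.*CRC.*)$"),
    # SATA Phy Event Counters: id, size, value, description
    re.compile(r"^0x[0-9a-f]+\s+\d+\s+(?P<count>\d+)\s+(?P<name>.*CRC.*)$"),
]


def interface_crc_counts(text):
    """Return {counter description: count} for CRC-related counters in text."""
    counts = {}
    for line in text.splitlines():
        for pattern in PATTERNS:
            m = pattern.match(line)
            # "Non-CRC errors ..." is a different counter, so leave it out.
            if m and "Non-CRC" not in m.group("name"):
                counts[m.group("name").strip()] = int(m.group("count"))
                break
    return counts


counts = interface_crc_counts(SMART_TEXT)
for name, count in counts.items():
    print(f"{count:>6}  {name}")
if any(counts.values()):
    print("CRC errors recorded: the cable or connector is a reasonable suspect")
else:
    print("no CRC errors recorded by this drive")
```

Zero counters do not clear the cable, backplane, or controller on their own; they only mean the drive did not log CRC failures on its side of the link.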

Chris Moore · Jan 5, 2019

Pliqui said:
SUPERMICRO MBD-X11SSL-CF Micro ATX

Did you flash that integrated SAS controller to the IT mode firmware?

Pliqui · Jan 5, 2019

Chris Moore said:
Did you flash that integrated SAS controller to the IT mode firmware?

Hi Chris, I did not flash it since it wasn't required: all disks are presented to FreeNAS without any RAID configuration. This was a new build and I copied the data over.

I will check the cables to validate. I did not do it right away since I had triggered the long test.

Thanks for your reply.

Cheers,

Chris Moore · Jan 5, 2019

Hard Drive Troubleshooting Guide (All Versions of FreeNAS)
https://forums.freenas.org/index.ph...bleshooting-guide-all-versions-of-freenas.17/

Pliqui · Jan 6, 2019

Hey sir,

Thank you very much.

That was one of the first things I read, and since I find the issue kind of strange, I decided to tap the collective knowledge.

The long test will finish soon and I will check the cables then. If the problem is not fixed, I will order a new drive and run some tests on the defective one before the RMA.

I follow the 3-2-1 rule for backups, so it is not a big issue.

As always, I really appreciate you taking the time to drop a line.

Cheers,

Pliqui · Jan 6, 2019

After checking the cables and making sure they were connected right (they were), the pool came ONLINE with minimal CKSUM errors:

Code:

[root@freenas ~]# zpool status -v
  pool: HOME
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 0 days 11:06:22 with 0 errors on Tue Jan  1 13:06:22 2019
config:

        NAME                                            STATE     READ WRITE CKSUM
        HOME                                            ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/df322087-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
            gptid/dfb5acdc-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/e03c826e-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
            gptid/e0c5c181-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            gptid/d80c60dc-d264-11e8-afaf-ac1f6b83f450  ONLINE       0     0     4
            gptid/d88326db-d264-11e8-afaf-ac1f6b83f450  ONLINE       0     0     0

errors: No known data errors
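
For anyone who wants to pick the non-zero counters out of a `zpool status` dump without reading every row, here is a small Python sketch. It is only an illustration: `devices_with_errors` is a hypothetical helper name, the sample text is the config table from the output above, and the counters are kept as the strings zpool printed rather than converted to numbers.

```python
# Config table copied from the zpool status output above.
ZPOOL_STATUS = """\
        NAME                                            STATE     READ WRITE CKSUM
        HOME                                            ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/df322087-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
            gptid/dfb5acdc-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            gptid/e03c826e-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
            gptid/e0c5c181-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
          mirror-2                                      ONLINE       0     0     0
            gptid/d80c60dc-d264-11e8-afaf-ac1f6b83f450  ONLINE       0     0     4
            gptid/d88326db-d264-11e8-afaf-ac1f6b83f450  ONLINE       0     0     0

errors: No known data errors
"""


def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for rows whose counters are not all '0'."""
    flagged = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True
            continue
        if not in_config:
            continue
        if not fields:
            break  # the blank line after the table ends the config section
        if len(fields) < 5:
            continue  # rows without counters carry nothing to report
        name, _state, read, write, cksum = fields[:5]
        if (read, write, cksum) != ("0", "0", "0"):
            flagged.append((name, read, write, cksum))
    return flagged


for name, read, write, cksum in devices_with_errors(ZPOOL_STATUS):
    print(f"{name}: READ={read} WRITE={write} CKSUM={cksum}")
```

On the table above this reports only the first disk of mirror-2, with 4 checksum errors.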

The SMART long test looks like it was aborted by the host:

Code:

[root@freenas ~]# smartctl -x /dev/da0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     HGST HDN721010ALE604
Serial Number:    1DGS64JZ
LU WWN Device Id: 5 000cca 26cca8b9b
Firmware Version: 83XN
User Capacity:    10,000,831,348,736 bytes [10.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Jan  6 18:17:55 2019 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (   93) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (1304) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   016    -    0
  2 Throughput_Performance  --S---   134   134   054    -    96
  3 Spin_Up_Time            POS---   100   100   024    -    0
  4 Start_Stop_Count        -O--C-   100   100   000    -    3
  5 Reallocated_Sector_Ct   PO--CK   100   100   005    -    0
  7 Seek_Error_Rate         -O-R--   100   100   067    -    0
  8 Seek_Time_Performance   --S---   128   128   020    -    18
  9 Power_On_Hours          -O--C-   100   100   000    -    1943
 10 Spin_Retry_Count        -O--C-   100   100   060    -    0
 12 Power_Cycle_Count       -O--CK   100   100   000    -    3
 22 Unknown_Attribute       PO---K   100   100   025    -    100
192 Power-Off_Retract_Count -O--CK   100   100   000    -    81
193 Load_Cycle_Count        -O--C-   100   100   000    -    81
194 Temperature_Celsius     -O----   176   176   000    -    34 (Min/Max 25/40)
196 Reallocated_Event_Count -O--CK   100   100   000    -    0
197 Current_Pending_Sector  -O---K   100   100   000    -    0
198 Offline_Uncorrectable   ---R--   100   100   000    -    0
199 UDMA_CRC_Error_Count    -O-R--   200   200   000    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      1  Comprehensive SMART error log
0x03       GPL     R/O      1  Ext. Comprehensive SMART error log
0x04       GPL     R/O    256  Device Statistics log
0x04       SL      R/O    255  Device Statistics log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x08       GPL     R/O      2  Power Conditions log
0x09           SL  R/W      1  Selective self-test log
0x0c       GPL     R/O   5501  Pending Defects log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x12       GPL     R/O      1  SATA NCQ Non-Data log
0x13       GPL     R/O      1  SATA NCQ Send and Receive log
0x15       GPL     R/W      1  Rebuild Assist log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x24       GPL     R/O    256  Current Device Internal Status Data log
0x25       GPL     R/O    256  Saved Device Internal Status Data log
0x30       GPL,SL  R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1935         -
# 2  Extended offline    Aborted by host               90%      1914         -
# 3  Short offline       Completed without error       00%      1855         -
# 4  Short offline       Completed without error       00%      1807         -
# 5  Short offline       Completed without error       00%      1783         -
# 6  Short offline       Completed without error       00%      1735         -
# 7  Short offline       Completed without error       00%      1687         -
# 8  Short offline       Completed without error       00%      1638         -
# 9  Short offline       Completed without error       00%      1590         -
#10  Extended offline    Completed without error       00%      1588         -
#11  Short offline       Completed without error       00%      1542         -
#12  Short offline       Completed without error       00%      1494         -
#13  Short offline       Completed without error       00%      1446         -
#14  Short offline       Completed without error       00%      1398         -
#15  Short offline       Completed without error       00%      1350         -
#16  Short offline       Completed without error       00%      1302         -
#17  Short offline       Completed without error       00%      1254         -
#18  Extended offline    Completed without error       00%      1252         -
#19  Short offline       Completed without error       00%      1206         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       256 (0x0100)
SCT Support Level:                   0
Device State:                        Active (0)
Current Temperature:                    34 Celsius
Power Cycle Min/Max Temperature:     34/34 Celsius
Lifetime    Min/Max Temperature:     25/40 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Temperature History Version:     2
Temperature Sampling Period:         1 minute
Temperature Logging Interval:        1 minute
Min/Max recommended Temperature:      0/60 Celsius
Min/Max Temperature Limit:           -40/70 Celsius
Temperature History Size (Index):    128 (63)

Index    Estimated Time   Temperature Celsius
  64    2019-01-06 16:10    35  ****************
 ...    ..(116 skipped).    ..  ****************
  53    2019-01-06 18:07    35  ****************
  54    2019-01-06 18:08    34  ***************
 ...    ..(  7 skipped).    ..  ***************
  62    2019-01-06 18:16    34  ***************
  63    2019-01-06 18:17    35  ****************

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) supported [please try: '-l defects']

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            2  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            3  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS

I will keep an eye on it, since another long test is due in a few days and the next scrub is on the 15th.
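
The "Aborted by host" entry is easy to miss in a long self-test log, so here is a rough Python sketch that lists every entry that did not finish cleanly. The helper name `unclean_tests` is hypothetical, and the sample rows are the first three entries from the log above. As far as I know, "Aborted by host" usually points to a reset, power cycle, or explicit abort from the host side rather than a media problem, but treat that as an assumption.

```python
import re

# First three rows of the extended self-test log posted above.
SELF_TEST_LOG = """\
# 1  Extended offline    Completed without error       00%      1935         -
# 2  Extended offline    Aborted by host               90%      1914         -
# 3  Short offline       Completed without error       00%      1855         -
"""

# Columns: Num, Test_Description, Status, Remaining, LifeTime(hours), LBA_of_first_error
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+(?P<test>.+?)\s{2,}(?P<status>.+?)\s{2,}"
    r"(?P<remaining>\d+)%\s+(?P<hours>\d+)\s+(?P<lba>\S+)\s*$"
)


def unclean_tests(log_text):
    """Return (num, test, status, hours) for tests that did not complete cleanly."""
    results = []
    for line in log_text.splitlines():
        m = ROW.match(line)
        if m and m.group("status") != "Completed without error":
            results.append(
                (int(m.group("num")), m.group("test"),
                 m.group("status"), int(m.group("hours")))
            )
    return results


for num, test, status, hours in unclean_tests(SELF_TEST_LOG):
    print(f"test #{num} ({test}) at {hours} power-on hours: {status}")
```

Here it flags test #2 at 1914 hours; the extended test that followed at 1935 hours completed without error.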

Pliqui · Jan 11, 2019

Not long after my previous post, the disk went into a degraded state again. Last night I got these errors again:

Code:

Device: /dev/da4 [SAT], Read SMART Self-Test Log Failed
Device: /dev/da4 [SAT], Read SMART Error Log Failed
Device: /dev/da4 [SAT], not capable of SMART self-check
Device: /dev/da3 [SAT], failed to read SMART Attribute Data
Device: /dev/da4 [SAT], failed to read SMART Attribute Data
Device: /dev/da3 [SAT], Read SMART Self-Test Log Failed
The volume HOME state is DEGRADED: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state.
Device: /dev/da3 [SAT], not capable of SMART self-check
Device: /dev/da3 [SAT], Read SMART Error Log Failed

and now more drives are in a degraded state:

Code:

[root@freenas /var/log]# zpool status -v HOME
  pool: HOME
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub in progress since Fri Jan 11 01:50:55 2019
        5.47T scanned at 324M/s, 5.19T issued at 307M/s, 12.5T total
        1.83M repaired, 41.69% done, 0 days 06:52:43 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        HOME                                            DEGRADED     0     0     0
          mirror-0                                      DEGRADED     0     0     0
            gptid/df322087-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
            gptid/dfb5acdc-7949-11e8-8cf1-ac1f6b83f450  DEGRADED     0     0 1.09K  too many errors  (repairing)
          mirror-1                                      DEGRADED     0     0     0
            gptid/e03c826e-7949-11e8-8cf1-ac1f6b83f450  DEGRADED     0     0   955  too many errors  (repairing)
            gptid/e0c5c181-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
          mirror-2                                      DEGRADED     0     0     0
            gptid/d80c60dc-d264-11e8-afaf-ac1f6b83f450  DEGRADED     0     0    99  too many errors  (repairing)
            gptid/d88326db-d264-11e8-afaf-ac1f6b83f450  ONLINE       0     0     0

errors: No known data errors
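
Given the earlier warning that losing both sides of a mirror loses the pool, it may help to count how many fully ONLINE disks each mirror still has. The Python sketch below does that for the table above. It is a simplification under stated assumptions: `healthy_leaves_per_mirror` is a made-up name, it assumes a flat layout where every disk row belongs to the most recent `mirror-N` row, and it counts only ONLINE as healthy even though a DEGRADED disk may still be serving data.

```python
# Config rows copied from the zpool status output above.
ZPOOL_CONFIG = """\
        HOME                                            DEGRADED     0     0     0
          mirror-0                                      DEGRADED     0     0     0
            gptid/df322087-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
            gptid/dfb5acdc-7949-11e8-8cf1-ac1f6b83f450  DEGRADED     0     0 1.09K  too many errors  (repairing)
          mirror-1                                      DEGRADED     0     0     0
            gptid/e03c826e-7949-11e8-8cf1-ac1f6b83f450  DEGRADED     0     0   955  too many errors  (repairing)
            gptid/e0c5c181-7949-11e8-8cf1-ac1f6b83f450  ONLINE       0     0     0
          mirror-2                                      DEGRADED     0     0     0
            gptid/d80c60dc-d264-11e8-afaf-ac1f6b83f450  DEGRADED     0     0    99  too many errors  (repairing)
            gptid/d88326db-d264-11e8-afaf-ac1f6b83f450  ONLINE       0     0     0
"""


def healthy_leaves_per_mirror(config_text):
    """Map each mirror vdev name to the number of its disks in state ONLINE."""
    healthy = {}
    current = None
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        name, state = fields[0], fields[1]
        if name.startswith("mirror-"):
            current = name
            healthy[current] = 0
        elif current is not None and state == "ONLINE":
            healthy[current] += 1
    return healthy


for mirror, count in healthy_leaves_per_mirror(ZPOOL_CONFIG).items():
    note = "no redundancy margin left" if count < 2 else "ok"
    print(f"{mirror}: {count} ONLINE disk(s), {note}")
```

For this pool every mirror is down to a single ONLINE disk, and the affected disk is in a different position in each mirror.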

I rebooted the server at 1:30 am and started a scrub around 2:00 am, but there is nothing about it in the messages log:

Code:

[root@freenas /var/log]# cat messages
Jan 11 01:31:50 freenas syslog-ng[1764]: syslog-ng starting up; version='3.7.3'
Jan 11 01:31:50 freenas Waiting (max 60 seconds) for system process `vnlru' to stop... done
Jan 11 01:31:50 freenas Waiting (max 60 seconds) for system process `bufdaemon' to stop... done
Jan 11 01:31:50 freenas Waiting (max 60 seconds) for system process `syncer' to stop...
Jan 11 01:31:50 freenas Syncing disks, vnodes remaining... 0 0 0 0 0 0 done
Jan 11 01:31:50 freenas All buffers synced.
Jan 11 01:31:50 freenas GEOM_ELI: Device mirror/swap0.eli destroyed.
Jan 11 01:31:50 freenas GEOM_ELI: Detached mirror/swap0.eli on last close.
Jan 11 01:31:50 freenas GEOM_ELI: Device mirror/swap1.eli destroyed.
Jan 11 01:31:50 freenas GEOM_ELI: Detached mirror/swap1.eli on last close.
Jan 11 01:31:50 freenas GEOM_ELI: Device mirror/swap2.eli destroyed.
Jan 11 01:31:50 freenas GEOM_ELI: Detached mirror/swap2.eli on last close.
Jan 11 01:31:50 freenas Uptime: 4d7h16m48s
Jan 11 01:31:50 freenas GEOM_MIRROR: Device swap2: provider destroyed.
Jan 11 01:31:50 freenas GEOM_MIRROR: Device swap2 destroyed.
Jan 11 01:31:50 freenas GEOM_MIRROR: Device swap1: provider destroyed.
Jan 11 01:31:50 freenas GEOM_MIRROR: Device swap1 destroyed.
Jan 11 01:31:50 freenas GEOM_MIRROR: Device swap0: provider destroyed.
Jan 11 01:31:50 freenas GEOM_MIRROR: Device swap0 destroyed.
Jan 11 01:31:50 freenas (da3:mpr0:0:7:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Jan 11 01:31:50 freenas (da3:mpr0:0:7:0): CAM status: SCSI Status Error
Jan 11 01:31:50 freenas (da3:mpr0:0:7:0): SCSI status: Check Condition
Jan 11 01:31:50 freenas (da3:mpr0:0:7:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Jan 11 01:31:50 freenas (da3:mpr0:0:7:0): Error 6, Retries exhausted
Jan 11 01:31:50 freenas (da3:mpr0:0:7:0): Synchronize cache failed
Jan 11 01:31:50 freenas (da4:mpr0:0:8:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Jan 11 01:31:50 freenas (da4:mpr0:0:8:0): CAM status: SCSI Status Error
Jan 11 01:31:50 freenas (da4:mpr0:0:8:0): SCSI status: Check Condition
Jan 11 01:31:50 freenas (da4:mpr0:0:8:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Jan 11 01:31:50 freenas (da4:mpr0:0:8:0): Error 6, Retries exhausted
Jan 11 01:31:50 freenas (da4:mpr0:0:8:0): Synchronize cache failed
Jan 11 01:31:50 freenas Copyright (c) 1992-2017 The FreeBSD Project.
Jan 11 01:31:50 freenas Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Jan 11 01:31:50 freenas         The Regents of the University of California. All rights reserved.
Jan 11 01:31:50 freenas FreeBSD is a registered trademark of The FreeBSD Foundation.
Jan 11 01:31:50 freenas FreeBSD 11.1-STABLE #0 r321665+6bdfda58b62(HEAD): Thu Dec 20 14:27:36 EST 2018
Jan 11 01:31:50 freenas root@mp19.lab.ixsystems.com:/freenas-releng-final/freenas/_BE/objs/freenas-releng-final/freenas/_BE/os/sys/FreeNAS.amd64 amd64
Jan 11 01:31:50 freenas FreeBSD clang version 5.0.0 (tags/RELEASE_500/final 312559) (based on LLVM 5.0.0svn)
Jan 11 01:31:50 freenas VT(efifb): resolution 1024x768
Jan 11 01:31:50 freenas CPU: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz (3504.17-MHz K8-class CPU)
Jan 11 01:31:50 freenas Origin="GenuineIntel"  Id=0x906e9  Family=0x6  Model=0x9e  Stepping=9
Jan 11 01:31:50 freenas Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Jan 11 01:31:50 freenas Features2=0x7ffafbff<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,TSCDLT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
Jan 11 01:31:50 freenas AMD Features=0x2c100800<SYSCALL,NX,Page1GB,RDTSCP,LM>
Jan 11 01:31:50 freenas AMD Features2=0x121<LAHF,ABM,Prefetch>
Jan 11 01:31:50 freenas Structured Extended Features=0x29c6fbf<FSGSBASE,TSCADJ,SGX,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,NFPUSG,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PROCTRACE>
Jan 11 01:31:50 freenas XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
Jan 11 01:31:50 freenas VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
Jan 11 01:31:50 freenas TSC: P-state invariant, performance statistics
Jan 11 01:31:50 freenas real memory  = 70728548352 (67452 MB)
Jan 11 01:31:50 freenas avail memory = 66528944128 (63446 MB)
Jan 11 01:31:50 freenas Event timer "LAPIC" quality 600
Jan 11 01:31:50 freenas ACPI APIC Table: < >
Jan 11 01:31:50 freenas FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
Jan 11 01:31:50 freenas FreeBSD/SMP: 1 package(s) x 4 core(s) x 2 hardware threads
Jan 11 01:31:50 freenas WARNING: VIMAGE (virtualized network stack) is a highly experimental feature.
Jan 11 01:31:50 freenas ioapic0 <Version 2.0> irqs 0-23 on motherboard
Jan 11 01:31:50 freenas SMP: AP CPU #1 Launched!
Jan 11 01:31:50 freenas SMP: AP CPU #2 Launched!
Jan 11 01:31:50 freenas SMP: AP CPU #3 Launched!
Jan 11 01:31:50 freenas SMP: AP CPU #4 Launched!
Jan 11 01:31:50 freenas SMP: AP CPU #5 Launched!
Jan 11 01:31:50 freenas SMP: AP CPU #6 Launched!
Jan 11 01:31:50 freenas SMP: AP CPU #7 Launched!
Jan 11 01:31:50 freenas Timecounter "TSC-low" frequency 1752086686 Hz quality 1000
Jan 11 01:31:50 freenas random: entropy device external interface
Jan 11 01:31:50 freenas kbd1 at kbdmux0
Jan 11 01:31:50 freenas module_register_init: MOD_LOAD (vesa, 0xffffffff80fc84c0, 0) error 19
Jan 11 01:31:50 freenas random: registering fast source Intel Secure Key RNG
Jan 11 01:31:50 freenas random: fast provider: "Intel Secure Key RNG"
Jan 11 01:31:50 freenas nexus0
Jan 11 01:31:50 freenas cryptosoft0: <software crypto> on motherboard
Jan 11 01:31:50 freenas aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM> on motherboard
Jan 11 01:31:50 freenas padlock0: No ACE support.
Jan 11 01:31:50 freenas acpi0: <SUPERM SUPERM> on motherboard
Jan 11 01:31:50 freenas acpi0: Power Button (fixed)
Jan 11 01:31:50 freenas unknown: memory range not supported
Jan 11 01:31:50 freenas cpu0: <ACPI CPU> on acpi0
Jan 11 01:31:50 freenas cpu1: <ACPI CPU> on acpi0
Jan 11 01:31:50 freenas cpu2: <ACPI CPU> on acpi0
Jan 11 01:31:50 freenas cpu3: <ACPI CPU> on acpi0
Jan 11 01:31:50 freenas cpu4: <ACPI CPU> on acpi0
Jan 11 01:31:50 freenas cpu5: <ACPI CPU> on acpi0
Jan 11 01:31:50 freenas cpu6: <ACPI CPU> on acpi0
Jan 11 01:31:50 freenas cpu7: <ACPI CPU> on acpi0
Jan 11 01:31:50 freenas hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Jan 11 01:31:50 freenas Timecounter "HPET" frequency 24000000 Hz quality 950
Jan 11 01:31:50 freenas Event timer "HPET" frequency 24000000 Hz quality 550
Jan 11 01:31:50 freenas atrtc0: <AT realtime clock> port 0x70-0x77 irq 8 on acpi0
Jan 11 01:31:50 freenas atrtc0: Warning: Couldn't map I/O.
Jan 11 01:31:50 freenas atrtc0: registered as a time-of-day clock, resolution 1.000000s
Jan 11 01:31:50 freenas Event timer "RTC" frequency 32768 Hz quality 0
Jan 11 01:31:50 freenas attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Jan 11 01:31:50 freenas Timecounter "i8254" frequency 1193182 Hz quality 0
Jan 11 01:31:50 freenas Event timer "i8254" frequency 1193182 Hz quality 100
Jan 11 01:31:50 freenas Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
Jan 11 01:31:50 freenas acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1808-0x180b on acpi0
Jan 11 01:31:50 freenas pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
Jan 11 01:31:50 freenas pci0: <ACPI PCI bus> on pcib0
Jan 11 01:31:50 freenas pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
Jan 11 01:31:50 freenas pci1: <ACPI PCI bus> on pcib1
Jan 11 01:31:50 freenas pcib2: <ACPI PCI-PCI bridge> irq 16 at device 1.1 on pci0
Jan 11 01:31:50 freenas pci2: <ACPI PCI bus> on pcib2
Jan 11 01:31:50 freenas mpr0: <Avago Technologies (LSI) SAS3008> port 0xe000-0xe0ff mem 0xdf240000-0xdf24ffff,0xdf200000-0xdf23ffff irq 17 at device 0.0 on pci2
Jan 11 01:31:50 freenas mpr0: Firmware: 10.00.00.00, Driver: 18.03.00.00-fbsd
Jan 11 01:31:50 freenas mpr0: IOCCapabilities: 6985c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR,MSIXIndex,FastPath,RDPQArray>
Jan 11 01:31:50 freenas pci0: <old, non-VGA display device> at device 19.0 (no driver attached)
Jan 11 01:31:50 freenas xhci0: <Intel Sunrise Point USB 3.0 controller> mem 0xdf500000-0xdf50ffff irq 16 at device 20.0 on pci0
Jan 11 01:31:50 freenas xhci0: 32 bytes context size, 64-bit DMA
Jan 11 01:31:50 freenas usbus0 on xhci0
Jan 11 01:31:50 freenas usbus0: 5.0Gbps Super Speed USB v3.0
Jan 11 01:31:50 freenas pci0: <simple comms> at device 22.0 (no driver attached)
Jan 11 01:31:50 freenas ahci0: <Intel Sunrise Point AHCI SATA controller> port 0xf050-0xf057,0xf040-0xf043,0xf020-0xf03f mem 0xdf510000-0xdf511fff,0xdf51e000-0xdf51e0ff,0xdf51d000-0xdf51d7ff irq 16 at device 23.0 on pci0
Jan 11 01:31:50 freenas ahci0: AHCI v1.31 with 6 6Gbps ports, Port Multiplier not supported
Jan 11 01:31:50 freenas ahcich0: <AHCI channel> at channel 0 on ahci0
Jan 11 01:31:50 freenas ahcich1: <AHCI channel> at channel 1 on ahci0
Jan 11 01:31:50 freenas ahcich2: <AHCI channel> at channel 2 on ahci0
Jan 11 01:31:50 freenas ahcich3: <AHCI channel> at channel 3 on ahci0
Jan 11 01:31:50 freenas ahcich4: <AHCI channel> at channel 4 on ahci0
Jan 11 01:31:50 freenas ahcich5: <AHCI channel> at channel 5 on ahci0
Jan 11 01:31:50 freenas pcib3: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
Jan 11 01:31:50 freenas pci3: <ACPI PCI bus> on pcib3
Jan 11 01:31:50 freenas igb0: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xd000-0xd01f mem 0xdf400000-0xdf47ffff,0xdf480000-0xdf483fff irq 16 at device 0.0 on pci3
Jan 11 01:31:50 freenas igb0: Using MSIX interrupts with 5 vectors
Jan 11 01:31:50 freenas igb0: Ethernet address: ac:1f:6b:83:f4:50
Jan 11 01:31:50 freenas igb0: Bound queue 0 to cpu 0
Jan 11 01:31:50 freenas igb0: Bound queue 1 to cpu 1
Jan 11 01:31:50 freenas igb0: Bound queue 2 to cpu 2
Jan 11 01:31:50 freenas igb0: Bound queue 3 to cpu 3
Jan 11 01:31:50 freenas pcib4: <ACPI PCI-PCI bridge> irq 17 at device 28.5 on pci0
Jan 11 01:31:50 freenas pci4: <ACPI PCI bus> on pcib4
Jan 11 01:31:50 freenas igb1: <Intel(R) PRO/1000 Network Connection, Version - 2.5.3-k> port 0xc000-0xc01f mem 0xdf300000-0xdf37ffff,0xdf380000-0xdf383fff irq 17 at device 0.0 on pci4
Jan 11 01:31:50 freenas igb1: Using MSIX interrupts with 5 vectors
Jan 11 01:31:50 freenas igb1: Ethernet address: ac:1f:6b:83:f4:51
Jan 11 01:31:50 freenas igb1: Bound queue 0 to cpu 4
Jan 11 01:31:50 freenas igb1: Bound queue 1 to cpu 5
Jan 11 01:31:50 freenas igb1: Bound queue 2 to cpu 6
Jan 11 01:31:50 freenas igb1: Bound queue 3 to cpu 7
Jan 11 01:31:50 freenas pcib5: <ACPI PCI-PCI bridge> irq 18 at device 28.6 on pci0
Jan 11 01:31:50 freenas pci5: <ACPI PCI bus> on pcib5
Jan 11 01:31:50 freenas pcib6: <ACPI PCI-PCI bridge> at device 0.0 on pci5
Jan 11 01:31:50 freenas pci6: <ACPI PCI bus> on pcib6
Jan 11 01:31:50 freenas vgapci0: <VGA-compatible display> port 0xb000-0xb07f mem 0xde000000-0xdeffffff,0xdf000000-0xdf01ffff irq 18 at device 0.0 on pci6
Jan 11 01:31:50 freenas vgapci0: Boot video device
Jan 11 01:31:50 freenas pcib7: <ACPI PCI-PCI bridge> irq 16 at device 29.0 on pci0
Jan 11 01:31:50 freenas pci7: <ACPI PCI bus> on pcib7
Jan 11 01:31:50 freenas isab0: <PCI-ISA bridge> at device 31.0 on pci0
Jan 11 01:31:50 freenas isa0: <ISA bus> on isab0
Jan 11 01:31:50 freenas pci0: <memory> at device 31.2 (no driver attached)
Jan 11 01:31:50 freenas acpi_button0: <Sleep Button> on acpi0
Jan 11 01:31:50 freenas acpi_button1: <Power Button> on acpi0
Jan 11 01:31:50 freenas acpi_tz0: <Thermal Zone> on acpi0
Jan 11 01:31:50 freenas acpi_tz1: <Thermal Zone> on acpi0
Jan 11 01:31:50 freenas uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
Jan 11 01:31:50 freenas uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
Jan 11 01:31:50 freenas orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xce000-0xcefff on isa0
Jan 11 01:31:50 freenas coretemp0: <CPU On-Die Thermal Sensors> on cpu0
Jan 11 01:31:50 freenas est0: <Enhanced SpeedStep Frequency Control> on cpu0
Jan 11 01:31:50 freenas coretemp1: <CPU On-Die Thermal Sensors> on cpu1
Jan 11 01:31:50 freenas est1: <Enhanced SpeedStep Frequency Control> on cpu1
Jan 11 01:31:50 freenas coretemp2: <CPU On-Die Thermal Sensors> on cpu2
Jan 11 01:31:50 freenas est2: <Enhanced SpeedStep Frequency Control> on cpu2
Jan 11 01:31:50 freenas coretemp3: <CPU On-Die Thermal Sensors> on cpu3
Jan 11 01:31:50 freenas est3: <Enhanced SpeedStep Frequency Control> on cpu3
Jan 11 01:31:50 freenas coretemp4: <CPU On-Die Thermal Sensors> on cpu4
Jan 11 01:31:50 freenas est4: <Enhanced SpeedStep Frequency Control> on cpu4
Jan 11 01:31:50 freenas coretemp5: <CPU On-Die Thermal Sensors> on cpu5
Jan 11 01:31:50 freenas est5: <Enhanced SpeedStep Frequency Control> on cpu5
Jan 11 01:31:50 freenas coretemp6: <CPU On-Die Thermal Sensors> on cpu6
Jan 11 01:31:50 freenas est6: <Enhanced SpeedStep Frequency Control> on cpu6
Jan 11 01:31:50 freenas coretemp7: <CPU On-Die Thermal Sensors> on cpu7
Jan 11 01:31:50 freenas est7: <Enhanced SpeedStep Frequency Control> on cpu7
Jan 11 01:31:50 freenas ZFS filesystem version: 5
Jan 11 01:31:50 freenas ZFS storage pool version: features support (5000)
Jan 11 01:31:50 freenas Timecounters tick every 1.000 msec
Jan 11 01:31:50 freenas freenas_sysctl: adding account.
Jan 11 01:31:50 freenas freenas_sysctl: adding directoryservice.
Jan 11 01:31:50 freenas freenas_sysctl: adding middlewared.
Jan 11 01:31:50 freenas freenas_sysctl: adding network.
Jan 11 01:31:50 freenas freenas_sysctl: adding services.
Jan 11 01:31:50 freenas ipfw2 (+ipv6) initialized, divert enabled, nat enabled, default to accept, logging disabled
Jan 11 01:31:50 freenas ugen0.1: <0x8086 XHCI root HUB> at usbus0
Jan 11 01:31:50 freenas uhub0: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
Jan 11 01:31:50 freenas uhub0: 22 ports with 22 removable, self powered
Jan 11 01:31:50 freenas ugen0.2: <vendor 0x0557 product 0x7000> at usbus0
Jan 11 01:31:50 freenas uhub1 on uhub0
Jan 11 01:31:50 freenas uhub1: <vendor 0x0557 product 0x7000, class 9/0, rev 2.00/0.00, addr 1> on usbus0
Jan 11 01:31:50 freenas mpr0: SAS Address for SATA device = 72d507575747c8c
Jan 11 01:31:50 freenas mpr0: SAS Address from SAS device page0 = 4433221102000000
Jan 11 01:31:50 freenas mpr0: SAS Address from SATA device = 72d507575747c8c
Jan 11 01:31:50 freenas mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x0009> enclosureHandle<0x0001> slot 2
Jan 11 01:31:50 freenas mpr0: At enclosure level 0 and connector name (    )
Jan 11 01:31:50 freenas uhub1: 4 ports with 3 removable, self powered
Jan 11 01:31:50 freenas mpr0: SAS Address for SATA device = 84f397575747c89
Jan 11 01:31:50 freenas mpr0: SAS Address from SAS device page0 = 4433221103000000
Jan 11 01:31:50 freenas mpr0: SAS Address from SATA device = 84f397575747c89
Jan 11 01:31:50 freenas mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000a> enclosureHandle<0x0001> slot 3
Jan 11 01:31:50 freenas mpr0: At enclosure level 0 and connector name (    )
Jan 11 01:31:50 freenas mpr0: SAS Address for SATA device = d3e5a739a7a846c
Jan 11 01:31:50 freenas mpr0: SAS Address from SAS device page0 = 4433221104000000
Jan 11 01:31:50 freenas mpr0: SAS Address from SATA device = d3e5a739a7a846c
Jan 11 01:31:50 freenas mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000b> enclosureHandle<0x0001> slot 4
Jan 11 01:31:50 freenas mpr0: At enclosure level 0 and connector name (    )
Jan 11 01:31:50 freenas mpr0: SAS Address for SATA device = 323e3e749666838b
Jan 11 01:31:50 freenas mpr0: SAS Address from SAS device page0 = 4433221105000000
Jan 11 01:31:50 freenas mpr0: SAS Address from SATA device = 323e3e749666838b
Jan 11 01:31:50 freenas mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000c> enclosureHandle<0x0001> slot 5
Jan 11 01:31:50 freenas mpr0: At enclosure level 0 and connector name (    )
Jan 11 01:31:50 freenas ugen0.3: <vendor 0x0557 product 0x2419> at usbus0
Jan 11 01:31:50 freenas ukbd0 on uhub1
Jan 11 01:31:50 freenas ukbd0: <vendor 0x0557 product 0x2419, class 0/0, rev 1.10/1.00, addr 2> on usbus0
Jan 11 01:31:50 freenas kbd2 at ukbd0
Jan 11 01:31:50 freenas mpr0: SAS Address for SATA device = 205247749666838b
Jan 11 01:31:50 freenas mpr0: SAS Address from SAS device page0 = 4433221106000000
Jan 11 01:31:50 freenas mpr0: SAS Address from SATA device = 205247749666838b
Jan 11 01:31:50 freenas mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000d> enclosureHandle<0x0001> slot 6
Jan 11 01:31:50 freenas mpr0: At enclosure level 0 and connector name (    )
Jan 11 01:31:50 freenas mpr0: SAS Address for SATA device = a2c5a749666838c
Jan 11 01:31:50 freenas mpr0: SAS Address from SAS device page0 = 4433221107000000
Jan 11 01:31:50 freenas mpr0: SAS Address from SATA device = a2c5a749666838c
Jan 11 01:31:50 freenas mpr0: Found device <881<SataDev,Direct>,End Device> <6.0Gbps> handle<0x000e> enclosureHandle<0x0001> slot 7
Jan 11 01:31:50 freenas mpr0: At enclosure level 0 and connector name (    )
Jan 11 01:31:50 freenas ada0 at ahcich0 bus 0 scbus1 target 0 lun 0
Jan 11 01:31:50 freenas ada0: da1 at mpr0 bus 0 scbus0 target 5 lun 0
Jan 11 01:31:50 freenas da0 at mpr0 bus 0 scbus0 target 4 lun 0
Jan 11 01:31:50 freenas syslog-ng[1764]: Error processing log message: <INTEL SSDSC2KI128G8 LHF001D> ACS-3 ATA SATA 3.x device
Jan 11 01:31:50 freenas da1: <ATA HGST HDN721010AL 83XN> Fixed Direct Access SPC-4 SCSI device
Jan 11 01:31:50 freenas da1: Serial Number 1DGP7V3Z
Jan 11 01:31:50 freenas da1: 600.000MB/s transfers
Jan 11 01:31:50 freenas da1: Command Queueing enabled
Jan 11 01:31:50 freenas da1: 9537536MB (19532873728 512 byte sectors)
Jan 11 01:31:50 freenas da3 at mpr0 bus 0 scbus0 target 7 lun 0
Jan 11 01:31:50 freenas da3: ada0: Serial Number PHLA805006TV128BGN
Jan 11 01:31:50 freenas ada0: 600.000MB/s transfers<ATA HGST HDN728080AL W91X> Fixed Direct Access SPC-4 SCSI device
Jan 11 01:31:50 freenas da3: Serial Number R6GRZE8Y
Jan 11 01:31:50 freenas da3: 600.000MB/s transfers
Jan 11 01:31:50 freenas da3: Command Queueing enabled
Jan 11 01:31:50 freenas da3: 7630885MB (15628053168 512 byte sectors)
Jan 11 01:31:50 freenas da5 at mpr0 bus 0 scbus0 target 9 lun 0
Jan 11 01:31:50 freenas da5: <ATA HGST HDN728080AL W91X> Fixed Direct Access SPC-4 SCSI device
Jan 11 01:31:50 freenas da5: Serial Number R6GS23TY
Jan 11 01:31:50 freenas da5: 600.000MB/s transfers
Jan 11 01:31:50 freenas da5: Command Queueing enabled
Jan 11 01:31:50 freenas da5: 7630885MB (15628053168 512 byte sectors)
Jan 11 01:31:50 freenas da0: <ATA HGST HDN721010AL 83XN> Fixed Direct Access SPC-4 SCSI device
Jan 11 01:31:50 freenas da0: Serial Number 1DGS64JZ
Jan 11 01:31:50 freenas da0: 600.000MB/s transfers
Jan 11 01:31:50 freenas da0: Command Queueing enabled
Jan 11 01:31:50 freenas da0: 9537536MB (19532873728 512 byte sectors)
Jan 11 01:31:50 freenas da2 at mpr0 bus 0 scbus0 target 6 lun 0
Jan 11 01:31:50 freenas da2: <ATA HGST HDN728080AL W91X> Fixed Direct Access SPC-4 SCSI device
Jan 11 01:31:50 freenas da2: Serial Number VJH35ETX
Jan 11 01:31:50 freenas da2: 600.000MB/s transfers (
Jan 11 01:31:50 freenas SATA 3.x, da2: Command Queueing enabled
Jan 11 01:31:50 freenas da2: 7630885MB (15628053168 512 byte sectors)
Jan 11 01:31:50 freenas da4 at mpr0 bus 0 scbus0 target 8 lun 0
Jan 11 01:31:50 freenas da4: <ATA HGST HDN728080AL W91X> Fixed Direct Access SPC-4 SCSI device
Jan 11 01:31:50 freenas da4: Serial Number R6GRHYAY
Jan 11 01:31:50 freenas da4: 600.000MB/s transfers
Jan 11 01:31:50 freenas da4: Command Queueing enabled
Jan 11 01:31:50 freenas da4: 7630885MB (15628053168 512 byte sectors)
Jan 11 01:31:50 freenas UDMA6, PIO 8192bytes)
Jan 11 01:31:50 freenas ada0: Command Queueing enabled
Jan 11 01:31:50 freenas ada0: 122104MB (250069680 512 byte sectors)
Jan 11 01:31:50 freenas ada1 at ahcich1 bus 0 scbus2 target 0 lun 0
Jan 11 01:31:50 freenas ada1: <MTFDDAK128MAY-1AH1ZABHA M504> ACS-2 ATA SATA 3.x device
Jan 11 01:31:50 freenas ada1: Serial Number 14230C40CB23
Jan 11 01:31:50 freenas ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
Jan 11 01:31:50 freenas ada1: Command Queueing enabled
Jan 11 01:31:50 freenas ada1: 122104MB (250069680 512 byte sectors)
Jan 11 01:31:50 freenas random: unblocking device.
Jan 11 01:31:50 freenas Trying to mount root from zfs:freenas-boot/ROOT/11.1-U6.3 []...
Jan 11 01:31:50 freenas kernel: igb0: link state changed to UP
Jan 11 01:31:50 freenas kernel: igb0: link state changed to UP
Jan 11 01:31:50 freenas ipmi0: <IPMI System Interface> port 0xca2,0xca3 on acpi0
Jan 11 01:31:50 freenas ipmi0: KCS mode found at io 0xca2 on acpi
Jan 11 01:31:50 freenas ipmi0: IPMI device rev. 1, firmware rev. 1.39, version 2.0
Jan 11 01:31:50 freenas ipmi0: Number of channels 2
Jan 11 01:31:50 freenas ipmi0: Attached watchdog
Jan 11 01:31:50 freenas GEOM_RAID5: Module loaded, version 1.3.20140711.62 (rev f91e28e40bf7)
Jan 11 01:31:50 freenas GEOM_MIRROR: Device mirror/swap0 launched (2/2).
Jan 11 01:31:50 freenas GEOM_MIRROR: Device mirror/swap1 launched (2/2).
Jan 11 01:31:50 freenas GEOM_MIRROR: Device mirror/swap2 launched (2/2).
Jan 11 01:31:50 freenas GEOM_ELI: Device mirror/swap0.eli created.
Jan 11 01:31:50 freenas GEOM_ELI: Encryption: AES-XTS 128
Jan 11 01:31:50 freenas GEOM_ELI:     Crypto: hardware
Jan 11 01:31:50 freenas GEOM_ELI: Device mirror/swap1.eli created.
Jan 11 01:31:50 freenas GEOM_ELI: Encryption: AES-XTS 128
Jan 11 01:31:50 freenas GEOM_ELI:     Crypto: hardware
Jan 11 01:31:50 freenas GEOM_ELI: Device mirror/swap2.eli created.
Jan 11 01:31:50 freenas GEOM_ELI: Encryption: AES-XTS 128
Jan 11 01:31:50 freenas GEOM_ELI:     Crypto: hardware
Jan 11 01:31:50 freenas pmc: Unknown Intel CPU.
Jan 11 01:31:50 freenas hwpmc: SOFT/16/64/0x67<INT,USR,SYS,REA,WRI>
Jan 11 01:31:50 freenas kernel: igb0: link state changed to DOWN
Jan 11 01:31:50 freenas kernel: igb0: link state changed to DOWN
Jan 11 01:31:50 freenas ums0 on uhub1
Jan 11 01:31:50 freenas ums0: <vendor 0x0557 product 0x2419, class 0/0, rev 1.10/1.00, addr 2> on usbus0
Jan 11 01:31:50 freenas ums0: 3 buttons and [Z] coordinates ID=0
Jan 11 01:31:50 freenas kernel: igb0: link state changed to UP
Jan 11 01:31:50 freenas kernel: igb0: link state changed to UP
Jan 11 06:31:50 freenas python3.6: dnssd_clientstub ConnectToServer: connect()-> No of tries: 2
Jan 11 06:31:50 freenas python3.6: dnssd_clientstub ConnectToServer: connect()-> No of tries: 2
Jan 11 06:31:50 freenas python3.6: dnssd_clientstub ConnectToServer: connect()-> No of tries: 2
Jan 11 06:31:50 freenas python3.6: dnssd_clientstub ConnectToServer: connect()-> No of tries: 2
Jan 11 06:31:51 freenas python3.6: dnssd_clientstub ConnectToServer: connect()-> No of tries: 3
Jan 11 06:31:51 freenas python3.6: dnssd_clientstub ConnectToServer: connect()-> No of tries: 3
Jan 11 06:31:51 freenas python3.6: dnssd_clientstub ConnectToServer: connect()-> No of tries: 3
Jan 11 06:31:51 freenas python3.6: dnssd_clientstub ConnectToServer: connect()-> No of tries: 3
Jan 11 01:31:52 freenas /adtool: [common.pipesubr:65] Popen()ing: /usr/bin/kinit --renewable --password-file=/tmp/tmp255vs5u1 srvc_freenas@ABOCOR.COM
Jan 11 06:31:52 freenas python3.6: dnssd_clientstub ConnectToServer: connect() failed path:/var/run/mdnsd Socket:37 Err:-1 Errno:2 No such file or directory
Jan 11 06:31:52 freenas python3.6: dnssd_clientstub ConnectToServer: connect() failed path:/var/run/mdnsd Socket:32 Err:-1 Errno:2 No such file or directory
Jan 11 06:31:52 freenas python3.6: dnssd_clientstub ConnectToServer: connect() failed path:/var/run/mdnsd Socket:23 Err:-1 Errno:2 No such file or directory
Jan 11 06:31:52 freenas python3.6: dnssd_clientstub ConnectToServer: connect() failed path:/var/run/mdnsd Socket:24 Err:-1 Errno:2 No such file or directory
Jan 11 01:31:53 freenas ntpd[2698]: ntpd 4.2.8p10-a (1): Starting
Jan 11 01:32:00 freenas root: /etc/rc: WARNING: failed precmd routine for minio
Jan 11 01:32:07 freenas bridge0: Ethernet address: 02:b6:56:b3:f5:00
Jan 11 01:32:07 freenas kernel: bridge0: link state changed to UP
Jan 11 01:32:07 freenas kernel: bridge0: link state changed to UP
Jan 11 01:32:07 freenas kernel: igb0: promiscuous mode enabled
Jan 11 01:32:07 freenas epair0a: Ethernet address: 02:1f:50:00:05:0a
Jan 11 01:32:07 freenas epair0b: Ethernet address: 02:1f:a0:00:06:0b
Jan 11 01:32:07 freenas kernel: epair0a: link state changed to UP
Jan 11 01:32:07 freenas kernel: epair0a: link state changed to UP
Jan 11 01:32:07 freenas kernel: epair0b: link state changed to UP
Jan 11 01:32:07 freenas kernel: epair0b: link state changed to UP
Jan 11 01:32:07 freenas kernel: igb0: link state changed to DOWN
Jan 11 01:32:07 freenas kernel: igb0: link state changed to DOWN
Jan 11 01:32:07 freenas kernel: epair0a: promiscuous mode enabled
Jan 11 01:32:12 freenas kernel: igb0: link state changed to UP
Jan 11 01:32:12 freenas kernel: igb0: link state changed to UP
Jan 11 01:32:13 freenas tap0: Ethernet address: 00:bd:77:1c:f8:00
Jan 11 01:32:13 freenas kernel: tap0: promiscuous mode enabled
Jan 11 01:32:13 freenas kernel: tap0: link state changed to UP
Jan 11 01:32:13 freenas kernel: tap0: link state changed to UP
Jan 11 01:50:55 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=11463232353864117636
Jan 11 01:50:55 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=5723202499937601249
Jan 11 01:50:55 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=14406377114253992047
Jan 11 01:50:55 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=2257977093707773979
Jan 11 01:50:55 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=17491896652372963773
Jan 11 01:50:55 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=8871488267840872129
Jan 11 01:51:45 freenas kernel: arp: 192.168.0.8 moved from 02:1f:50:00:05:0a to ac:1f:6b:83:f4:50 on epair0b
Jan 11 02:03:45 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=14406377114253992047
Jan 11 02:15:03 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=5723202499937601249
Jan 11 03:15:56 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=17491896652372963773

I will replace the original disk that showed the errors once the replacement arrives on Tuesday, and I am also going to run memtest over the weekend to validate the memory.
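For anyone reading the log above, a quick way to see which vdevs keep changing state is to tally the `ZFS: vdev state changed` lines per `vdev_guid`. This is only a minimal sketch, assuming the syslog line format shown in the log above. The function name `count_vdev_events` and the `SAMPLE` list are made up for illustration, and the sample lines are copied from the log.

```python
import re
from collections import Counter

# Matches syslog lines of the form seen in the log above, e.g.
# "Jan 11 02:03:45 freenas ZFS: vdev state changed, pool_guid=... vdev_guid=..."
VDEV_EVENT = re.compile(
    r"^\w{3}\s+\d+ \d{2}:\d{2}:\d{2} \S+ ZFS: vdev state changed, "
    r"pool_guid=(?P<pool>\d+) vdev_guid=(?P<vdev>\d+)"
)


def count_vdev_events(lines):
    """Return a Counter mapping vdev_guid -> number of state-change lines."""
    counts = Counter()
    for line in lines:
        match = VDEV_EVENT.match(line.strip())
        if match:
            counts[match.group("vdev")] += 1
    return counts


# A few lines copied from the log above (hypothetical variable name).
SAMPLE = [
    "Jan 11 01:50:55 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=14406377114253992047",
    "Jan 11 01:50:55 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=2257977093707773979",
    "Jan 11 01:51:45 freenas kernel: arp: 192.168.0.8 moved from 02:1f:50:00:05:0a to ac:1f:6b:83:f4:50 on epair0b",
    "Jan 11 02:03:45 freenas ZFS: vdev state changed, pool_guid=17933338950115770683 vdev_guid=14406377114253992047",
]

# Print each vdev GUID with the number of state changes logged for it.
for guid, hits in count_vdev_events(SAMPLE).most_common():
    print(guid, hits)
```

In practice the lines would come from the real log file (for example by iterating over an open `/var/log/messages`) instead of the sample list. A GUID that keeps reappearing after the initial import is the one worth matching against the disks in the pool. The count alone does not say what state the vdev moved to, so it is only a starting point.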
