Should I be worried? ZFS Degraded -> Faulted -> Online

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
One significant difference between ZFS and a hardware RAID controller (which is the source of the legend) that I forgot to mention earlier: ZFS copies only the data, not the entire drive. ZFS ignores empty space, whereas hardware RAID exercises the entire drive, not just the data space.

(holy crow it's so easy it's hard.)
yea, I know how zfs tracks what is data and what is empty, unlike both hardware and software raid, it's one of the reasons I like it so much. (I had pfsense with geom setup and it just constantly was sending out resync alerts for each frigging percentage, detecting some unknown desync occurrence - bleah. switched it to zfs, so much better, even though zfs management ain't in the GUI as of yet and I had to make my own alerting)

Also, each of the donor disks in the RAIDz pool only needs to read 256GB of data instead of a full 1TB, so it is less stressful for the RAIDz pool.

ok, I see now what you mean. so in such a scenario, the parity calc and other overhead on a reasonably modern performing system should be virtually nil, adding up to read speed being a non limiter, and thus making the write speed to the new disk the only bottleneck?

does that apply in reverse, wherein writing 256GB to 4 targets realistically gets you an aggregate write speed (technically 6 disks if raidz2)?
I have seen it argued that mirror will give the best performance (believe one of the proponents was a zfs dev), but I have also seen it argued that striped raidz will give the best performance.

if you were making a pool out of 4 disks, would you choose 2mirrors or raidz2? the storage lost should be the same, so the only difference would be quirks of how they function (lose any 2 disks or lose 1 disk in each mirror vdev) + admin ease

I do hope the resizing raidz gets released, because that would greatly reduce the management advantage that mirrors have, assuming it works correctly of course.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I see now what you mean. so in such a scenario, the parity calc and other overhead on a reasonably modern performing system should be virtually nil, adding up to read speed being a non limiter, and thus making the write speed to the new disk the only bottleneck?
Yes. In my home NAS, during a resilver of a disk, I usually see CPU utilization around 20% and the disks being read from will be sending out data around 30 MB/s where the target disk is packing data in at 120 MB/s or more. I don't have the fastest disks at home. I see even faster rebuild times on the systems at work because the drives have better throughput.
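To put rough numbers on that, here is a minimal sketch that assumes the write speed of the new disk is the only bottleneck, as discussed above. The function name and the 256 GB / 1 TB figures are just illustrations taken from this thread, not measurements, and decimal units are assumed.

```python
def resilver_seconds(data_gb: float, write_mb_per_s: float) -> float:
    """Estimate resilver time if the target disk's write rate is the only limit.

    data_gb        -- amount of data to rebuild, in decimal GB (assumption)
    write_mb_per_s -- sustained write rate of the new disk, in decimal MB/s
    """
    if write_mb_per_s <= 0:
        raise ValueError("write rate must be positive")
    return data_gb * 1000 / write_mb_per_s


# ZFS only rebuilds allocated data (256 GB in the example from this thread).
print(f"{resilver_seconds(256, 120) / 60:.1f} minutes")   # 35.6 minutes

# A controller that rebuilds the whole 1 TB drive at the same rate.
print(f"{resilver_seconds(1000, 120) / 60:.1f} minutes")  # 138.9 minutes
```

Real resilvers will usually be slower than this, since fragmentation and other pool activity are ignored here.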
does that apply in reverse, wherein writing 256GB to 4 targets realistically gets you an aggregate write speed (technically 6 disks if raidz2)?
The resilver of a failed disk is a pretty unique scenario because ZFS is handling it internally. When data is being received from outside, it travels through a slightly different software path to get to disk, so performance doesn't always scale the way you might expect. The "rule of thumb" that I use is to expect the performance of a single disk from any vdev, and if it does better than that, it is a bonus. In mirror vdevs, that is very reliable, but in RAIDz vdevs, you get better performance under certain circumstances. Random IO is not usually one of them, but when streaming large files it can sometimes be very good.
I have seen it argued that mirror will give the best performance (believe one of the proponents was a zfs dev), but I have also seen it argued that striped raidz will give the best performance.
They each have their place. Different tools for different jobs.
It is generally a matter of how many disks you can throw at the problem, and what problem you are trying to solve. If you need random IO (IOPS) for virtual machines or databases, many vdevs are required and the way to get many vdevs, without needing massive numbers of disks, is mirrors. I have a system at work that has ten RAIDz2 vdevs of six drives each, sixty drives. It is quite fast when loading large, sequential files, but it struggles a bit with small random files. It would likely do better if it were configured with mirror vdevs, but at a significant cost to storage capacity.
if you were making a pool out of 4 disks, would you choose 2mirrors or raidz2?
Again, it is down to the purpose of the pool. They would each have about the same storage capacity, but the mirrors would give you two vdevs, so about twice the IOPS as the RAIDz2 version. If you wanted to use the pool to host virtual machines or a database, the mirrors would give you more performance. If you are streaming movies from Plex, you may as well use RAIDz2 for the additional safety of being able to lose any combination of two drives. Network speed can play into that also. Is it on a 10Gb network? You might squeak out a little more bandwidth if you had mirror vdevs, but no guarantee.
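The "any combination of two drives" point can be checked by brute force. This is only a sketch: the disk names d0..d3 and the helper functions are made up for illustration, and it models nothing but which disks are dead.

```python
from itertools import combinations

disks = ["d0", "d1", "d2", "d3"]          # hypothetical disk names
mirror_vdevs = [("d0", "d1"), ("d2", "d3")]
raidz2_vdev = tuple(disks)


def mirrors_survive(failed, vdevs):
    """A pool of mirrors survives as long as no vdev has lost every disk."""
    return all(not set(vdev) <= set(failed) for vdev in vdevs)


def raidz_survives(failed, vdev, parity):
    """A RAIDz vdev survives up to `parity` failed members."""
    return len(set(failed) & set(vdev)) <= parity


pairs = list(combinations(disks, 2))      # every possible two-disk failure
mirror_ok = sum(mirrors_survive(p, mirror_vdevs) for p in pairs)
raidz2_ok = sum(raidz_survives(p, raidz2_vdev, parity=2) for p in pairs)

print(f"2x mirror survives {mirror_ok} of {len(pairs)} two-disk failures")
print(f"raidz2    survives {raidz2_ok} of {len(pairs)} two-disk failures")
```

So with four disks the usable space is the same (two disks' worth) either way, the mirrors give two vdevs instead of one, and the RAIDz2 layout survives all six two-disk failures where the mirrors survive four.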
+ admin ease
Administrative issues are a consideration, but I usually try to plan capacity well in advance of need. I try to start the process of ordering additional storage when the pool gets above 50% capacity. I would be in panic mode if the pool ever hit 80% and we have fairly long lead times at work. I put in paperwork for a new server the middle of last year and have not got it yet. It would be bringing with it another 60 drives, 12TB each, to add to the capacity on one of the networks, which will allow me to retire the server I have that is still running after 6.2 years of service.
I do hope the resizing raidz gets released, because that would greatly reduce the management advantage that mirrors have, assuming it works correctly of course.
It might be a convenience, but I think it is better to simply add another vdev, even if you are doing RAIDz2. I agree there are some management pain points with ZFS and using mirror vdevs can make some of those easier to get around, but mirrors are very costly because you must use more disks to obtain equivalent storage capacity. An example from where I work: I have a server using RAIDz2, with about 300TB of capacity. If I were to use the same number of disks in mirror vdevs, I would only have around 230TB of capacity. That is 70TB of capacity that I have a hard time throwing out, but it isn't just the capacity, it is the redundancy. With RAIDz2, I can lose any two disks. If all the disks fail in different vdevs, I could lose 10 disks and still have a disk of redundancy in every vdev and not be in danger of losing the pool. With mirrors, I can't lose any two adjacent disks. If all failures were in different vdevs, I could survive 30 drive failures, but not two in the same vdev. It is just too risky for me. I have had two drives fail in the same vdev and the only thing that saved the pool was having a hot-spare. I am only willing to entertain RAIDz2 (instead of 3) because I have a nightly backup.
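The capacity and failure numbers above can be reproduced with some quick arithmetic. Note the assumption: the post does not say how the 300TB server is laid out, so this sketch borrows the sixty-drive, six-wide RAIDz2 layout mentioned earlier in the thread. The function names are made up for illustration.

```python
def raidz_layout(total_disks: int, width: int, parity: int):
    """Return (vdevs, data_disks, best_case_failures) for RAIDz vdevs."""
    vdevs = total_disks // width
    data_disks = vdevs * (width - parity)
    best_case_failures = vdevs * parity   # failures spread evenly over vdevs
    return vdevs, data_disks, best_case_failures


def mirror_layout(total_disks: int, width: int = 2):
    """Return (vdevs, data_disks, best_case_failures) for mirror vdevs."""
    vdevs = total_disks // width
    return vdevs, vdevs, vdevs * (width - 1)


rz_vdevs, rz_data, rz_best = raidz_layout(60, width=6, parity=2)
mi_vdevs, mi_data, mi_best = mirror_layout(60)

print(rz_vdevs, rz_data, rz_best)   # 10 40 20
print(mi_vdevs, mi_data, mi_best)   # 30 30 30

# Scale the quoted 300TB by the ratio of data disks in each layout.
print(300 * mi_data / rz_data)      # 225.0
```

That gives roughly 225TB for mirrors, in line with the "around 230TB" above. The best-case columns are the lucky scenario only: the guaranteed tolerance is any two disks for RAIDz2 and any one disk for two-way mirrors. Losing one disk in each of the ten RAIDz2 vdevs still leaves one disk of redundancy per vdev, which is the 10-disk case described.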
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
Again, it is down to the purpose of the pool. They would each have about the same storage capacity, but the mirrors would give you two vdevs, so about twice the IOPS as the RAIDz2 version. If you wanted to use the pool to host virtual machines or a database, the mirrors would give you more performance. If you are streaming movies from Plex, you may as well use RAIDz2 for the additional safety of being able to lose any combination of two drives. Network speed can play into that also. Is it on a 10Gb network? You might squeak out a little more bandwidth if you had mirror vdevs, but no guarantee.

I am referring to the purpose of this thread, which we have kind of hijacked: it was asking about an existing (non-recommended hardware) setup with a raidz1 that was throwing a ton of errors. So the purpose of the drives is the OP's purpose. I noted that a raidz2 of 4 drives would have no storage advantage over mirrors, but from what you have been adding, it looks like the 4-disk raidz2 is probably an acceptable choice for them, since it would give them true 2-drive redundancy.
 

pro lamer

Guru
Joined
Feb 16, 2018
Messages
626
btank 8TB raidz1 (Mostly for reuse of my 2TB drives) Might need to think about this some more.
2TB
2TB
2TB
2TB
2TB
Better make this pool raidz2: even if the 2TB drives are small enough to be safe to run raidz1, they will fail some day. If you run raidz2, you can replace them with bigger ones step by step without worrying about using >> 2TB drives in raidz1...

Anyway, I'd recommend you answer Chris's questions first, maybe start with this one
I would like to know about that VMware question but it looks like you have some hardware problems.
or this one
the full output of the SMART test with smartctl -x /dev/ada0
or some other ;)

 

gar

Dabbler
Joined
Jan 1, 2014
Messages
14
I just noticed this. Are you passing disks into a virtual machine?
Is this FreeNAS running inside virtualization? You might have mentioned that?

I see where you posted the output of the SMART report and you have some CRC errors on all the disks. That usually points to a cabling issue. I also see some "runtime badblocks" which are not a good indicator for disk health.

I would like to know about that VMware question but it looks like you have some hardware problems.

Coming back to this after a month, been too busy to look into it more.
Yeah, it's a VM with PCI passthrough. The SATA controller is passed directly to the VM. FreeNAS has access to the individual physical disk hardware.

Anyway, I'd recommend you answer Chris's questions first, maybe start with this one

or this one

or some other ;)


Posted it in #5 but will re-post as individuals.
Things look pretty stable right now. Planning on moving to a backup disk then to a raidz2 this weekend just for peace of mind.

Code:
root@NAS:~ # smartctl -x /dev/ada0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1CH164
Serial Number:    Z1E57PJS
LU WWN Device Id: 5 000c50 0647acbb4
Firmware Version: CC27
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat May 11 13:31:29 2019 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (  17) The self-test routine was aborted by
                                        the host.
Total time to complete Offline
data collection:                (  592) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 226) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   110   099   006    -    26138568
  3 Spin_Up_Time            PO----   096   095   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    283
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  7 Seek_Error_Rate         POSR--   087   060   030    -    548120108
  9 Power_On_Hours          -O--CK   072   072   000    -    24662
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   100   020    -    276
183 Runtime_Bad_Block       -O--CK   099   099   000    -    1
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
188 Command_Timeout         -O--CK   100   099   000    -    0 0 335
189 High_Fly_Writes         -O-RCK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   066   049   045    -    34 (Min/Max 21/36)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    151
193 Load_Cycle_Count        -O--CK   100   100   000    -    1197
194 Temperature_Celsius     -O---K   034   051   000    -    34 (0 10 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   199   000    -    15731
240 Head_Flying_Hours       ------   100   253   000    -    24619h+54m+27.173s
241 Total_LBAs_Written      ------   100   253   000    -    71710747732
242 Total_LBAs_Read         ------   100   253   000    -    156689067839
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS     129  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5176  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL,SL  VS      10  Device vendor specific log
0xc4       GPL,SL  VS       5  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Aborted by host               10%     24662         -
# 2  Short offline       Completed without error       00%     24660         -
# 3  Short offline       Completed without error       00%     24648         -
# 4  Short offline       Completed without error       00%     24630         -
# 5  Short offline       Completed without error       00%     24618         -
# 6  Short offline       Completed without error       00%     24582         -
# 7  Short offline       Completed without error       00%     24570         -
# 8  Short offline       Completed without error       00%     24534         -
# 9  Short offline       Completed without error       00%     24522         -
#10  Short offline       Completed without error       00%     24498         -
#11  Short offline       Completed without error       00%     24468         -
#12  Short offline       Interrupted (host reset)      00%     24456         -
#13  Short offline       Interrupted (host reset)      00%     24420         -
#14  Short offline       Completed without error       00%     24408         -
#15  Short offline       Interrupted (host reset)      00%     24371         -
#16  Short offline       Interrupted (host reset)      00%     24359         -
#17  Extended offline    Interrupted (host reset)      00%     24358         -
#18  Short offline       Completed without error       00%     24324         -
#19  Short offline       Completed without error       00%     24312         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       522 (0x020a)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    34 Celsius
Power Cycle Min/Max Temperature:     21/36 Celsius
Lifetime    Min/Max Temperature:     10/51 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2           10  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
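For anyone who would rather not eyeball these tables on every disk, here is a small sketch that pulls the raw values out of the attribute table of `smartctl -x` output and flags a few counters that are nonzero. The watch list (IDs 5, 183, 187, 197, 198, 199) and the "anything above zero" rule are my own assumptions for illustration, not an official threshold, and the sample rows are copied from the ada0 report above.

```python
import re

# ID, name, flags, VALUE, WORST, THRESH, FAIL, then the first integer of RAW_VALUE.
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+\S+\s+\d+\s+\d+\s+\d+\s+\S+\s+(\d+)")

# Attribute IDs worth a second look when nonzero (assumed watch list).
WATCH = {5, 183, 187, 197, 198, 199}


def flag_attributes(report: str) -> dict:
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for line in report.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # headers, legends, and other sections are skipped
        attr_id, name, raw = int(match.group(1)), match.group(2), int(match.group(3))
        if attr_id in WATCH and raw > 0:
            flagged[name] = raw
    return flagged


SAMPLE = """
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
183 Runtime_Bad_Block       -O--CK   099   099   000    -    1
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   199   000    -    15731
"""

print(flag_attributes(SAMPLE))
# {'Runtime_Bad_Block': 1, 'UDMA_CRC_Error_Count': 15731}
```

The CRC count is a lifetime total, so a nonzero value on its own does not say whether the errors are still happening. Comparing the number between two runs some days apart is more telling than a single reading.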
 

gar

Dabbler
Joined
Jan 1, 2014
Messages
14
ada1 and ada2

Code:
root@NAS:~ # smartctl -x /dev/ada1
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1CH164
Serial Number:    Z1E5676K
LU WWN Device Id: 5 000c50 06478d75f
Firmware Version: CC27
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat May 11 13:32:17 2019 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (  17) The self-test routine was aborted by
                                        the host.
Total time to complete Offline
data collection:                (  584) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 217) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   105   099   006    -    8521360
  3 Spin_Up_Time            PO----   096   095   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    284
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  7 Seek_Error_Rate         POSR--   087   060   030    -    573191287
  9 Power_On_Hours          -O--CK   072   072   000    -    24819
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   100   020    -    278
183 Runtime_Bad_Block       -O--CK   093   093   000    -    7
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
188 Command_Timeout         -O--CK   100   100   000    -    0 0 4
189 High_Fly_Writes         -O-RCK   100   100   000    -    0
190 Airflow_Temperature_Cel -O---K   068   049   045    -    32 (Min/Max 21/34)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    154
193 Load_Cycle_Count        -O--CK   100   100   000    -    1205
194 Temperature_Celsius     -O---K   032   051   000    -    32 (0 11 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   160   000    -    151
240 Head_Flying_Hours       ------   100   253   000    -    24772h+19m+24.269s
241 Total_LBAs_Written      ------   100   253   000    -    71716884475
242 Total_LBAs_Read         ------   100   253   000    -    157009682667
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS     129  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5176  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL,SL  VS      10  Device vendor specific log
0xc4       GPL,SL  VS       5  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Aborted by host               10%     24819         -
# 2  Short offline       Completed without error       00%     24818         -
# 3  Short offline       Completed without error       00%     24806         -
# 4  Short offline       Completed without error       00%     24788         -
# 5  Short offline       Completed without error       00%     24776         -
# 6  Short offline       Completed without error       00%     24740         -
# 7  Short offline       Completed without error       00%     24728         -
# 8  Short offline       Completed without error       00%     24692         -
# 9  Short offline       Completed without error       00%     24680         -
#10  Short offline       Completed without error       00%     24656         -
#11  Short offline       Completed without error       00%     24626         -
#12  Short offline       Completed without error       00%     24614         -
#13  Short offline       Completed without error       00%     24578         -
#14  Short offline       Completed without error       00%     24566         -
#15  Extended offline    Completed without error       00%     24557         -
#16  Short offline       Completed without error       00%     24481         -
#17  Short offline       Completed without error       00%     24469         -
#18  Short offline       Completed without error       00%     24433         -
#19  Short offline       Completed without error       00%     24421         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       522 (0x020a)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    32 Celsius
Power Cycle Min/Max Temperature:     21/33 Celsius
Lifetime    Min/Max Temperature:     11/50 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2           10  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS

Code:
root@NAS:~ # smartctl -x /dev/ada2
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.1-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.14 (AF)
Device Model:     ST2000DM001-1CH164
Serial Number:    Z1E5674Y
LU WWN Device Id: 5 000c50 06478d3d3
Firmware Version: CC27
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat May 11 13:32:45 2019 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM feature is:   Disabled
Rd look-ahead is: Enabled
Write cache is:   Enabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (  17) The self-test routine was aborted by
                                        the host.
Total time to complete Offline
data collection:                (  584) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 213) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR--   116   099   006    -    113470032
  3 Spin_Up_Time            PO----   096   095   000    -    0
  4 Start_Stop_Count        -O--CK   100   100   020    -    281
  5 Reallocated_Sector_Ct   PO--CK   100   100   010    -    0
  7 Seek_Error_Rate         POSR--   078   060   030    -    39196753900
  9 Power_On_Hours          -O--CK   072   072   000    -    24816
 10 Spin_Retry_Count        PO--C-   100   100   097    -    0
 12 Power_Cycle_Count       -O--CK   100   100   020    -    277
183 Runtime_Bad_Block       -O--CK   098   098   000    -    2
184 End-to-End_Error        -O--CK   100   100   099    -    0
187 Reported_Uncorrect      -O--CK   100   100   000    -    0
188 Command_Timeout         -O--CK   100   100   000    -    0 0 0
189 High_Fly_Writes         -O-RCK   099   099   000    -    1
190 Airflow_Temperature_Cel -O---K   068   049   045    -    32 (Min/Max 21/34)
191 G-Sense_Error_Rate      -O--CK   100   100   000    -    0
192 Power-Off_Retract_Count -O--CK   100   100   000    -    153
193 Load_Cycle_Count        -O--CK   100   100   000    -    1205
194 Temperature_Celsius     -O---K   032   051   000    -    32 (0 11 0 0 0)
197 Current_Pending_Sector  -O--C-   100   100   000    -    0
198 Offline_Uncorrectable   ----C-   100   100   000    -    0
199 UDMA_CRC_Error_Count    -OSRCK   200   160   000    -    81
240 Head_Flying_Hours       ------   100   253   000    -    24772h+06m+34.673s
241 Total_LBAs_Written      ------   100   253   000    -    71683889391
242 Total_LBAs_Read         ------   100   253   000    -    156613619297
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

General Purpose Log Directory Version 1
SMART           Log Directory Version 1 [multi-sector log support]
Address    Access  R/W   Size  Description
0x00       GPL,SL  R/O      1  Log Directory
0x01           SL  R/O      1  Summary SMART error log
0x02           SL  R/O      5  Comprehensive SMART error log
0x03       GPL     R/O      5  Ext. Comprehensive SMART error log
0x06           SL  R/O      1  SMART self-test log
0x07       GPL     R/O      1  Extended self-test log
0x09           SL  R/W      1  Selective self-test log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x80-0x9f  GPL,SL  R/W     16  Host vendor specific log
0xa1       GPL,SL  VS      20  Device vendor specific log
0xa2       GPL     VS    4496  Device vendor specific log
0xa8       GPL,SL  VS     129  Device vendor specific log
0xa9       GPL,SL  VS       1  Device vendor specific log
0xab       GPL     VS       1  Device vendor specific log
0xb0       GPL     VS    5176  Device vendor specific log
0xbe-0xbf  GPL     VS   65535  Device vendor specific log
0xc0       GPL,SL  VS       1  Device vendor specific log
0xc1       GPL,SL  VS      10  Device vendor specific log
0xc4       GPL,SL  VS       5  Device vendor specific log
0xe0       GPL,SL  R/W      1  SCT Command/Status
0xe1       GPL,SL  R/W      1  SCT Data Transfer

SMART Extended Comprehensive Error Log Version: 1 (5 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Aborted by host               10%     24816         -
# 2  Short offline       Completed without error       00%     24814         -
# 3  Short offline       Completed without error       00%     24802         -
# 4  Short offline       Completed without error       00%     24784         -
# 5  Short offline       Completed without error       00%     24772         -
# 6  Short offline       Completed without error       00%     24736         -
# 7  Short offline       Completed without error       00%     24724         -
# 8  Short offline       Completed without error       00%     24688         -
# 9  Short offline       Completed without error       00%     24676         -
#10  Short offline       Completed without error       00%     24652         -
#11  Short offline       Completed without error       00%     24622         -
#12  Short offline       Completed without error       00%     24611         -
#13  Short offline       Completed without error       00%     24574         -
#14  Short offline       Completed without error       00%     24562         -
#15  Extended offline    Completed without error       00%     24553         -
#16  Short offline       Completed without error       00%     24478         -
#17  Short offline       Completed without error       00%     24466         -
#18  Short offline       Completed without error       00%     24430         -
#19  Short offline       Completed without error       00%     24418         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Status Version:                  3
SCT Version (vendor specific):       522 (0x020a)
SCT Support Level:                   1
Device State:                        Active (0)
Current Temperature:                    31 Celsius
Power Cycle Min/Max Temperature:     21/33 Celsius
Lifetime    Min/Max Temperature:     11/50 Celsius
Under/Over Temperature Limit Count:   0/0

SCT Data Table command not supported

SCT Error Recovery Control command not supported

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) not supported

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x000a  2           10  Device-to-host register FISes sent due to a COMRESET
0x0001  2            0  Command failed due to ICRC error
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
 

gar

Dabbler
Joined
Jan 1, 2014
Messages
14
Phase 1: Near term. Fix and protect current setup. Budget is key limiter here.
1. I found a friend who will lend me his 4 TB drive. I will use this to back up all current data.
2. I will purchase another 2 TB drive, a new PSU, and SATA cables.
3. I will blow away the raidz1, install the new drive, giving me 4x 2 TB, and set up a new pool as raidz2.
4. Restore data to new pool.
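
As a rough sanity check on the plan above, here is a minimal sketch of the nominal capacity arithmetic. It assumes the old pool was 3x 2 TB in raidz1 (implied by "giving me 4x 2 TB" after adding one drive, but not stated outright). It ignores ZFS metadata, padding overhead, and the TB vs TiB difference. The function name raidz_usable_tb is only for illustration.

```python
def raidz_usable_tb(disks: int, disk_tb: float, parity: int) -> float:
    """Nominal usable capacity of a single RAIDZ vdev.

    Rough estimate only: (disks - parity) * disk size, with no allowance
    for ZFS metadata, padding, or reserved space.
    """
    if disks <= parity:
        raise ValueError("a RAIDZ vdev needs more disks than parity devices")
    return (disks - parity) * disk_tb


# Assumed old layout: 3x 2 TB in raidz1 (not stated explicitly in the thread).
old_pool_tb = raidz_usable_tb(disks=3, disk_tb=2.0, parity=1)

# New layout from the plan: 4x 2 TB in raidz2.
new_pool_tb = raidz_usable_tb(disks=4, disk_tb=2.0, parity=2)

# Loaned backup drive from step 1.
backup_drive_tb = 4.0

print(f"old raidz1 nominal usable: {old_pool_tb} TB")
print(f"new raidz2 nominal usable: {new_pool_tb} TB")
print(f"backup drive covers old pool: {backup_drive_tb >= old_pool_tb}")
```

Under those assumptions the nominal usable space stays the same, so the restore in step 4 should fit, while the new pool can survive any two disk failures instead of one.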

Phase 1 has been implemented successfully. Thanks for all the help. With raidz2 now in place and one of the disks being brand new, I feel a lot safer.
 