Errors during file transfer

Status
Not open for further replies.

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
I have an Ubuntu live CD.. but how do I actually do these tests? I did some searching and see references to them, but not how to actually perform them.
1) Figure out how the drives are enumerated in the live session (e.g. /dev/sda, /dev/sdb, /dev/sdc).
2) Run "smartctl -t conveyance /dev/sda" (substituting the correct drive) to start the conveyance test, and "smartctl -t long /dev/sda" to start the long test.
3) If smartmontools isn't installed in the default live environment, connect to the network and install it with "sudo apt-get install smartmontools".
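
The three steps above can be sketched as a small shell script. This is only a dry-run sketch, assuming a Linux live session with smartmontools installed; the helper name smart_test_cmds is made up for illustration. It prints the commands rather than running them, so the device list can be reviewed before anything is started as root. A drive runs one self-test at a time, so start the long test only after the conveyance test has finished.

```shell
#!/bin/sh
# Print the smartctl self-test commands for one device; nothing is executed.
smart_test_cmds() {
    dev="$1"
    echo "smartctl -t conveyance $dev"
    echo "smartctl -t long $dev"
}

# If smartctl is available, list the commands for every device it can find.
# `smartctl --scan` prints one device per line with the device path first.
if command -v smartctl >/dev/null 2>&1; then
    smartctl --scan | awk '{print $1}' | while read -r dev; do
        smart_test_cmds "$dev"
    done
fi
```

Once a test has finished, its result appears in the self-test log, which "smartctl -l selftest /dev/sda" prints.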
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109
I think I figured it out. I booted my Ubuntu live CD, searched for SMART, and it came up with a disk section and a place to run tests.

I'm not sure how to post smartctl output, but here are my screenshots. It's not looking good for the first drive:


Smart 1.PNG
Smart 2.PNG
Smart 3.PNG
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109
I finally got smartmontools installed. Here's what I get from it:

upload_2016-2-8_15-6-11.png



Looks like a bad servo to me. This was done on a different system. Is there anything else I should do before I call to RMA the drive?
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109
The second mirrored drive had the exact same results.

I opened the other two HE8 drives I had and ran the short test, just to see what a (hopefully) successful test would look like. They did pass, but one thing I noticed immediately was how quiet the other two drives were. Those first two must have come from a bad lot, because they both had the exact same very noticeable and very noisy vibration.

Thank you to everyone for the help! I would have been just lost, thinking I had something wrong.

I'm going to run the long SMART test on each of these other two new drives first and then do burn-in tests before I try to set this up again.

Just as an update, I was unable to install the V20.00.04.00 SAS firmware from that other board. The flash reported that the write succeeded, but the reset then failed and I couldn't boot with that version installed. I reverted to the V20.00.00.00 that Supermicro provided and it is back to normal now.

I have a call in to Supermicro to see if they can provide me with V20.00.04.00.
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109
I went with HGST because they have such high reliability statistics. It must have been a bad lot of drives. They both have a very loud rotational vibration, but the two new ones run as smoothly as expected. I can even hear the heads move in the new ones, something that was impossible with the defective drives because the vibration was so loud.
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109
This situation has me wondering. I should probably wait for the replacement drives and pair one of the replacements with one of the good drives I still have. If there are issues related to batches of drives, then splitting the batch would decrease my odds of a total failure due to a bad batch of whatever component could fail.

I'm also wondering: can I make a 3-drive mirror with two HGST HE8 8TB drives and one Seagate Archive 8TB drive? Or is it a bad idea to mix manufacturers like that? Perhaps some kind of mismatch will hurt performance?
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
Can I make a 3-drive mirror with two HGST HE8 8TB drives and one Seagate Archive 8TB drive?
Yep, you can do that.

Or is it a bad idea to mix manufacturers like that? Perhaps some kind of mismatch will hurt performance?
Nope, some people actually prefer a heterogeneous setup to try to add a little more security/diversity.
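
For illustration, ZFS takes the same command for a three-way mirror of mixed drives as for one built from identical drives. This is a minimal sketch, assuming a plain ZFS command line; the pool name tank and the helper name mirror_create_cmd are hypothetical, and on FreeNAS the pool would normally be created through the web interface instead. The function only prints the command.

```shell
#!/bin/sh
# Print (not run) the zpool command for a mirror built from the given disks.
# The pool name and device names are placeholders, not from a real system.
mirror_create_cmd() {
    pool="$1"
    shift
    echo "zpool create $pool mirror $*"
}

# Example: two HGST drives and one Seagate drive in a single mirror vdev.
mirror_create_cmd tank da0 da1 da2
```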
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109

I know that when mirroring mixed drives the pool is limited to the size of the smallest drive in the mirror, but I don't see a problem with that.

I just wonder if I will end up limiting my speed to that of the slowest drive.

But perhaps it doesn't even work that way. Maybe when writing it just writes to whatever drive it can and then uses the fast drives to cache for the slow ones, and when reading it just reads from all the drives at once, so that a drive that can't keep up doesn't really hurt anything by being there.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Once you weed out the drives that will die prematurely (infant mortality), NAS drives are in general highly reliable. I wouldn't worry too much about having a pool of those 8TB HE8 drives. Just plan your storage to meet your space, redundancy, and performance requirements. And make sure you have backups. ZFS replication to a second FreeNAS system is a pretty simple and effective backup strategy.
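
At the command level, replication comes down to taking a snapshot and sending it to the other machine. This is a minimal sketch under stated assumptions: the dataset, snapshot, host, and target names are hypothetical, the helper replication_cmds is made up for illustration, and FreeNAS would normally set this up as a replication task in its web interface. The function prints the commands instead of running them.

```shell
#!/bin/sh
# Print (not run) the commands for a one-off full replication.
# Arguments: source dataset, snapshot name, backup host, target dataset.
replication_cmds() {
    dataset="$1"; snap="$2"; host="$3"; target="$4"
    # Recursive snapshot of the dataset and its children.
    echo "zfs snapshot -r ${dataset}@${snap}"
    # Send the snapshot as a replication stream and receive it remotely.
    echo "zfs send -R ${dataset}@${snap} | ssh ${host} zfs receive ${target}"
}

replication_cmds tank/data manual-1 backuphost backup/data
```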
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Unless they're 3TB Seagates. Those will drop like flies.
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109
Maybe these HGST HE8 drives aren't so great either.

OK, so here's the latest. I went ahead and built my three-drive mirror using two HGST drives and one Seagate drive. I tried running some of the SMART tests; the conveyance test only works on the Seagate drive. On the HGST drives I get:
Code:
[root@freenas] ~# smartctl -t conveyance /dev/da0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Conveyance Self-test functions not supported

Sending command: "Execute SMART Conveyance self-test routine immediately in off-line mode".
Command "Execute SMART Conveyance self-test routine immediately in off-line mode" failed: scsi error aborted command
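
Whether a drive offers the conveyance test can be checked before trying to start it, because the capabilities section printed by "smartctl -c" says either "Conveyance Self-test supported." or "No Conveyance Self-test supported." A minimal sketch, assuming that wording (it matches the smartctl 6.3 reports in this thread); the helper name supports_conveyance is hypothetical.

```shell
#!/bin/sh
# Succeed if the given `smartctl -c` text advertises conveyance self-tests.
# The match is anchored to the start of the line (after indentation) so that
# "No Conveyance Self-test supported." does not count as support.
supports_conveyance() {
    printf '%s\n' "$1" | grep -Eq '^[[:space:]]*Conveyance Self-test supported'
}

# Usage on a live system (commented out so the sketch has no side effects):
# if supports_conveyance "$(smartctl -c /dev/da0)"; then
#     smartctl -t conveyance /dev/da0
# fi
```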


All drives passed the short test.

I copied a bunch of test data to the array, and that was successful. I figured that with 3 drives in the array I can play with test data, pull one drive, run the long SMART test on it, put it back in the array (if it passes), pull the next one, and so on. There's no production data at this point, so I can learn more about configuring FreeNAS and refine my implementation plan while these long tests are running.
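
A self-test runs inside the drive itself, and the outcome lands in the self-test log, so the pass/fail check for each long test can be scripted. This is a minimal sketch, assuming the log layout printed by "smartctl -l selftest" in smartctl 6.x, where entry "# 1" is the most recent test; the helper name last_selftest_ok is hypothetical.

```shell
#!/bin/sh
# Succeed if the newest entry ("# 1") in the given self-test log text
# reports "Completed without error".
last_selftest_ok() {
    printf '%s\n' "$1" | grep -E '^# *1 ' | grep -q 'Completed without error'
}

# Usage on a live system (commented out so the sketch has no side effects):
# if last_selftest_ok "$(smartctl -l selftest /dev/da0)"; then
#     echo "da0: last self-test passed"
# fi
```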

So far so good... until I looked at my console and noticed this:
now what.PNG


now what 2.PNG


/dev/da0 = HGST HE8 drive
/dev/da1 = HGST HE8 drive
/dev/da2 = Seagate Archive drive

The two HGST drives have errors; the Seagate drive does not.

I ran smartctl -a on each drive to see if it reported any information about the errors. I don't understand the details, but I'm hoping one of the experts here will be able to tell me something. Here are the reports from all 3 drives:


HGST HE8
Code:
[root@freenas] ~# smartctl -a /dev/da0
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:  HGST HUH728080ALE600
Serial Number:  2EHS6LGX
LU WWN Device Id: 5 000cca 23bd8a4cc
Firmware Version: A4GNT7J0
User Capacity:  8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:  512 bytes logical, 4096 bytes physical
Rotation Rate:  7200 rpm
Form Factor:  3.5 inches
Device is:  Not in smartctl database [for details use: -P showall]
ATA Version is:  ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Tue Feb  9 05:15:31 2016 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
  was never started.
  Auto Offline Data Collection: Enabled.
Self-test execution status:  (  0) The previous self-test routine completed
  without error or no self-test has ever
  been run.
Total time to complete Offline
data collection:  (  101) seconds.
Offline data collection
capabilities:  (0x5b) SMART execute Offline immediate.
  Auto Offline data collection on/off support.
  Suspend Offline collection upon new
  command.
  Offline surface scan supported.
  Self-test supported.
  No Conveyance Self-test supported.
  Selective Self-test supported.
SMART capabilities:  (0x0003) Saves SMART data before entering
  power-saving mode.
  Supports SMART auto save timer.
Error logging capability:  (0x01) Error logging supported.
  General Purpose Logging supported.
Short self-test routine
recommended polling time:  (  2) minutes.
Extended self-test routine
recommended polling time:  (1083) minutes.
SCT capabilities:  (0x003d) SCT Status supported.
  SCT Error Recovery Control supported.
  SCT Feature Control supported.
  SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x000b  100  100  016  Pre-fail  Always  -  0
  2 Throughput_Performance  0x0005  100  100  054  Pre-fail  Offline  -  0
  3 Spin_Up_Time  0x0007  100  100  024  Pre-fail  Always  -  0
  4 Start_Stop_Count  0x0012  100  100  000  Old_age  Always  -  2
  5 Reallocated_Sector_Ct  0x0033  100  100  005  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x000b  100  100  067  Pre-fail  Always  -  0
  8 Seek_Time_Performance  0x0005  100  100  020  Pre-fail  Offline  -  0
  9 Power_On_Hours  0x0012  100  100  000  Old_age  Always  -  16
 10 Spin_Retry_Count  0x0013  100  100  060  Pre-fail  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  2
 22 Unknown_Attribute  0x0023  100  100  025  Pre-fail  Always  -  100
192 Power-Off_Retract_Count 0x0032  100  100  000  Old_age  Always  -  2
193 Load_Cycle_Count  0x0012  100  100  000  Old_age  Always  -  2
194 Temperature_Celsius  0x0002  166  166  000  Old_age  Always  -  36 (Min/Max 22/38)
196 Reallocated_Event_Count 0x0032  100  100  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0008  100  100  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  7

SMART Error Log Version: 1
ATA Error Count: 7 (device log contains only the most recent five errors)
  CR = Command Register [HEX]
  FR = Features Register [HEX]
  SC = Sector Count Register [HEX]
  SN = Sector Number Register [HEX]
  CL = Cylinder Low Register [HEX]
  CH = Cylinder High Register [HEX]
  DH = Device/Head Register [HEX]
  DC = Device Command Register [HEX]
  ER = Error register [HEX]
  ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 7 occurred at disk power-on lifetime: 15 hours (0 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 10 e8 14 5b 40 00  15:04:06.173  READ FPDMA QUEUED
  60 00 00 e8 15 5b 40 00  15:04:06.021  READ FPDMA QUEUED
  60 00 08 e8 13 5b 40 00  15:04:06.020  READ FPDMA QUEUED
  60 00 00 e8 ef 5a 40 00  15:04:06.020  READ FPDMA QUEUED
  60 00 10 e8 ee 5a 40 00  15:04:06.018  READ FPDMA QUEUED

Error 6 occurred at disk power-on lifetime: 15 hours (0 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 e8 a2 5a 40 00  15:04:05.969  READ FPDMA QUEUED
  ea 00 00 00 00 00 00 00  15:03:39.015  FLUSH CACHE EXT
  61 08 10 98 29 81 40 00  15:03:39.014  WRITE FPDMA QUEUED
  61 08 00 98 27 81 40 00  15:03:39.014  WRITE FPDMA QUEUED
  61 08 08 98 03 40 40 00  15:03:39.013  WRITE FPDMA QUEUED

Error 5 occurred at disk power-on lifetime: 15 hours (0 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 e8 fa 59 40 00  14:58:49.824  READ FPDMA QUEUED
  60 00 10 e8 fb 59 40 00  14:58:49.824  READ FPDMA QUEUED
  60 00 00 e8 f9 59 40 00  14:58:49.824  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00  14:58:49.811  READ LOG EXT
  2f 00 01 10 00 00 00 00  14:58:49.811  READ LOG EXT

Error 4 occurred at disk power-on lifetime: 15 hours (0 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 e8 f9 59 40 00  14:58:49.811  READ FPDMA QUEUED
  60 00 10 e8 fb 59 40 00  14:58:49.808  READ FPDMA QUEUED
  60 00 08 e8 fa 59 40 00  14:58:49.808  READ FPDMA QUEUED
  60 00 10 e8 f7 59 40 00  14:58:49.798  READ FPDMA QUEUED
  60 00 08 e8 f6 59 40 00  14:58:49.798  READ FPDMA QUEUED

Error 3 occurred at disk power-on lifetime: 15 hours (0 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 e8 f6 59 40 00  14:58:49.791  READ FPDMA QUEUED
  60 00 10 e8 f7 59 40 00  14:58:49.790  READ FPDMA QUEUED
  60 00 00 e8 f5 59 40 00  14:58:49.790  READ FPDMA QUEUED
  60 00 00 e8 f4 59 40 00  14:58:49.776  READ FPDMA QUEUED
  60 00 10 e8 f3 59 40 00  14:58:49.776  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline  Completed without error  00%  15  -
# 2  Short offline  Completed without error  00%  2  -
# 3  Short offline  Completed without error  00%  0  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.




HGST HE8
Code:
[root@freenas] ~# smartctl -a /dev/da1
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:  HGST HUH728080ALE600
Serial Number:  2EHS4VEX
LU WWN Device Id: 5 000cca 23bd89e41
Firmware Version: A4GNT7J0
User Capacity:  8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:  512 bytes logical, 4096 bytes physical
Rotation Rate:  7200 rpm
Form Factor:  3.5 inches
Device is:  Not in smartctl database [for details use: -P showall]
ATA Version is:  ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Tue Feb  9 05:17:53 2016 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity
  was never started.
  Auto Offline Data Collection: Enabled.
Self-test execution status:  (  0) The previous self-test routine completed
  without error or no self-test has ever
  been run.
Total time to complete Offline
data collection:  (  101) seconds.
Offline data collection
capabilities:  (0x5b) SMART execute Offline immediate.
  Auto Offline data collection on/off support.
  Suspend Offline collection upon new
  command.
  Offline surface scan supported.
  Self-test supported.
  No Conveyance Self-test supported.
  Selective Self-test supported.
SMART capabilities:  (0x0003) Saves SMART data before entering
  power-saving mode.
  Supports SMART auto save timer.
Error logging capability:  (0x01) Error logging supported.
  General Purpose Logging supported.
Short self-test routine
recommended polling time:  (  2) minutes.
Extended self-test routine
recommended polling time:  (1292) minutes.
SCT capabilities:  (0x003d) SCT Status supported.
  SCT Error Recovery Control supported.
  SCT Feature Control supported.
  SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x000b  100  100  016  Pre-fail  Always  -  0
  2 Throughput_Performance  0x0005  100  100  054  Pre-fail  Offline  -  0
  3 Spin_Up_Time  0x0007  100  100  024  Pre-fail  Always  -  0
  4 Start_Stop_Count  0x0012  100  100  000  Old_age  Always  -  1
  5 Reallocated_Sector_Ct  0x0033  100  100  005  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x000b  100  100  067  Pre-fail  Always  -  0
  8 Seek_Time_Performance  0x0005  100  100  020  Pre-fail  Offline  -  0
  9 Power_On_Hours  0x0012  100  100  000  Old_age  Always  -  16
 10 Spin_Retry_Count  0x0013  100  100  060  Pre-fail  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  1
 22 Unknown_Attribute  0x0023  100  100  025  Pre-fail  Always  -  100
192 Power-Off_Retract_Count 0x0032  100  100  000  Old_age  Always  -  1
193 Load_Cycle_Count  0x0012  100  100  000  Old_age  Always  -  1
194 Temperature_Celsius  0x0002  166  166  000  Old_age  Always  -  36 (Min/Max 23/39)
196 Reallocated_Event_Count 0x0032  100  100  000  Old_age  Always  -  0
197 Current_Pending_Sector  0x0022  100  100  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0008  100  100  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x000a  200  200  000  Old_age  Always  -  8

SMART Error Log Version: 1
ATA Error Count: 8 (device log contains only the most recent five errors)
  CR = Command Register [HEX]
  FR = Features Register [HEX]
  SC = Sector Count Register [HEX]
  SN = Sector Number Register [HEX]
  CL = Cylinder Low Register [HEX]
  CH = Cylinder High Register [HEX]
  DH = Device/Head Register [HEX]
  DC = Device Command Register [HEX]
  ER = Error register [HEX]
  ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 8 occurred at disk power-on lifetime: 15 hours (0 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 18 e8 9c 5a 40 00  15:05:05.922  READ FPDMA QUEUED
  61 40 10 b0 86 f9 40 00  15:05:05.906  WRITE FPDMA QUEUED
  60 00 08 e8 9a 5a 40 00  15:05:05.906  READ FPDMA QUEUED
  61 08 10 00 7e f9 40 00  15:05:05.904  WRITE FPDMA QUEUED
  61 08 08 f0 86 f9 40 00  15:05:05.903  WRITE FPDMA QUEUED

Error 7 occurred at disk power-on lifetime: 15 hours (0 days + 15 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 e8 9a 5a 40 00  15:04:05.788  READ FPDMA QUEUED
  60 00 10 e8 9b 5a 40 00  15:04:05.787  READ FPDMA QUEUED
  60 00 00 e8 99 5a 40 00  15:04:05.787  READ FPDMA QUEUED
  ea 00 00 00 00 00 00 00  15:03:38.833  FLUSH CACHE EXT
  61 08 00 98 29 81 40 00  15:03:38.833  WRITE FPDMA QUEUED

Error 6 occurred at disk power-on lifetime: 14 hours (0 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 e8 3c 5a 40 00  14:58:49.632  READ FPDMA QUEUED
  60 00 00 e8 3b 5a 40 00  14:58:49.631  READ FPDMA QUEUED
  2f 00 01 10 00 00 00 00  14:58:49.631  READ LOG EXT
  2f 00 01 10 00 00 00 00  14:58:49.631  READ LOG EXT
  60 00 08 e8 3a 5a 40 00  14:58:49.628  READ FPDMA QUEUED

Error 5 occurred at disk power-on lifetime: 14 hours (0 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 e8 3a 5a 40 00  14:58:49.631  READ FPDMA QUEUED
  60 00 00 e8 39 5a 40 00  14:58:49.628  READ FPDMA QUEUED
  60 00 10 e8 38 5a 40 00  14:58:49.621  READ FPDMA QUEUED
  60 00 10 e8 37 5a 40 00  14:58:49.621  READ FPDMA QUEUED
  60 00 08 e8 35 5a 40 00  14:58:49.621  READ FPDMA QUEUED

Error 4 occurred at disk power-on lifetime: 14 hours (0 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 41 00 00 00 00 00  Error: ICRC, ABRT at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 e8 34 5a 40 00  14:58:49.611  READ FPDMA QUEUED
  60 00 10 e8 35 5a 40 00  14:58:49.610  READ FPDMA QUEUED
  60 00 00 e8 33 5a 40 00  14:58:49.610  READ FPDMA QUEUED
  60 00 00 e8 32 5a 40 00  14:58:49.605  READ FPDMA QUEUED
  60 00 10 e8 31 5a 40 00  14:58:49.604  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline  Completed without error  00%  15  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.




Seagate Archive
Code:
[root@freenas] ~# smartctl -a /dev/da2
smartctl 6.3 2014-07-26 r3976 [FreeBSD 9.3-RELEASE-p31 amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:  ST8000AS0002-1NA17Z
Serial Number:  Z840A26X
LU WWN Device Id: 5 000c50 08732fc16
Firmware Version: AR15
User Capacity:  8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:  512 bytes logical, 4096 bytes physical
Rotation Rate:  5980 rpm
Device is:  Not in smartctl database [for details use: -P showall]
ATA Version is:  ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:  Tue Feb  9 05:20:14 2016 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
  was completed without error.
  Auto Offline Data Collection: Enabled.
Self-test execution status:  (  0) The previous self-test routine completed
  without error or no self-test has ever
  been run.
Total time to complete Offline
data collection:  (  0) seconds.
Offline data collection
capabilities:  (0x7b) SMART execute Offline immediate.
  Auto Offline data collection on/off support.
  Suspend Offline collection upon new
  command.
  Offline surface scan supported.
  Self-test supported.
  Conveyance Self-test supported.
  Selective Self-test supported.
SMART capabilities:  (0x0003) Saves SMART data before entering
  power-saving mode.
  Supports SMART auto save timer.
Error logging capability:  (0x01) Error logging supported.
  General Purpose Logging supported.
Short self-test routine
recommended polling time:  (  1) minutes.
Extended self-test routine
recommended polling time:  ( 932) minutes.
Conveyance self-test routine
recommended polling time:  (  2) minutes.
SCT capabilities:  (0x30a5) SCT Status supported.
  SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME  FLAG  VALUE WORST THRESH TYPE  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate  0x000f  117  100  006  Pre-fail  Always  -  143127696
  3 Spin_Up_Time  0x0003  099  099  000  Pre-fail  Always  -  0
  4 Start_Stop_Count  0x0032  100  100  020  Old_age  Always  -  2
  5 Reallocated_Sector_Ct  0x0033  100  100  010  Pre-fail  Always  -  0
  7 Seek_Error_Rate  0x000f  063  060  030  Pre-fail  Always  -  2014545
  9 Power_On_Hours  0x0032  100  100  000  Old_age  Always  -  13
 10 Spin_Retry_Count  0x0013  100  100  097  Pre-fail  Always  -  0
 12 Power_Cycle_Count  0x0032  100  100  020  Old_age  Always  -  2
183 Runtime_Bad_Block  0x0032  100  100  000  Old_age  Always  -  0
184 End-to-End_Error  0x0032  100  100  099  Old_age  Always  -  0
187 Reported_Uncorrect  0x0032  100  100  000  Old_age  Always  -  0
188 Command_Timeout  0x0032  100  100  000  Old_age  Always  -  0
189 High_Fly_Writes  0x003a  100  100  000  Old_age  Always  -  0
190 Airflow_Temperature_Cel 0x0022  064  060  045  Old_age  Always  -  36 (Min/Max 26/40)
191 G-Sense_Error_Rate  0x0032  100  100  000  Old_age  Always  -  0
192 Power-Off_Retract_Count 0x0032  100  100  000  Old_age  Always  -  1
193 Load_Cycle_Count  0x0032  100  100  000  Old_age  Always  -  3
194 Temperature_Celsius  0x0022  036  040  000  Old_age  Always  -  36 (0 22 0 0 0)
195 Hardware_ECC_Recovered  0x001a  117  100  000  Old_age  Always  -  143127696
197 Current_Pending_Sector  0x0012  100  100  000  Old_age  Always  -  0
198 Offline_Uncorrectable  0x0010  100  100  000  Old_age  Offline  -  0
199 UDMA_CRC_Error_Count  0x003e  200  200  000  Old_age  Always  -  1
240 Head_Flying_Hours  0x0000  100  253  000  Old_age  Offline  -  13 (124 66 0)
241 Total_LBAs_Written  0x0000  100  253  000  Old_age  Offline  -  4034239292
242 Total_LBAs_Read  0x0000  100  253  000  Old_age  Offline  -  241586

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description  Status  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Conveyance offline  Completed without error  00%  12  -
# 2  Short offline  Completed without error  00%  12  -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
  1  0  0  Not_testing
  2  0  0  Not_testing
  3  0  0  Not_testing
  4  0  0  Not_testing
  5  0  0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Any suggestions on what I should look at?
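
One number that stands out across the three reports is attribute 199, UDMA_CRC_Error_Count: its raw value is 7 on da0, 8 on da1, and 1 on da2. A minimal sketch for pulling that value out of the attribute table, assuming the ten-column layout shown above, with RAW_VALUE as the last column; the helper name crc_count is hypothetical.

```shell
#!/bin/sh
# Print the raw value of attribute 199 (UDMA_CRC_Error_Count) from the
# given `smartctl -A` text. Column 1 is the attribute ID, column 10 the raw value.
crc_count() {
    printf '%s\n' "$1" | awk '$1 == 199 { print $10 }'
}

# Usage on a live system (commented out so the sketch has no side effects):
# for dev in /dev/da0 /dev/da1 /dev/da2; do
#     echo "$dev: $(crc_count "$(smartctl -A "$dev")")"
# done
```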
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109
I'm also wondering how to get rid of the matchname errors shown on my console in my previous post.
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109
In your CIFS config uncheck the option for hostname lookups.
Do these indicate something not working correctly with my network? One odd thing about my network is that my computers generally show up in the File Explorer 'Network' section, but FreeNAS does not. However, I can map a drive to FreeNAS using its IP address. Most of my network is Windows 10.


The UDMA CRC errors might be a cabling issue. Power off the machine and swap out cables.
Just to clarify... Are you talking about the SATA cables?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
The hostname lookup errors indicate that you don't have a local DNS server. Yes, SATA cables.
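
A simple way to judge whether a cable swap helped is to note the UDMA_CRC_Error_Count raw value before the swap and compare it after some more transfers. This is a minimal sketch, assuming the raw value is a running total, so what matters is whether it keeps rising; the helper name crc_trend is hypothetical.

```shell
#!/bin/sh
# Compare two readings of the CRC error count (before and after the swap)
# and report whether the count grew.
crc_trend() {
    before="$1"; after="$2"
    if [ "$after" -gt "$before" ]; then
        echo "still increasing (+$((after - before)))"
    else
        echo "stable"
    fi
}

# Example with the da0 count from this thread as the baseline.
crc_trend 7 7
```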
 

Zaaphod

Contributor
Joined
Dec 15, 2015
Messages
109
The hostname lookup errors indicate that you don't have a local DNS server.

That was on my list of things to figure out some day. I have an ESXi server running that I was thinking of putting a DNS server on. Any suggestions on what to use for this purpose?

Is there some way to clear existing SMART errors after I change cables?
 