SOLVED Device detaches randomly [Help]

Status
Not open for further replies.

NadoNate

Dabbler
Joined
Aug 7, 2017
Messages
18
Hello,
I'm running into an error I can't seem to troubleshoot my way out of. I would appreciate any help!

System:

Freenas: FreeNAS-9.10.2-U6 (32GB USB 2)
Storage layout: 1 Pool - 5 4TB (Seagate) RaidZ1-0
Mainboard/CPU: ASUS M5A78L-M, AMD FX(tm)-6350 Six-Core Processor
Memory: 16080MB DDR3-1066
PSU: 500 WATT CoolerMaster 80 plus silver efficiency rating

Issue:
Device ada3 detaches randomly, almost once daily. It looks like it is reconnected at some point, and everything reports OK....?

npelChK.png


Code:

Aug  5 10:34:53 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  5 10:34:53 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  5 11:35:01 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  5 11:35:01 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  5 12:09:58 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  5 12:09:58 lyonas ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
Aug  5 12:09:58 lyonas GEOM_ELI: Device ada3p1.eli destroyed.
Aug  5 12:09:58 lyonas GEOM_ELI: Detached ada3p1.eli on last close.
Aug  5 12:09:58 lyonas (ada3:ahcich4:0:0:0): Periph destroyed
Aug  5 12:10:04 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  5 12:10:04 lyonas ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
Aug  5 12:10:04 lyonas ada3: Serial Number S3013JZR
Aug  5 12:10:04 lyonas ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Aug  5 12:10:04 lyonas ada3: Command Queueing enabled
Aug  5 12:10:04 lyonas ada3: 3815447MB (7814037168 512 byte sectors)
Aug  5 12:10:04 lyonas ada3: quirks=0x1<4K>
Aug  5 12:10:04 lyonas ada3: Previously was known as ad12
Aug  5 12:10:05 lyonas ZFS: vdev state changed, pool_guid=13987490016491561389 vdev_guid=13611940111043943577
Aug  5 12:35:10 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  5 12:35:10 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  5 13:35:20 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  5 13:35:20 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  5 14:35:32 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  5 14:35:32 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  5 15:35:41 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  5 20:00:27 lyonas manage.py: [common.pipesubr:66] Popen()ing: zfs list -H -o mountpoint,name
Aug  5 20:00:29 lyonas notifier: Performing sanity check on Samba configuration: OK
Aug  5 20:00:29 lyonas notifier: Reloading nmbd.
Aug  5 20:00:29 lyonas notifier: Reloading smbd.
Aug  5 20:00:29 lyonas notifier: Reloading winbindd.
Aug  5 20:00:29 lyonas smbd: dnssd_clientstub DNSServiceProcessResult called with DNSServiceRef with no ProcessReply function
Aug  5 20:00:29 lyonas notifier: Stopping mdnsd.
Aug  5 20:00:29 lyonas notifier: Waiting for PIDS: 2835.
Aug  5 20:00:29 lyonas notifier: Starting mdnsd.
Aug  5 20:00:29 lyonas notifier: Cannot 'reload' netatalk. Set netatalk_enable to YES in /etc/rc.conf or use 'onereload' instead of 'reload'.
Aug  5 20:02:04 lyonas manage.py: [common.pipesubr:66] Popen()ing: zfs list -H -o mountpoint,name
Aug  5 20:02:13 lyonas manage.py: [common.pipesubr:66] Popen()ing: zfs list -H -o mountpoint,name
Aug  5 20:02:13 lyonas manage.py: [common.pipesubr:66] Popen()ing: zfs list -H -o mountpoint,name
Aug  5 20:02:15 lyonas notifier: Performing sanity check on Samba configuration: OK
Aug  5 20:02:15 lyonas notifier: Reloading nmbd.
Aug  5 20:02:15 lyonas notifier: Reloading smbd.
Aug  5 20:02:15 lyonas notifier: Reloading winbindd.
Aug  5 20:02:15 lyonas notifier: Stopping mdnsd.
Aug  5 20:02:15 lyonas notifier: Waiting for PIDS: 46115.
Aug  5 20:02:15 lyonas notifier: Starting mdnsd.
Aug  5 20:02:15 lyonas notifier: Cannot 'reload' netatalk. Set netatalk_enable to YES in /etc/rc.conf or use 'onereload' instead of 'reload'.
Aug  5 20:02:25 lyonas manage.py: [common.pipesubr:66] Popen()ing: zfs list -H -o mountpoint,name
Aug  5 20:02:48 lyonas manage.py: [common.pipesubr:66] Popen()ing: zfs list -H -o mountpoint,name
Aug  5 20:02:48 lyonas manage.py: [common.pipesubr:66] Popen()ing: zfs list -H -o mountpoint,name
Aug  5 20:02:50 lyonas notifier: Performing sanity check on Samba configuration: OK
Aug  5 20:02:50 lyonas notifier: Reloading nmbd.
Aug  5 20:02:50 lyonas notifier: Reloading smbd.
Aug  5 20:02:50 lyonas notifier: Reloading winbindd.
Aug  5 20:02:50 lyonas notifier: Stopping mdnsd.
Aug  5 20:02:50 lyonas notifier: Waiting for PIDS: 46834.
Aug  5 20:02:50 lyonas notifier: Starting mdnsd.
Aug  5 20:02:50 lyonas notifier: Cannot 'reload' netatalk. Set netatalk_enable to YES in /etc/rc.conf or use 'onereload' instead of 'reload'.
Aug  5 20:11:49 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  5 20:11:49 lyonas ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
Aug  5 20:11:49 lyonas (ada3:ahcich4:0:0:0): Periph destroyed
Aug  5 20:11:55 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  5 20:11:55 lyonas ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
Aug  5 20:11:55 lyonas ada3: Serial Number S3013JZR
Aug  5 20:11:55 lyonas ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Aug  5 20:11:55 lyonas ada3: Command Queueing enabled
Aug  5 20:11:55 lyonas ada3: 3815447MB (7814037168 512 byte sectors)
Aug  5 20:11:55 lyonas ada3: quirks=0x1<4K>
Aug  5 20:11:55 lyonas ada3: Previously was known as ad12
Aug  5 20:12:10 lyonas ZFS: vdev state changed, pool_guid=13987490016491561389 vdev_guid=13611940111043943577
Aug  5 20:18:46 lyonas kernel: arp: 192.168.1.11 moved from 02:ff:a0:00:05:0a to 00:1b:21:91:e6:f6 on epair0b
Aug  6 00:00:00 lyonas syslog-ng[1480]: Configuration reload request received, reloading configuration;
Aug  6 00:37:01 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  6 00:37:01 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  6 01:37:09 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  6 01:37:09 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  6 02:22:45 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  6 02:22:45 lyonas ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
Aug  6 02:22:46 lyonas (ada3:ahcich4:0:0:0): Periph destroyed
Aug  6 02:22:51 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  6 02:22:51 lyonas ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
Aug  6 02:22:51 lyonas ada3: Serial Number S3013JZR
Aug  6 02:22:51 lyonas ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Aug  6 02:22:51 lyonas ada3: Command Queueing enabled
Aug  6 02:22:51 lyonas ada3: 3815447MB (7814037168 512 byte sectors)
Aug  6 02:22:51 lyonas ada3: quirks=0x1<4K>
Aug  6 02:22:51 lyonas ada3: Previously was known as ad12
Aug  6 02:22:52 lyonas ZFS: vdev state changed, pool_guid=13987490016491561389 vdev_guid=13611940111043943577
Aug  6 02:27:18 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  6 02:27:18 lyonas ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
Aug  6 02:27:18 lyonas (ada3:ahcich4:0:0:0): Periph destroyed
Aug  6 02:27:24 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  6 02:27:24 lyonas ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
Aug  6 02:27:24 lyonas ada3: Serial Number S3013JZR
Aug  6 02:27:24 lyonas ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Aug  6 02:27:24 lyonas ada3: Command Queueing enabled
Aug  6 02:27:24 lyonas ada3: 3815447MB (7814037168 512 byte sectors)
Aug  6 02:27:24 lyonas ada3: quirks=0x1<4K>
Aug  6 02:27:24 lyonas ada3: Previously was known as ad12
Aug  6 02:27:25 lyonas ZFS: vdev state changed, pool_guid=13987490016491561389 vdev_guid=13611940111043943577
Aug  7 00:40:36 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 00:40:36 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 01:40:43 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 01:40:43 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 02:40:51 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 02:40:51 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 03:40:58 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 03:40:58 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 04:41:05 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 04:41:05 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 05:41:13 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 05:41:13 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 06:41:20 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 06:41:20 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 07:41:30 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 07:41:30 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 08:41:39 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 08:41:39 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 09:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170807.0900-2w"
Aug  7 09:41:49 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 09:41:49 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 10:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170807.1000-2w"
Aug  7 10:41:59 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 10:41:59 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 11:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170807.1100-2w"
Aug  7 11:37:41 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  7 11:37:41 lyonas ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
Aug  7 11:37:41 lyonas (ada3:ahcich4:0:0:0): Periph destroyed
Aug  7 11:37:47 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  7 11:37:47 lyonas ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
Aug  7 11:37:47 lyonas ada3: Serial Number S3013JZR
Aug  7 11:37:47 lyonas ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Aug  7 11:37:47 lyonas ada3: Command Queueing enabled
Aug  7 11:37:47 lyonas ada3: 3815447MB (7814037168 512 byte sectors)
Aug  7 11:37:47 lyonas ada3: quirks=0x1<4K>
Aug  7 11:37:47 lyonas ada3: Previously was known as ad12
Aug  7 11:37:47 lyonas ZFS: vdev state changed, pool_guid=13987490016491561389 vdev_guid=13611940111043943577
Aug  7 11:42:07 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 11:42:07 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 12:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170807.1200-2w"
Aug  7 12:42:16 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 12:42:16 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 13:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170807.1300-2w"
Aug  7 13:42:25 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 13:42:25 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 14:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170807.1400-2w"
Aug  7 14:42:33 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  7 14:42:33 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  7 14:42:59 lyonas kernel: arp: 192.168.1.11 moved from 02:ff:a0:00:05:0a to 00:1b:21:91:e6:f6 on epair0b



This is the last email alert I received on the 6th:

Code:
lyonas.local changes in mounted filesystems:
4a5
> /mnt/HOWITZER/KIDSTV  /mnt/HOWITZER/jails/plexmediaserver_1/media/KIDSTV nullfs	   rw			  0 0
22a24
> HOWITZER/KIDSTV			   /mnt/HOWITZER/KIDSTV	zfs	 rw,nfsv4acls	0 0

lyonas.local kernel log messages:
> ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
> ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
> GEOM_ELI: Device ada3p1.eli destroyed.
> GEOM_ELI: Detached ada3p1.eli on last close.
> (ada3:ahcich4:0:0:0): Periph destroyed
> ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
> ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
> ada3: Serial Number S3013JZR
> ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada3: Command Queueing enabled
> ada3: 3815447MB (7814037168 512 byte sectors)
> ada3: quirks=0x1<4K>
> ada3: Previously was known as ad12
> ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
> ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
> (ada3:ahcich4:0:0:0): Periph destroyed
> ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
> ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
> ada3: Serial Number S3013JZR
> ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada3: Command Queueing enabled
> ada3: 3815447MB (7814037168 512 byte sectors)
> ada3: quirks=0x1<4K>
> ada3: Previously was known as ad12
> arp: 192.168.1.11 moved from 02:ff:a0:00:05:0a to 00:1b:21:91:e6:f6 on epair0b
> ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
> ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
> (ada3:ahcich4:0:0:0): Periph destroyed
> ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
> ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
> ada3: Serial Number S3013JZR
> ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada3: Command Queueing enabled
> ada3: 3815447MB (7814037168 512 byte sectors)
> ada3: quirks=0x1<4K>
> ada3: Previously was known as ad12
> ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
> ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
> (ada3:ahcich4:0:0:0): Periph destroyed
> ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
> ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
> ada3: Serial Number S3013JZR
> ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada3: Command Queueing enabled
> ada3: 3815447MB (7814037168 512 byte sectors)
> ada3: quirks=0x1<4K>
> ada3: Previously was known as ad12

-- End of security output --


Troubleshooting steps:

Replace hardware: HDD, SATA cable, Mainboard/CPU
Restore FreeNas defaults
Scrub
Memtest (OK)

Any suggestions would be great! As I have stated, the device in question has been changed out with one of equal size. I don't think I would be seeing the same behavior with two different devices? Could I have a bad PSU?
 

nojohnny101

Wizard
Joined
Dec 3, 2015
Messages
1,478
Have you run smart (long and short) tests on the offending drive? Are you sure it is the same drive (tracked by serial number)?

Can you post the smart output for the drive?
 

NadoNate

Dabbler
Joined
Aug 7, 2017
Messages
18
Serial #'s verified, here's the output:

Code:
=== START OF INFORMATION SECTION ===
Model Family:	 Seagate Desktop HDD.15
Device Model:	 ST4000DM000-1F2168
Serial Number:	S3013JZR
LU WWN Device Id: 5 000c50 07fb44982
Firmware Version: CC54
User Capacity:	4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:	 512 bytes logical, 4096 bytes physical
Rotation Rate:	5900 rpm
Form Factor:	  3.5 inches
Device is:		In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:	Mon Aug  7 16:57:01 2017 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
										was never started.
										Auto Offline Data Collection: Disabled.
Self-test execution status:	  (   0) The previous self-test routine completed
										without error or no self-test has ever
										been run.
Total time to complete Offline
data collection:				(  138) seconds.
Offline data collection
capabilities:					(0x73) SMART execute Offline immediate.
										Auto Offline data collection on/off support.
										Suspend Offline collection upon new
										command.
										No Offline surface scan supported.
										Self-test supported.
										Conveyance Self-test supported.
										Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
										power-saving mode.
										Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
										General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   1) minutes.
Extended self-test routine
recommended polling time:		( 515) minutes.
Conveyance self-test routine
recommended polling time:		(   2) minutes.
SCT capabilities:			  (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate	 0x000f   110   099   006	Pre-fail  Always	   -	   28893296
  3 Spin_Up_Time			0x0003   097   091   000	Pre-fail  Always	   -	   0
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -	   81
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -	   0
  7 Seek_Error_Rate		 0x000f   072   060   030	Pre-fail  Always	   -	   43120084863
  9 Power_On_Hours		  0x0032   082   082   000	Old_age   Always	   -	   16026
 10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -	   0
 12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -	   75
183 Runtime_Bad_Block	   0x0032   100   100   000	Old_age   Always	   -	   0
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -	   0
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -	   0
188 Command_Timeout		 0x0032   100   100   000	Old_age   Always	   -	   0 0 0
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -	   0
190 Airflow_Temperature_Cel 0x0022   075   065   045	Old_age   Always	   -	   25 (Min/Max 25/25)
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always	   -	   0
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -	   46
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always	   -	   149
194 Temperature_Celsius	 0x0022   025   040   000	Old_age   Always	   -	   25 (0 9 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -	   0
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -	   0
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always	   -	   0
240 Head_Flying_Hours	   0x0000   100   253   000	Old_age   Offline	  -	   16024h+46m+56.317s
241 Total_LBAs_Written	  0x0000   100   253   000	Old_age   Offline	  -	   30293138592
242 Total_LBAs_Read		 0x0000   100   253   000	Old_age   Offline	  -	   198551321142

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline	   Completed without error	   00%	 16026		 -
# 2  Short offline	   Completed without error	   00%	 15250		 -
# 3  Short offline	   Completed without error	   00%	 14601		 -
# 4  Short offline	   Completed without error	   00%	 13855		 -
# 5  Short offline	   Completed without error	   00%	 13135		 -
# 6  Short offline	   Completed without error	   00%	 12394		 -
# 7  Short offline	   Completed without error	   00%	 11660		 -
# 8  Short offline	   Completed without error	   00%	 10940		 -
# 9  Short offline	   Completed without error	   00%	 10461		 -
#10  Short offline	   Completed without error	   00%	  9741		 -
#11  Short offline	   Completed without error	   00%	  8997		 -
#12  Short offline	   Completed without error	   00%	  8302		 -
#13  Short offline	   Completed without error	   00%	  7558		 -
#14  Short offline	   Completed without error	   00%	  6813		 -
#15  Short offline	   Completed without error	   00%	  6107		 -
#16  Short offline	   Completed without error	   00%	  5386		 -
#17  Short offline	   Completed without error	   00%	  4667		 -
#18  Short offline	   Completed without error	   00%	  3922		 -
#19  Short offline	   Completed without error	   00%	  3179		 -
#20  Short offline	   Completed without error	   00%	  2478		 -
#21  Short offline	   Completed without error	   00%	  1781		 -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
	1		0		0  Not_testing
	2		0		0  Not_testing
	3		0		0  Not_testing
	4		0		0  Not_testing
	5		0		0  Not_testing
Selective self-test flags (0x0):



Seems like I haven't run a long test since I replaced the drive. I can do that tonight...
 

NadoNate

Dabbler
Joined
Aug 7, 2017
Messages
18
I didn't have time to post the results of the long test before work.

I ran it all night, when I checked this morning it still had 10% remaining after 9 hours.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Not too weird, let it run a little longer. You'll know it's a problem for sure at 24 hours. Even 12, really.
 

NadoNate

Dabbler
Joined
Aug 7, 2017
Messages
18
Ok, just got back. Here is the output from the extended test:

Code:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
										was never started.
										Auto Offline Data Collection: Disabled.
Self-test execution status:	  (   0) The previous self-test routine completed
										without error or no self-test has ever
										been run.
Total time to complete Offline
data collection:				(  138) seconds.
Offline data collection
capabilities:					(0x73) SMART execute Offline immediate.
										Auto Offline data collection on/off supp														 ort.
										Suspend Offline collection upon new
										command.
										No Offline surface scan supported.
										Self-test supported.
										Conveyance Self-test supported.
										Selective Self-test supported.
SMART capabilities:			(0x0003) Saves SMART data before entering
										power-saving mode.
										Supports SMART auto save timer.
Error logging capability:		(0x01) Error logging supported.
										General Purpose Logging supported.
Short self-test routine
recommended polling time:		(   1) minutes.
Extended self-test routine
recommended polling time:		( 515) minutes.
Conveyance self-test routine
recommended polling time:		(   2) minutes.
SCT capabilities:			  (0x1085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME		  FLAG	 VALUE WORST THRESH TYPE	  UPDATED  WHEN_														 FAILED RAW_VALUE
  1 Raw_Read_Error_Rate	 0x000f   116   099   006	Pre-fail  Always	   -																106131584
  3 Spin_Up_Time			0x0003   098   091   000	Pre-fail  Always	   -																0
  4 Start_Stop_Count		0x0032   100   100   020	Old_age   Always	   -																85
  5 Reallocated_Sector_Ct   0x0033   100   100   010	Pre-fail  Always	   -																0
  7 Seek_Error_Rate		 0x000f   071   060   030	Pre-fail  Always	   -																56005373192
  9 Power_On_Hours		  0x0032   082   082   000	Old_age   Always	   -																16098
10 Spin_Retry_Count		0x0013   100   100   097	Pre-fail  Always	   -																0
12 Power_Cycle_Count	   0x0032   100   100   020	Old_age   Always	   -																75
183 Runtime_Bad_Block	   0x0032   100   100   000	Old_age   Always	   -																0
184 End-to-End_Error		0x0032   100   100   099	Old_age   Always	   -																0
187 Reported_Uncorrect	  0x0032   100   100   000	Old_age   Always	   -																0
188 Command_Timeout		 0x0032   100   100   000	Old_age   Always	   -																0 0 0
189 High_Fly_Writes		 0x003a   100   100   000	Old_age   Always	   -																0
190 Airflow_Temperature_Cel 0x0022   074   065   045	Old_age   Always	   -																26 (Min/Max 25/26)
191 G-Sense_Error_Rate	  0x0032   100   100   000	Old_age   Always	   -																0
192 Power-Off_Retract_Count 0x0032   100   100   000	Old_age   Always	   -																44
193 Load_Cycle_Count		0x0032   100   100   000	Old_age   Always	   -																1377
194 Temperature_Celsius	 0x0022   026   040   000	Old_age   Always	   -																26 (0 9 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000	Old_age   Always	   -																0
198 Offline_Uncorrectable   0x0010   100   100   000	Old_age   Offline	  -																0
199 UDMA_CRC_Error_Count	0x003e   200   200   000	Old_age   Always	   -																0
240 Head_Flying_Hours	   0x0000   100   253   000	Old_age   Offline	  -																16089h+21m+53.031s
241 Total_LBAs_Written	  0x0000   100   253   000	Old_age   Offline	  -																30382655072
242 Total_LBAs_Read		 0x0000   100   253   000	Old_age   Offline	  -																198572856859

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description	Status				  Remaining  LifeTime(hours)  LBA														 _of_first_error
# 1  Extended offline	Completed without error	   00%	 16038		 -
# 2  Short offline	   Completed without error	   00%	 16026		 -
# 3  Short offline	   Completed without error	   00%	 15250		 -
# 4  Short offline	   Completed without error	   00%	 14601		 -
# 5  Short offline	   Completed without error	   00%	 13855		 -
# 6  Short offline	   Completed without error	   00%	 13135		 -
# 7  Short offline	   Completed without error	   00%	 12394		 -
# 8  Short offline	   Completed without error	   00%	 11660		 -
# 9  Short offline	   Completed without error	   00%	 10940		 -
#10  Short offline	   Completed without error	   00%	 10461		 -
#11  Short offline	   Completed without error	   00%	  9741		 -
#12  Short offline	   Completed without error	   00%	  8997		 -
#13  Short offline	   Completed without error	   00%	  8302		 -
#14  Short offline	   Completed without error	   00%	  7558		 -
#15  Short offline	   Completed without error	   00%	  6813		 -
#16  Short offline	   Completed without error	   00%	  6107		 -
#17  Short offline	   Completed without error	   00%	  5386		 -
#18  Short offline	   Completed without error	   00%	  4667		 -
#19  Short offline	   Completed without error	   00%	  3922		 -
#20  Short offline	   Completed without error	   00%	  3179		 -
#21  Short offline	   Completed without error	   00%	  2478		 -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
	1		0		0  Not_testing
	2		0		0  Not_testing
	3		0		0  Not_testing
	4		0		0  Not_testing
	5		0		0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Checked the logs, looks like its still happening...

Code:
Aug  9 12:04:40 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  9 12:04:40 lyonas ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
Aug  9 12:04:40 lyonas (ada3:ahcich4:0:0:0): Periph destroyed
Aug  9 12:04:46 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug  9 12:04:46 lyonas ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
Aug  9 12:04:46 lyonas ada3: Serial Number S3013JZR
Aug  9 12:04:46 lyonas ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Aug  9 12:04:46 lyonas ada3: Command Queueing enabled
Aug  9 12:04:46 lyonas ada3: 3815447MB (7814037168 512 byte sectors)
Aug  9 12:04:46 lyonas ada3: quirks=0x1<4K>
Aug  9 12:04:46 lyonas ada3: Previously was known as ad12
Aug  9 12:04:47 lyonas ZFS: vdev state changed, pool_guid=13987490016491561389 vdev_guid=13611940111043943577
Aug  9 12:49:10 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 12:49:10 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 13:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170809.1300-2w"
Aug  9 13:49:19 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 13:49:19 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 14:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170809.1400-2w"
Aug  9 14:49:28 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 14:49:28 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 15:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170809.1500-2w"
Aug  9 15:49:37 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 15:49:37 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 16:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170809.1600-2w"
Aug  9 16:49:47 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 16:49:47 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 17:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170809.1700-2w"
Aug  9 17:49:55 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 17:49:55 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 18:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170809.1800-2w"
Aug  9 18:50:04 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 18:50:04 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 19:50:16 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 19:50:16 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 20:50:26 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 20:50:26 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 21:50:35 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 21:50:35 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 22:50:43 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 22:50:43 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug  9 23:50:50 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug  9 23:50:50 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 00:00:00 lyonas syslog-ng[1480]: Configuration reload request received, reloading configuration;
Aug 10 00:50:57 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 00:50:57 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 01:51:05 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 01:51:05 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 02:51:12 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 02:51:12 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 03:51:20 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 03:51:20 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 04:51:28 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 04:51:28 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 05:51:38 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 05:51:38 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 06:51:48 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 06:51:48 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 07:51:57 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 07:51:57 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 08:52:05 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 08:52:05 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 09:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170810.0900-2w"
Aug 10 09:52:14 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 09:52:14 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 10:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170810.1000-2w"
Aug 10 10:52:22 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 10:52:22 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 11:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170810.1100-2w"
Aug 10 11:52:30 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Aug 10 11:52:30 lyonas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Aug 10 12:00:01 lyonas autosnap.py: [tools.autosnap:66] Popen()ing: /sbin/zfs snapshot "HOWITZER@auto-20170810.1200-2w"
Aug 10 12:23:31 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug 10 12:23:31 lyonas ada3: <ST4000DM000-1F2168 CC54> s/n S3013JZR detached
Aug 10 12:23:31 lyonas (ada3:ahcich4:0:0:0): Periph destroyed
Aug 10 12:23:37 lyonas ada3 at ahcich4 bus 0 scbus4 target 0 lun 0
Aug 10 12:23:37 lyonas ada3: <ST4000DM000-1F2168 CC54> ACS-2 ATA SATA 3.x device
Aug 10 12:23:37 lyonas ada3: Serial Number S3013JZR
Aug 10 12:23:37 lyonas ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Aug 10 12:23:37 lyonas ada3: Command Queueing enabled
Aug 10 12:23:37 lyonas ada3: 3815447MB (7814037168 512 byte sectors)
Aug 10 12:23:37 lyonas ada3: quirks=0x1<4K>
Aug 10 12:23:37 lyonas ada3: Previously was known as ad12
 

NadoNate

Dabbler
Joined
Aug 7, 2017
Messages
18
Alright, I think I got it:

yiSSujr.jpg


You are free to chastise me about using consumer grade hardware :confused: ;)

Seems someone before me identified a continuity issue... I re-routed my power bypassing the bad cable & my issue seems to be resolved!

Thanks @Ericloewe & @nojohnny101 for your suggestions. Mod can mark this thread as SOLVED
 
Status
Not open for further replies.
Top