SOLVED FreeNAS down ...

Status
Not open for further replies.

deju

Cadet
Joined
Jun 22, 2015
Messages
5
Hi all together,

I'm using freenas for a longer time. And this project is great. But ..

Yesterday my Freenas informed me per email about:
The volume Vol1RaidZ (ZFS) state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.

-> I think the problem starts at: May 30 09:39:38

and an hour later:
The volume Vol1RaidZ (ZFS) state is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.


At the server, the system totally hangs, so tried to reboot (with i/o button). Now it is impossible to boot the normal system. The booting phase is broken and stopped in the "DB mode"

The Unix prompt like:

Code:
DB>


The only possibility I found, was removing one of my three RaidZ disks and the systems cames up with an DEGRADED zpool.

A zpool list shows the following:
Code:
[root@backup01] ~# zpool import
   pool: Vol1RaidZ
     id: 3781705845334976039
  state: DEGRADED
status: One or more devices are missing from the system.
action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: http://illumos.org/msg/ZFS-8000-2Q
config:

        Vol1RaidZ                                       DEGRADED
          raidz1-0                                      DEGRADED
            gptid/ecdbc9f7-0ee7-11e5-a8a2-d8d385af799b  ONLINE
            gptid/eda61235-0ee7-11e5-a8a2-d8d385af799b  ONLINE
            8979334234234281760                         UNAVAIL  cannot open
[root@backup01] ~#


// makes sense, because one disk was removed.

I don't have an idea.

Enclosed some logs:

/var/log/message
Code:
May 30 00:00:00 backup01 newsyslog[88366]: logfile turned over due to size>100K
May 30 00:00:01 backup01 syslog-ng[1588]: Configuration reload request received, reloading configuration;
...
May 30 03:51:29 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
May 30 03:51:48 backup01 ahcich1: Timeout on slot 1 port 0
May 30 03:51:48 backup01 ahcich1: is 00000008 cs 00000000 ss 00000000 rs 00000002 tfd 40 serr 00000000 cmd 00006117
May 30 03:51:48 backup01 (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 f8 ef 06 40 43 00 00 00 00 00
May 30 03:51:48 backup01 (ada1:ahcich1:0:0:0): CAM status: Command timeout
May 30 03:51:48 backup01 (ada1:ahcich1:0:0:0): Retrying command
May 30 03:52:30 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
...
May 30 08:59:52 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
May 30 09:00:07 backup01 autosnap.py: [tools.autosnap:71] Popen()ing: /sbin/zfs snapshot "Vol1RaidZ/SnapDaily@auto-20160530.0900-60d"
May 30 09:00:08 backup01 autosnap.py: [tools.autosnap:71] Popen()ing: /sbin/zfs destroy -r -d "Vol1RaidZ/SnapDaily@auto-20160331.0900-60d"
May 30 09:00:53 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
...
May 30 09:38:45 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
May 30 09:39:38 backup01 ahcich1: Timeout on slot 27 port 0
May 30 09:39:38 backup01 ahcich1: is 00000008 cs 00000000 ss 00000000 rs 08000000 tfd 40 serr 00000000 cmd 00007b17
May 30 09:39:38 backup01 (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 80 68 37 40 3a 01 00 00 00 00
May 30 09:39:38 backup01 (ada1:ahcich1:0:0:0): CAM status: Command timeout
May 30 09:39:38 backup01 (ada1:ahcich1:0:0:0): Retrying command
May 30 09:39:46 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
May 30 09:40:08 backup01 ahcich1: Timeout on slot 27 port 0
May 30 09:40:08 backup01 ahcich1: is 00000002 cs 00000000 ss 00000000 rs 08000000 tfd 50 serr 00000000 cmd 00007b17
May 30 09:40:08 backup01 (aprobe0:ahcich1:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
May 30 09:40:08 backup01 (aprobe0:ahcich1:0:0:0): CAM status: Command timeout
May 30 09:40:08 backup01 (aprobe0:ahcich1:0:0:0): Error 5, Retry was blocked
May 30 09:40:39 backup01 ahcich1: Timeout on slot 27 port 0
May 30 09:40:39 backup01 ahcich1: is 00000002 cs 00000000 ss 00000000 rs 08000000 tfd 50 serr 00000000 cmd 00007b17
May 30 09:40:39 backup01 (aprobe0:ahcich1:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
May 30 09:40:39 backup01 (aprobe0:ahcich1:0:0:0): CAM status: Command timeout
May 30 09:40:39 backup01 (aprobe0:ahcich1:0:0:0): Error 5, Retry was blocked
May 30 09:40:39 backup01 ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
May 30 09:40:39 backup01 ada1: <WDC WD30EFRX-68EUZN0 82.00A82> s/n WD-WMC4N0J74762 detached
May 30 09:40:39 backup01 GEOM_ELI: g_eli_read_done() failed ada1p1.eli[READ(offset=4681728, length=4096)]
May 30 09:40:39 backup01 swap_pager: I/O error - pagein failed; blkno 1143,size 4096, error 6
May 30 09:40:39 backup01 vm_fault: pager read error, pid 1207 (devd)
May 30 09:40:39 backup01 GEOM_ELI: g_eli_read_done() failed ada1p1.eli[READ(offset=7057408, length=4096)]
May 30 09:40:39 backup01 swap_pager: I/O error - pagein failed; blkno 1723,size 4096, error 6
May 30 09:40:39 backup01 vm_fault: pager read error, pid 1207 (devd)
May 30 09:40:39 backup01 GEOM_ELI: g_eli_read_done() failed ada1p1.eli[READ(offset=7057408, length=4096)]
May 30 09:40:39 backup01 swap_pager: I/O error - pagein failed; blkno 1723,size 4096, error 6
May 30 09:40:39 backup01 vm_fault: pager read error, pid 1207 (devd)
May 30 09:40:39 backup01 GEOM_ELI: Device ada1p1.eli destroyed.
May 30 09:40:39 backup01 GEOM_ELI: Detached ada1p1.eli on last close.
May 30 09:40:39 backup01 swap_pager: I/O error - pagein failed; blkno 1724,size 4096, error 6
May 30 09:40:39 backup01 vm_fault: pager read error, pid 1207 (devd)
May 30 09:40:39 backup01 kernel: Failed to write core file for process devd (error 14)
May 30 09:40:39 backup01 kernel: Failed to write core file for process devd (error 14)
May 30 09:40:39 backup01 kernel: pid 1207 (devd), uid 0: exited on signal 11
May 30 09:40:39 backup01 (ada1:ahcich1:0:0:0): Periph destroyed
May 30 09:40:47 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
May 30 09:41:04 backup01 swap_pager: I/O error - pagein failed; blkno 7202,size 8192, error 6
May 30 09:41:04 backup01 vm_fault: pager read error, pid 2617 (python2.7)
May 30 09:41:04 backup01 kernel: pid 2617 (python2.7), uid 0: exited on signal 11
May 30 09:41:48 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
...
May 30 09:50:59 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
May 30 09:51:22 backup01 swap_pager: I/O error - pagein failed; blkno 3341,size 12288, error 6
May 30 09:51:22 backup01 vm_fault: pager read error, pid 2331 (smartd)
May 30 09:51:22 backup01 swap_pager: I/O error - pagein failed; blkno 3341,size 12288, error 6
May 30 09:51:22 backup01 vm_fault: pager read error, pid 2331 (smartd)
May 30 09:51:22 backup01 kernel: Failed to write core file for process smartd (error 14)
May 30 09:51:22 backup01 kernel: Failed to write core file for process smartd (error 14)
May 30 09:51:22 backup01 kernel: pid 2331 (smartd), uid 0: exited on signal 11
May 30 09:52:00 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
...
May 30 10:59:21 backup01 mountd[1770]: mount request denied from 192.168.100.100 for /mnt/Vol1RaidZ/Normal/isos
May 30 11:02:58 backup01 syslog-ng[1588]: syslog-ng starting up; version='3.5.6'
May 30 11:02:58 backup01 Copyright (c) 1992-2014 The FreeBSD Project.
May 30 11:02:58 backup01 Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
May 30 11:02:58 backup01     The Regents of the University of California. All rights reserved.
May 30 11:02:58 backup01 FreeBSD is a registered trademark of The FreeBSD Foundation.
May 30 11:02:58 backup01 FreeBSD 9.3-RELEASE-p31 #0 r288272+33bb475: Wed Feb  3 02:19:35 PST 2016
May 30 11:02:58 backup01 root@build3.ixsystems.com:/tank/home/stable-builds/FN/objs/os-base/amd64/tank/home/stable-builds/FN/FreeBSD/src/sys/FREENAS.amd64 amd64
May 30 11:02:58 backup01 gcc version 4.2.1 20070831 patched [FreeBSD]
May 30 11:02:58 backup01 CPU: AMD Athlon(tm) II Neo N36L Dual-Core Processor (1297.87-MHz K8-class CPU)
May 30 11:02:58 backup01 Origin = "AuthenticAMD"  Id = 0x100f63  Family = 0x10  Model = 0x6  Stepping = 3
May 30 11:02:58 backup01 Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
May 30 11:02:58 backup01 Features2=0x802009<SSE3,MON,CX16,POPCNT>
May 30 11:02:58 backup01 AMD Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
May 30 11:02:58 backup01 AMD Features2=0x8377f<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,Prefetch,OSVW,IBS,SKINIT,WDT,NodeId>
May 30 11:02:58 backup01 TSC: P-state invariant
May 30 11:02:58 backup01 real memory  = 6174015488 (5888 MB)
May 30 11:02:58 backup01 avail memory = 4589174784 (4376 MB)
May 30 11:02:58 backup01 Event timer "LAPIC" quality 400
May 30 11:02:58 backup01 ACPI APIC Table: <HP     ProLiant>
May 30 11:02:58 backup01 FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
May 30 11:02:58 backup01 FreeBSD/SMP: 1 package(s) x 2 core(s)
May 30 11:02:58 backup01 cpu0 (BSP): APIC ID:  0
May 30 11:02:58 backup01 cpu1 (AP): APIC ID:  1
May 30 11:02:58 backup01 WARNING: VIMAGE (virtualized network stack) is a highly experimental feature.
May 30 11:02:58 backup01 ioapic0 <Version 2.1> irqs 0-23 on motherboard
May 30 11:02:58 backup01 kbd1 at kbdmux0
May 30 11:02:58 backup01 cryptosoft0: <software crypto> on motherboard
May 30 11:02:58 backup01 aesni0: No AESNI support.
May 30 11:02:58 backup01 padlock0: No ACE support.
May 30 11:02:58 backup01 acpi0: <HP ProLiant> on motherboard
May 30 11:02:58 backup01 acpi0: Power Button (fixed)
May 30 11:02:58 backup01 acpi0: reservation of fee00000, 1000 (3) failed
May 30 11:02:58 backup01 acpi0: reservation of ffb80000, 80000 (3) failed
May 30 11:02:58 backup01 acpi0: reservation of fec10000, 20 (3) failed
May 30 11:02:58 backup01 acpi0: reservation of fed80000, 1000 (3) failed
May 30 11:02:58 backup01 acpi0: reservation of 0, a0000 (3) failed
May 30 11:02:58 backup01 acpi0: reservation of 100000, aff00000 (3) failed
May 30 11:02:58 backup01 cpu0: <ACPI CPU> on acpi0
May 30 11:02:58 backup01 cpu1: <ACPI CPU> on acpi0
May 30 11:02:58 backup01 attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
May 30 11:02:58 backup01 Timecounter "i8254" frequency 1193182 Hz quality 0
May 30 11:02:58 backup01 Event timer "i8254" frequency 1193182 Hz quality 100
May 30 11:02:58 backup01 atrtc0: <AT realtime clock> port 0x70-0x71 irq 8 on acpi0
May 30 11:02:58 backup01 Event timer "RTC" frequency 32768 Hz quality 0
May 30 11:02:58 backup01 hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
May 30 11:02:58 backup01 Timecounter "HPET" frequency 14318180 Hz quality 950
May 30 11:02:58 backup01 Event timer "HPET" frequency 14318180 Hz quality 550
May 30 11:02:58 backup01 Event timer "HPET1" frequency 14318180 Hz quality 450
May 30 11:02:58 backup01 Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
May 30 11:02:58 backup01 acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
May 30 11:02:58 backup01 pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
May 30 11:02:58 backup01 pci0: <ACPI PCI bus> on pcib0
May 30 11:02:58 backup01 pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
May 30 11:02:58 backup01 pci1: <ACPI PCI bus> on pcib1
May 30 11:02:58 backup01 vgapci0: <VGA-compatible display> port 0xe000-0xe0ff mem 0xd0000000-0xdfffffff,0xfe8f0000-0xfe8fffff,0xfe700000-0xfe7fffff irq 18 at device 5.0 on pci1
May 30 11:02:58 backup01 vgapci0: Boot video device
May 30 11:02:58 backup01 pcib2: <ACPI PCI-PCI bridge> irq 18 at device 6.0 on pci0
May 30 11:02:58 backup01 pci2: <ACPI PCI bus> on pcib2
May 30 11:02:58 backup01 bge0: <HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100> mem 0xfe9f0000-0xfe9fffff irq 18 at device 0.0 on pci2
May 30 11:02:58 backup01 bge0: CHIP ID 0x05784100; ASIC REV 0x5784; CHIP REV 0x57841; PCI-E
May 30 11:02:58 backup01 miibus0: <MII bus> on bge0
May 30 11:02:58 backup01 brgphy0: <BCM5784 10/100/1000baseT PHY> PHY 1 on miibus0
May 30 11:02:58 backup01 brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
May 30 11:02:58 backup01 bge0: Ethernet address: d8:d3:85:af:79:9b
May 30 11:02:58 backup01 ahci0: <AMD SB7x0/SB8x0/SB9x0 AHCI SATA controller> port 0xd000-0xd007,0xc000-0xc003,0xb000-0xb007,0xa000-0xa003,0x9000-0x900f mem 0xfe6ffc00-0xfe6fffff irq 19 at device 17.0 on pci0
May 30 11:02:58 backup01 ahci0: AHCI v1.20 with 4 3Gbps ports, Port Multiplier supported
May 30 11:02:58 backup01 ahcich0: <AHCI channel> at channel 0 on ahci0
May 30 11:02:58 backup01 ahcich1: <AHCI channel> at channel 1 on ahci0
May 30 11:02:58 backup01 ahcich2: <AHCI channel> at channel 2 on ahci0
May 30 11:02:58 backup01 ahcich3: <AHCI channel> at channel 3 on ahci0
May 30 11:02:58 backup01 ohci0: <AMD SB7x0/SB8x0/SB9x0 USB controller> mem 0xfe6fe000-0xfe6fefff irq 18 at device 18.0 on pci0
May 30 11:02:58 backup01 usbus0 on ohci0
May 30 11:02:58 backup01 ehci0: <AMD SB7x0/SB8x0/SB9x0 USB 2.0 controller> mem 0xfe6ff800-0xfe6ff8ff irq 17 at device 18.2 on pci0
May 30 11:02:58 backup01 usbus1: EHCI version 1.0
May 30 11:02:58 backup01 usbus1 on ehci0
May 30 11:02:58 backup01 ohci1: <AMD SB7x0/SB8x0/SB9x0 USB controller> mem 0xfe6fd000-0xfe6fdfff irq 18 at device 19.0 on pci0
May 30 11:02:58 backup01 usbus2 on ohci1
May 30 11:02:58 backup01 ehci1: <AMD SB7x0/SB8x0/SB9x0 USB 2.0 controller> mem 0xfe6ff400-0xfe6ff4ff irq 17 at device 19.2 on pci0
May 30 11:02:58 backup01 usbus3: EHCI version 1.0
May 30 11:02:58 backup01 usbus3 on ehci1
May 30 11:02:58 backup01 atapci0: <ATI IXP700/800 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 20.1 on pci0
May 30 11:02:58 backup01 ata0: <ATA channel> at channel 0 on atapci0
May 30 11:02:58 backup01 ata1: <ATA channel> at channel 1 on atapci0
May 30 11:02:58 backup01 isab0: <PCI-ISA bridge> at device 20.3 on pci0
May 30 11:02:58 backup01 isa0: <ISA bus> on isab0
May 30 11:02:58 backup01 pcib3: <ACPI PCI-PCI bridge> at device 20.4 on pci0
May 30 11:02:58 backup01 pci3: <ACPI PCI bus> on pcib3
May 30 11:02:58 backup01 ohci2: <AMD SB7x0/SB8x0/SB9x0 USB controller> mem 0xfe6fc000-0xfe6fcfff irq 18 at device 22.0 on pci0





Please have a look at my files and maybe someone can help me.

What do you need more?

T H A N K Y O U in advance.
Juergen

PS: sorry for this real long post.
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215

deju

Cadet
Joined
Jun 22, 2015
Messages
5
Thank you for your prompt answer.

You are talking from this device?

Code:
ada1: <WDC WD30EFRX-68EUZN0 82.00A82> s/n WD-WMC4N0J74762 detached
May 30 09:40:39 backup01 GEOM_ELI: g_eli_read_done() failed ada1p1.eli[READ(offset=4681728, length=4096)]
 

Mirfster

Doesn't know what he's talking about
Joined
Oct 2, 2015
Messages
3,215
Yes, you can always try to add that drive back to the pool and/or just to get SMART status of it. However, since you are running only a RaidZ1 your pool is in danger since it can only withstand losing one drive. Personally, if I were in your situation I would opt to get another drive in there that is "known good". You can always check out the drive having issues later, when your pool is back up to full health.
 

Sakuru

Guru
Joined
Nov 20, 2015
Messages
527

deju

Cadet
Joined
Jun 22, 2015
Messages
5
Thank you all for your assistance.
I will buy some RAM and try again.

At the moment I'm testing with smartctl, but the first results looks like Old_age and not like errors.
 

deju

Cadet
Joined
Jun 22, 2015
Messages
5
The RAM is in, the system is running, but there are still some problems regarding the zfs Volume.

I have currently started a scrub.
Code:
zpool scrub Volume


Regarding the smartctl outputs:
Enclosed my stati:

smartctl -a /dev/ada1
Code:
SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   181   180   021    Pre-fail  Always       -       5941
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       20
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8550
10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       20
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       11
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       129
194 Temperature_Celsius     0x0022   122   114   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      8542         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


smartctl -a /dev/ada2
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   183   180   021    Pre-fail  Always       -       5841
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       17
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8534
10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       17
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       9
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       125
194 Temperature_Celsius     0x0022   122   116   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


smartctl -a /dev/ada3
Code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   180   179   021    Pre-fail  Always       -       6000
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       19
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   089   089   000    Old_age   Always       -       8549
10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       19
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       10
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       122
194 Temperature_Celsius     0x0022   122   115   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      8541         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


So for me disk 1 and 3 seems to be ok (not new but ok). At disk 2 I have to run the test.

This is the state.

Thank you for your support
Juergen
 

deju

Cadet
Joined
Jun 22, 2015
Messages
5
.... magic :- )

Yesterday I had started the scrub of my volume.

One HD was degraded,
round about 200 errors was listed at the state.

So I shut down the server.
Now I have restarted my server and the alert sign at the top right of my FreeNAS Gui is lighting green.

zpool status -v
Code:
  pool: Vol1RaidZ
state: ONLINE
  scan: scrub repaired 7.64G in 7h53m with 0 errors on Fri Jun  3 01:49:43 2016
config:

        NAME                                            STATE     READ WRITE CKSUM
        Vol1RaidZ                                       ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/ecdbc9f7-0ee7-11e5-a8a2-d8d385af799b  ONLINE       0     0     0
            gptid/eda61235-0ee7-11e5-a8a2-d8d385af799b  ONLINE       0     0     0
            gptid/ee740ae3-0ee7-11e5-a8a2-d8d385af799b  ONLINE       0     0     0

errors: No known data errors


For me it seems that everything is fine!?!.

Thank you all for your support.
Juergen
 

Bidule0hm

Server Electronics Sorcerer
Joined
Aug 5, 2013
Messages
3,710
Checksum error counts are reinitialized on reboot.
 
Status
Not open for further replies.
Top