Can I delete these files causing errors?

csh8428

Dabbler
Joined
Nov 5, 2012
Messages
45
I see this error in my alerts every time I boot: "One or more devices has experienced an error resulting in data corruption."
Can I just delete the files referenced in zpool status -v?
Should I do anything else?

Code:
pool: Green2TB
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 0 days 06:24:44 with 4 errors on Sun Nov 18 09:24:46 2018
config:

        NAME                                          STATE     READ WRITE CKSUM
        Green2TB                                      ONLINE       0     0    12
          gptid/efb2c03f-1497-11e5-9074-8c89a5ddf217  ONLINE       0     0    48

errors: Permanent errors have been detected in the following files:

        /mnt/Green2TB/.system/syslog-26bec1b654b54f3489f57aa43ee342f9/log
        Green2TB/.system/syslog-26bec1b654b54f3489f57aa43ee342f9:<0x2e>
        Green2TB/.system/syslog-26bec1b654b54f3489f57aa43ee342f9:<0x3e>



smartctl -a /dev/ada4 results

Code:
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD20EARX-00PASB0
Serial Number:    WD-WCAZAJ856713
LU WWN Device Id: 5 0014ee 207a6bce9
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Jan 19 14:07:09 2019 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (39660) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 382) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1
  3 Spin_Up_Time            0x0027   240   167   021    Pre-fail  Always       -       2975
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       946
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       10342
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       747
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       239
193 Load_Cycle_Count        0x0032   125   125   000    Old_age   Always       -       227269
194 Temperature_Celsius     0x0022   121   102   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       2
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       2

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
Complete hardware list please. So far, the setup you are running shows exactly what NOT to do when setting up FreeNAS. You are not performing regular SMART tests: none have been logged since this drive was put in service. You also have a pool consisting of a single disk with no redundancy. Your WD Green drive has an excessive number of load cycles and has obviously not had its head parking timer changed using wdidle3.

I hope you don't have any critical data on that system and if you do I hope you have a good backup.
 

csh8428

Dabbler
Joined
Nov 5, 2012
Messages
45
"Complete hardware list please."
Sorry. Forgot that part. Added it to the OP.

"So far, the setup you are running shows exactly what NOT to do when setting up FreeNAS. You are not performing regular SMART tests: none have been logged since this drive was put in service."
I'm not a nix guru. I've read about this stuff, but never put it into practice.

"You also have a pool consisting of a single disk with no redundancy."
I have 4 disks. Each is in its own pool. I rsync data from one disk to another every so often to back it up. I don't want all the data backed up, so that's why I have it set up that way.

"Your WD Green drive has an excessive number of load cycles and has obviously not had its head parking timer changed using wdidle3."
What does this mean? Is there a way to fix it going forward?
 

Heracles

Wizard
Joined
Feb 2, 2018
Messages
1,401
Hi CSH,

Having multiple pools of one drive each is a terrible idea. The sync you do is not real-time and will not carry over snapshots, among many other things. With 4 drives, the minimum redundancy would be to put all of them in a single pool as RAIDZ1. The other option is the equivalent of RAID 10: two mirror vdevs of 2 drives each, joined in a single pool.
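
As a rough illustration of the trade-off between those two layouts, here is a small capacity sketch. It assumes all four disks are 2 TB (only one disk's size is shown in this thread) and ignores ZFS overhead, so treat the numbers as approximate.

```shell
#!/bin/sh
# Assumption: four equal disks of 2 TB each (only one disk's size is known from the thread).
disks=4
size_tb=2

# RAIDZ1: one disk's worth of space goes to parity.
raidz1_tb=$(( (disks - 1) * size_tb ))

# Striped mirrors: every disk is paired, so half the raw space is usable.
mirror_tb=$(( disks / 2 * size_tb ))

echo "RAIDZ1 (${disks} disks): ${raidz1_tb} TB usable"
echo "Striped mirrors (2x2): ${mirror_tb} TB usable"
```

Either layout survives the loss of any single disk, which the current one-disk pools cannot.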

The Western Digital Green is not meant for use in a NAS at all. It is engineered for low power consumption and standalone use. The WD Red is the one designed for NAS. Here is just one illustration of the difference between the two:
Should the WD Green have a hard time reading a sector, it will keep retrying for as long as possible to get that data. It does so because it assumes it is the only device capable of answering that read request. As a result, the entire NAS stalls until the drive finally succeeds or gives up.
Should the WD Red have a hard time reading a sector, it will return an error after only a short moment and rely on the other drives to recover that data.

The wdidle3 setting is one of the things that make WD Greens less bad for NAS use than they are by default. Still, they are neither designed nor recommended for that.

Without the SMART tests mentioned above, FreeNAS will not detect that a drive is about to fail, so it cannot flag it before it is too late. These SMART tests are even more important in situations of low or no redundancy like yours. They are the last mechanism that can save your data, so it is crucial that you activate them.

As for the complete hardware list, your description is still well short of what is required. One important element is the RAM. With less than 8 GB of RAM, FreeNAS can behave in strange and irregular ways, so 8 GB is considered the minimum requirement.

FreeNAS is very powerful. Power that is not mastered can do more harm than good. Take care to understand it properly before doing anything serious with it...

Good luck,
 

toadman

Guru
Joined
Jun 4, 2013
Messages
619
"I'm not a nix guru. I've read about this stuff, but never put it into practice."

No need to be a guru. It's covered in the documentation. https://www.ixsystems.com/documentation/freenas/11.2/tasks.html#s-m-a-r-t-tests

You can also search on the forums for some scripts people have written that can email you results.

"I have 4 disks. Each is in its own pool. I rsync data from one disk to another every so often to back it up. I don't want all the data backed up, so that's why I have it set up that way."

You certainly can do it that way, though the usefulness of storage pools comes from not using single disks. Yes, you DO need backups, and no one is suggesting you stop keeping them (preferably on a different system entirely). With single disks you don't get any redundancy and your availability goes way down. As you are seeing now, with only one disk your data can be corrupted and the system has no other copy to "correct" it from.

So yes, as suggested above, the forum will encourage you to rethink your config and add redundancy to your pool via multiple disks. (Several ways to approach that depending on your use case.)


"What does this mean? Is there a way to fix it going forward?"

You can search for wdidle3 (which is an .exe) and how to apply it to WD Green drives. But while you could/should apply it to your existing drive, with the load cycle count shown that drive is going to have to be replaced (in my opinion).
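
For a sense of scale, the SMART output in the first post can be turned into an average head parking rate. This is only arithmetic on the two raw values shown there (Load_Cycle_Count and Power_On_Hours); the function names are made up for the illustration.

```shell
#!/bin/sh
# Raw values copied from the smartctl output in the first post.
load_cycles=227269      # attribute 193, Load_Cycle_Count
power_on_hours=10342    # attribute 9, Power_On_Hours

# Average head load/unload cycles per powered-on hour.
cycles_per_hour() {
    awk -v c="$1" -v h="$2" 'BEGIN { printf "%.1f\n", c / h }'
}

# Average number of seconds between two cycles.
seconds_per_cycle() {
    awk -v c="$1" -v h="$2" 'BEGIN { printf "%.0f\n", 3600 * h / c }'
}

echo "cycles per hour:   $(cycles_per_hour "$load_cycles" "$power_on_hours")"
echo "seconds per cycle: $(seconds_per_cycle "$load_cycles" "$power_on_hours")"
```

That works out to roughly 22 head parks for every hour the drive has been powered on, or one about every 164 seconds on average.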

As an immediate step, I would run the SMART short and long tests on the drive. You can schedule them with the FreeNAS GUI as above, or run them from the command line.

Code:
# smartctl -t short /dev/ada4
# smartctl -t long /dev/ada4


Check results after each run.
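
Checking could look something like the sketch below. `smartctl -l selftest` and `smartctl -A` are standard smartctl options; the `watch_attrs` helper and its choice of attributes are only an illustration based on what came up in this thread, not anything FreeNAS ships.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` output on stdin and print
# "name=raw_value" for a few attributes worth watching on this drive.
watch_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|Load_Cycle_Count)$/ {
        print $2 "=" $10
    }'
}

# On the NAS itself (as root, against the real device):
#   smartctl -l selftest /dev/ada4      # self-test log, once a test has finished
#   smartctl -A /dev/ada4 | watch_attrs

# Demo using lines copied from the first post:
watch_attrs <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
193 Load_Cycle_Count        0x0032   125   125   000    Old_age   Always       -       227269
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       2
EOF
```

If the pending or uncorrectable counts keep climbing between runs, that is a strong hint the drive is on its way out.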
 