Big trouble with DEGRADED pools and reboots

lnix

Dabbler
Joined
Aug 16, 2014
Messages
29
Hello everyone,

I have a big problem with my FreeNAS. Both of my pools, the OS pool (name: freenas-boot) and the data pool (name: pool), were degraded with a lot of errors:

freenas-boot is an SSD connected via USB3
The data pool consists of 4 WD Red 4TB drives in raidz1

Yesterday I had time for a new build. First I changed my BIOS battery (because of problems with the BIOS settings after power disconnections).

Then I upgraded my FreeNAS from 9.10 to 11.2-U3, saved my configuration, did a fresh installation on my SSD, uploaded my old configuration, destroyed my data pool, created a new raidz1 data pool and created the datasets.

After this I started my rsync job from my backup server to my FreeNAS, but the FreeNAS rebooted itself many times and my rsync job died. So I started it again... and again.

After last night, my new freenas-boot pool is DEGRADED and my new raidz1 data pool has new permanent errors in files:

Code:
freenas# zpool status -v
  pool: freenas-boot
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    freenas-boot  DEGRADED     0     0    52
      da0p2     DEGRADED     0     0   208  too many errors

errors: Permanent errors have been detected in the following files:

        freenas-boot/ROOT/default:<0x0>

  pool: pool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
    corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
    entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 123M in 0 days 00:00:03 with 0 errors on Wed Apr  3 21:41:12 2019
config:

    NAME                                            STATE     READ WRITE CKSUM
    pool                                            ONLINE       0     0     0
      raidz1-0                                      ONLINE       0     0     0
        gptid/f82f1ae1-562e-11e9-adfd-6805ca28b2da  ONLINE       0     0     0
        gptid/fc12abb3-562e-11e9-adfd-6805ca28b2da  ONLINE       0     0     0
        gptid/fff16e5c-562e-11e9-adfd-6805ca28b2da  ONLINE       0     0     0
        gptid/03db71f6-562f-11e9-adfd-6805ca28b2da  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /mnt/pool/timemachinebjoern/XXX MacBook Pro.sparsebundle/bands/303
        /mnt/pool/timemachinebjoern/XXX MacBook Pro.sparsebundle/bands/44
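
To compare the error counters after each test run (new PSU, new cables, and so on), the device rows of the output above can be pulled out with a small script. This is only a minimal sketch: it assumes Python 3 is available on some machine, the function name `devices_with_errors` is made up for illustration, and the sample text is copied from the boot pool section above.

```python
import re

# Device rows copied from the `zpool status -v` output above (boot pool only).
SAMPLE = """\
    NAME        STATE     READ WRITE CKSUM
    freenas-boot  DEGRADED     0     0    52
      da0p2     DEGRADED     0     0   208  too many errors
"""

# One device row: name, state, then the READ / WRITE / CKSUM counters.
# Assumption: counters are plain integers; abbreviated counters such as
# "1.2K" are not handled by this sketch.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)

def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) for rows with a non-zero counter."""
    hits = []
    for line in status_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header line or anything that is not a device row
        name, state = m.group(1), m.group(2)
        read, write, cksum = (int(m.group(i)) for i in (3, 4, 5))
        if read or write or cksum:
            hits.append((name, state, read, write, cksum))
    return hits

for row in devices_with_errors(SAMPLE):
    print(row)
```

Rows where all three counters are zero, like the four raidz1 members above, are skipped, so the script only lists devices whose counters need attention.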


Here is the rsync output:
Code:
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: connection unexpectedly closed (464176 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(605) [sender=3.0.9]
ssh: connect to host 192.168.1.3 port 22: Connection refused


Whenever I start my rsync job, the FreeNAS system reboots... but why?

Specs:
8GB non-ECC RAM (I know it's not ECC! :( )
ASRock Q1900-ITX board with Intel(R) Celeron(R) CPU J1900 @ 1.99GHz
be quiet! PSU
4 WD Red disks (SATA)
1 small SSD (via USB3)
Intel EXPI9301CTBLK PRO1000 CT network card, PCIe, bulk

My system is between 4 and 5 years old.

The SMART tests are ok.

Do you have any ideas?

Thanks a lot.

Regards
lnix
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
Er, your boot drive has a ton of errors, but you didn't replace it? raidz1 is generally not recommended.
SMART tests can show OK even when the drives are failing, so that doesn't really help without the actual SMART output.
 

lnix

Hi artlessknave, thanks for your answer. I think the errors after the fresh installation came from the crashes during the rsync jobs. How can I check my boot drive?

I can't understand why the system reboots (without a clean shutdown) when I start my rsync jobs.
 

artlessknave

You give specs, but you don't say what the wattage of that PSU is. My first guess is that the PSU is no longer sufficient, and trying to do ...anything with your disks is overloading it, and everything goes to hell from there. Crashes of rsync jobs should not cause ZFS errors anywhere, *particularly* not on the boot drive, where you should not be rsync'ing at all. ZFS showing errors is a really bad thing: since it doesn't change data in place, the only things that should corrupt files are bad disks or a bad connection. That means either the drives or the connection to the drives isn't stable, or the drives are failing.
 

SweetAndLow

Sweet'NASty
Joined
Nov 6, 2013
Messages
6,421
I'm also curious what made you wipe everything including your data and start over? You have some kind of hardware problem here. Post your smartctl -a output for all drives and the rest of your system specs.
 

lnix

Today I am getting a new PSU (be quiet! ATX 300W System Power B9 Bulk BN206) for testing. I will post my smartctl -a output later. I will also replace all SATA cables if the new PSU doesn't help.
 

artlessknave

be quiet! doesn't sound like one of the known good brands, and that wattage is not one I would recommend. Not all PSUs are created equal.
 

lnix

I replaced the PSU, but the problem isn't solved. Yes, be quiet! isn't an enterprise solution, but neither is my NAS.

Here are the SMART outputs:

Boot SSD Disk

Code:
freenas# smartctl -a /dev/da0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     SanDisk based SSDs
Device Model:     SanDisk SDSSDP064G
Serial Number:    130309401046
LU WWN Device Id: 5 001b44 962a0a5d6
Firmware Version: 2.0.0
User Capacity:    64,023,257,088 bytes [64.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      1.8 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Apr  6 11:55:30 2019 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (  120) seconds.
Offline data collection
capabilities:              (0x51) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (  12) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0002   100   100   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0002   100   100   000    Old_age   Always       -       20510
 12 Power_Cycle_Count       0x0002   100   100   000    Old_age   Always       -       846
171 Program_Fail_Count      0x0002   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0002   100   100   000    Old_age   Always       -       0
173 Avg_Write/Erase_Count   0x0002   100   100   000    Old_age   Always       -       8
174 Unexpect_Power_Loss_Ct  0x0002   100   100   000    Old_age   Always       -       209
187 Reported_Uncorrect      0x0002   100   100   000    Old_age   Always       -       0
230 Perc_Write/Erase_Count  0x0002   100   100   000    Old_age   Always       -       26
232 Perc_Avail_Resrvd_Space 0x0003   100   100   005    Pre-fail  Always       -       0
234 Perc_Write/Erase_Ct_BC  0x0002   100   100   000    Old_age   Always       -       36
241 Total_LBAs_Written      0x0002   100   100   000    Old_age   Always       -       322972930
242 Total_LBAs_Read         0x0002   100   100   000    Old_age   Always       -       153642742

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
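
One thing that can be read from the attribute table above is how many of the SSD's power cycles were logged as unexpected power losses. The following is a minimal sketch, assuming Python 3. The helper name `raw_values` is hypothetical, and the three rows are copied from the output above. Both counters are lifetime totals, so they cannot show when the losses happened, and what exactly attribute 174 counts depends on the vendor.

```python
# Three attribute rows copied from the boot SSD's smartctl output above.
SAMPLE = """\
  9 Power_On_Hours          0x0002   100   100   000    Old_age   Always       -       20510
 12 Power_Cycle_Count       0x0002   100   100   000    Old_age   Always       -       846
174 Unexpect_Power_Loss_Ct  0x0002   100   100   000    Old_age   Always       -       209
"""

def raw_values(attr_table):
    """Map attribute name -> integer RAW_VALUE for each well-formed row."""
    values = {}
    for line in attr_table.splitlines():
        fields = line.split()
        # Columns: ID, name, flag, value, worst, thresh, type, updated,
        # when_failed, raw value. Rows with a non-numeric raw value are skipped.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[fields[1]] = int(fields[9])
    return values

attrs = raw_values(SAMPLE)
cycles = attrs["Power_Cycle_Count"]
unclean = attrs["Unexpect_Power_Loss_Ct"]
print(f"{unclean} of {cycles} power cycles were unexpected ({unclean / cycles:.0%})")
```

Running the same script again after a test run and comparing the two counters would show whether the reboots are being logged by the drive as power losses.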


Data Pool (4 WD Reds)

1. WD Red 4TB

Code:
freenas# smartctl -a /dev/ada0
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4EDYVY00K
LU WWN Device Id: 5 0014ee 25ff16fe1
Firmware Version: 80.00A80
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Apr  6 11:50:40 2019 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (50880) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 509) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x703d)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   177   176   021    Pre-fail  Always       -       8150
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       726
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   047   047   000    Old_age   Always       -       39085
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       96
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       48
193 Load_Cycle_Count        0x0032   195   195   000    Old_age   Always       -       16184
194 Temperature_Celsius     0x0022   120   107   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     38991         -
# 2  Short offline       Completed without error       00%     38965         -
# 3  Short offline       Completed without error       00%     38726         -
# 4  Short offline       Completed without error       00%     38491         -
# 5  Extended offline    Completed without error       00%     38255         -
# 6  Short offline       Completed without error       00%     38244         -
# 7  Short offline       Completed without error       00%     38052         -
# 8  Short offline       Completed without error       00%     37813         -
# 9  Short offline       Completed without error       00%     37575         -
#10  Short offline       Completed without error       00%     37549         -
#11  Short offline       Completed without error       00%     37162         -
#12  Extended offline    Completed without error       00%     36934         -
#13  Short offline       Completed without error       00%     36923         -
#14  Short offline       Completed without error       00%     36899         -
#15  Short offline       Completed without error       00%     36659         -
#16  Short offline       Completed without error       00%     36419         -
#17  Short offline       Completed without error       00%     36183         -
#18  Short offline       Completed without error       00%     35944         -
#19  Short offline       Completed without error       00%     35702         -
#20  Short offline       Completed without error       00%     35464         -
#21  Short offline       Completed without error       00%     35436         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


2. WD Red 4TB

Code:
freenas# smartctl -a /dev/ada1
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4ER84E2KP
LU WWN Device Id: 5 0014ee 20a9c1bda
Firmware Version: 80.00A80
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sat Apr  6 11:53:17 2019 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (52080) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 521) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x703d)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   180   180   021    Pre-fail  Always       -       7958
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       729
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   047   047   000    Old_age   Always       -       39084
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       98
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       50
193 Load_Cycle_Count        0x0032   195   195   000    Old_age   Always       -       16243
194 Temperature_Celsius     0x0022   120   107   000    Old_age   Always       -       32
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     38990         -
# 2  Short offline       Completed without error       00%     38964         -
# 3  Short offline       Completed without error       00%     38725         -
# 4  Short offline       Completed without error       00%     38490         -
# 5  Extended offline    Completed without error       00%     38254         -
# 6  Short offline       Completed without error       00%     38243         -
# 7  Short offline       Completed without error       00%     38051         -
# 8  Short offline       Completed without error       00%     37812         -
# 9  Short offline       Completed without error       00%     37574         -
#10  Short offline       Completed without error       00%     37549         -
#11  Short offline       Completed without error       00%     37162         -
#12  Extended offline    Completed without error       00%     36933         -
#13  Short offline       Completed without error       00%     36922         -
#14  Short offline       Completed without error       00%     36898         -
#15  Short offline       Completed without error       00%     36659         -
#16  Short offline       Completed without error       00%     36418         -
#17  Short offline       Completed without error       00%     36182         -
#18  Short offline       Completed without error       00%     35940         -
#19  Short offline       Completed without error       00%     35701         -
#20  Short offline       Completed without error       00%     35464         -
#21  Short offline       Completed without error       00%     35436         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
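
The "SATA Version is" line of each output also shows the negotiated link speed next to the maximum the device supports. A minimal sketch, assuming Python 3, for comparing the two across devices follows. The helper name `link_speeds` is hypothetical. The lines for da0 and ada0 are copied from the outputs above and the one for ada2 from the following post. A lower negotiated speed is not a fault by itself, since the port or bridge may simply not support more.

```python
import re

# "SATA Version is" lines as posted in this thread.
LINES = {
    "da0": "SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)",
    "ada0": "SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)",
    "ada2": "SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)",
}

SPEED = re.compile(
    r"SATA Version is:\s+SATA [\d.]+, ([\d.]+) Gb/s \(current: ([\d.]+) Gb/s\)"
)

def link_speeds(line):
    """Return (maximum, current) link speed in Gb/s, or None if not parseable."""
    m = SPEED.search(line)
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))

for dev, line in LINES.items():
    max_speed, current = link_speeds(line)
    note = "below maximum" if current < max_speed else "at maximum"
    print(f"{dev}: {current} of {max_speed} Gb/s ({note})")
```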


....
 

lnix

... and

3. WD Red 4TB

Code:
freenas# smartctl -a /dev/ada2
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E48Y5930
LU WWN Device Id: 5 0014ee 20a9c22ca
Firmware Version: 80.00A80
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Apr  6 11:53:50 2019 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (52020) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 520) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x703d)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   177   177   021    Pre-fail  Always       -       8116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       727
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   047   047   000    Old_age   Always       -       39084
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       95
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       47
193 Load_Cycle_Count        0x0032   195   195   000    Old_age   Always       -       16094
194 Temperature_Celsius     0x0022   114   102   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     38990         -
# 2  Short offline       Completed without error       00%     38964         -
# 3  Short offline       Completed without error       00%     38726         -
# 4  Short offline       Completed without error       00%     38490         -
# 5  Extended offline    Completed without error       00%     38255         -
# 6  Short offline       Completed without error       00%     38243         -
# 7  Short offline       Completed without error       00%     38051         -
# 8  Short offline       Completed without error       00%     37812         -
# 9  Short offline       Completed without error       00%     37574         -
#10  Short offline       Completed without error       00%     37549         -
#11  Short offline       Completed without error       00%     37162         -
#12  Extended offline    Completed without error       00%     36933         -
#13  Short offline       Completed without error       00%     36922         -
#14  Short offline       Completed without error       00%     36898         -
#15  Short offline       Completed without error       00%     36659         -
#16  Short offline       Completed without error       00%     36419         -
#17  Short offline       Completed without error       00%     36182         -
#18  Short offline       Completed without error       00%     35943         -
#19  Short offline       Completed without error       00%     35701         -
#20  Short offline       Completed without error       00%     35464         -
#21  Short offline       Completed without error       00%     35436         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


4. WD Red 4TB

Code:
freenas# smartctl -a /dev/ada3
smartctl 6.6 2017-11-05 r4594 [FreeBSD 11.2-STABLE amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4EF87VY7J
LU WWN Device Id: 5 0014ee 20a9c2f9c
Firmware Version: 80.00A80
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Apr  6 11:54:21 2019 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever
                    been run.
Total time to complete Offline
data collection:         (51720) seconds.
Offline data collection
capabilities:              (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      ( 517) minutes.
Conveyance self-test routine
recommended polling time:      (   5) minutes.
SCT capabilities:            (0x703d)    SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   181   177   021    Pre-fail  Always       -       7925
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       728
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   047   047   000    Old_age   Always       -       39085
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       97
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       49
193 Load_Cycle_Count        0x0032   195   195   000    Old_age   Always       -       16282
194 Temperature_Celsius     0x0022   119   107   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     38991         -
# 2  Short offline       Completed without error       00%     38966         -
# 3  Short offline       Completed without error       00%     38727         -
# 4  Short offline       Completed without error       00%     38491         -
# 5  Extended offline    Completed without error       00%     38256         -
# 6  Short offline       Completed without error       00%     38244         -
# 7  Short offline       Completed without error       00%     38053         -
# 8  Short offline       Completed without error       00%     37813         -
# 9  Short offline       Completed without error       00%     37575         -
#10  Short offline       Completed without error       00%     37550         -
#11  Short offline       Completed without error       00%     37163         -
#12  Extended offline    Completed without error       00%     36935         -
#13  Short offline       Completed without error       00%     36923         -
#14  Short offline       Completed without error       00%     36899         -
#15  Short offline       Completed without error       00%     36660         -
#16  Short offline       Completed without error       00%     36420         -
#17  Short offline       Completed without error       00%     36183         -
#18  Short offline       Completed without error       00%     35941         -
#19  Short offline       Completed without error       00%     35702         -
#20  Short offline       Completed without error       00%     35465         -
#21  Short offline       Completed without error       00%     35437         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.



I got these messages from my NAS (mail):

freenas.local had an unscheduled system reboot.
The operating system successfully came back online at Sat Apr 6 09:26:39 2019.

Why do you think raidz1 is generally not recommended?

Regards

lnix
 

Jailer

Not strong, but bad
Joined
Sep 12, 2014
Messages
4,977
I'd start by running a check of all the individual hardware, specifically the memory. PSU is also questionable.
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
Why do you think raidz1 is generally not recommended?
Long resilver times for large disks vastly increase the chances of another disk failing before the pool is rebuilt. If you have a backup server it's less of a problem, but you should know that raidz2 is recommended for the majority of users.
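To put rough numbers on the resilver point, here is a minimal sketch. The function name and the sustained rates are assumptions for illustration, not measurements from this system; only the capacity is taken from the smartctl output above. ZFS resilvers only allocated blocks, so a mostly empty pool finishes sooner than this, while a full or fragmented pool that is also serving other I/O can take considerably longer.

```python
# Rough estimate of how long rewriting one whole disk takes.
# ASSUMPTION: the throughput figures are illustrative, not measured on this NAS.

def min_rebuild_hours(capacity_bytes: int, mb_per_s: float) -> float:
    """Hours needed to write capacity_bytes at a sustained mb_per_s (1 MB = 10**6 bytes)."""
    if mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = capacity_bytes / (mb_per_s * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    wd_red_4tb = 4_000_787_030_016  # "User Capacity" from the smartctl output
    for rate in (50.0, 100.0, 150.0):  # hypothetical sustained rates in MB/s
        hours = min_rebuild_hours(wd_red_4tb, rate)
        print(f"{rate:5.0f} MB/s -> about {hours:.1f} h for a full disk")
```

During all of those hours a raidz1 pool has no redundancy left, which is the window the recommendation for raidz2 is meant to cover.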
Yes be quit isn't a enterprise
The PSU doesn't need to be enterprise, as there aren't really any available in standard ATX form factor, but there are reliable, well-built PSUs known to provide good, clean, stable power, and the point is to eliminate possible problems. Replacing with the same hardware doesn't really do that.
Nothing from those SMART reports jumps out at me as being bad, and while I still recommend getting a reliable PSU with more wattage, it doesn't look like that's the problem.
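For anyone who wants to repeat that kind of check on saved `smartctl -a` output, here is a minimal sketch. The set of watched attribute IDs (5, 197, 198, 199) and the "raw value above zero" rule are assumptions based on a common rule of thumb, not thresholds defined by smartctl or the drive vendor, and the function name is hypothetical.

```python
import re
from typing import List, Tuple

# ASSUMPTION: these IDs and the "raw value > 0" rule are a rule of thumb,
# not official thresholds.
WATCHED_IDS = {5, 197, 198, 199}

# Matches attribute rows such as:
#   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def suspicious_attributes(smartctl_text: str) -> List[Tuple[int, str, int]]:
    """Return (id, name, raw value) for watched attributes with a nonzero raw value."""
    found = []
    for line in smartctl_text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, self-test log, or any other non-attribute line
        attr_id = int(match.group(1))
        name = match.group(2)
        raw = int(match.group(3))
        if attr_id in WATCHED_IDS and raw > 0:
            found.append((attr_id, name, raw))
    return found
```

In the ada3 report above, all four of those attributes have a raw value of 0, so this would return an empty list, which matches the "nothing jumps out" reading.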
specifically the memory
You can replace your SATA cables, but your non-ECC memory is probably the next thing to check. I didn't think of it before, but random reboots can definitely come from memory errors. Get memtest86 and run it for at least one full pass, and then for a full day if no errors are found (your non-ECC memory needs more testing than ECC would).
Seriously consider getting recommended hardware, or whether FreeNAS is the correct path for you, particularly if you aren't even using replication, one of the primary advantages of ZFS for backups.
Problems like this are why quality hardware is pushed so hard; few people want to spend a ton of their time troubleshooting problems that following the hardware lists usually avoids.
 

lnix

Dabbler
Joined
Aug 16, 2014
Messages
29
Thanks a lot for your answer.

The next reboot happened just now. Here is my log (/var/log/messages).

On my console, the first message was "Fatal trap 12: page fault while in kernel mode":

Code:
Apr  6 18:55:57 freenas syslog-ng[2006]: syslog-ng starting up; version='3.14.1'
Apr  6 18:55:57 freenas Fatal trap 12: page fault while in kernel mode
Apr  6 18:55:57 freenas cpuid = 2; apic id = 04
Apr  6 18:55:57 freenas fault virtual address    = 0x7
Apr  6 18:55:57 freenas fault code        = supervisor read data, page not present
Apr  6 18:55:57 freenas instruction pointer    = 0x20:0xffffffff80e1a610
Apr  6 18:55:57 freenas stack pointer            = 0x28:0xfffffe0222db3ad0
Apr  6 18:55:57 freenas frame pointer            = 0x28:0xfffffe0222db3b30
Apr  6 18:55:57 freenas code segment        = base 0x0, limit 0xfffff, type 0x1b
Apr  6 18:55:57 freenas             = DPL 0, pres 1, long 1, def32 0, gran 1
Apr  6 18:55:57 freenas processor eflags    = interrupt enabled, resume, IOPL = 0
Apr  6 18:55:57 freenas current process        = 15 (solthread 0xfffffff)
Apr  6 18:55:57 freenas Copyright (c) 1992-2018 The FreeBSD Project.
Apr  6 18:55:57 freenas Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
Apr  6 18:55:57 freenas     The Regents of the University of California. All rights reserved.
Apr  6 18:55:57 freenas FreeBSD is a registered trademark of The FreeBSD Foundation.
Apr  6 18:55:57 freenas FreeBSD 11.2-STABLE #0 r325575+9a3c7d8b53f(HEAD): Wed Mar 27 12:41:58 EDT 2019
Apr  6 18:55:57 freenas root@mp20.tn.ixsystems.com:/freenas-releng/freenas/_BE/objs/freenas-releng/freenas/_BE/os/sys/FreeNAS.amd64 amd64
Apr  6 18:55:57 freenas FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 6.0.0)
Apr  6 18:55:57 freenas VT(efifb): resolution 1920x1080
Apr  6 18:55:57 freenas CPU: Intel(R) Celeron(R) CPU  J1900  @ 1.99GHz (2000.05-MHz K8-class CPU)
Apr  6 18:55:57 freenas Origin="GenuineIntel"  Id=0x30678  Family=0x6  Model=0x37  Stepping=8
Apr  6 18:55:57 freenas Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
Apr  6 18:55:57 freenas Features2=0x41d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,RDRAND>
Apr  6 18:55:57 freenas AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
Apr  6 18:55:57 freenas AMD Features2=0x101<LAHF,Prefetch>
Apr  6 18:55:57 freenas Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
Apr  6 18:55:57 freenas VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
Apr  6 18:55:57 freenas TSC: P-state invariant, performance statistics
Apr  6 18:55:57 freenas real memory  = 8589934592 (8192 MB)
Apr  6 18:55:57 freenas avail memory = 7911989248 (7545 MB)
Apr  6 18:55:57 freenas Event timer "LAPIC" quality 600
Apr  6 18:55:57 freenas ACPI APIC Table: <ALASKA A M I >
Apr  6 18:55:57 freenas WARNING: L1 data cache covers less APIC IDs than a core
Apr  6 18:55:57 freenas 0 < 1
Apr  6 18:55:57 freenas FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
Apr  6 18:55:57 freenas FreeBSD/SMP: 1 package(s) x 4 core(s)
Apr  6 18:55:57 freenas WARNING: VIMAGE (virtualized network stack) is a highly experimental feature.
Apr  6 18:55:57 freenas Firmware Warning (ACPI): 32/64X length mismatch in FADT/Gpe0Block: 128/32 (20171214/tbfadt-748)
Apr  6 18:55:57 freenas ioapic0 <Version 2.0> irqs 0-86 on motherboard
Apr  6 18:55:57 freenas SMP: AP CPU #2 Launched!
Apr  6 18:55:57 freenas SMP: AP CPU #3 Launched!
Apr  6 18:55:57 freenas SMP: AP CPU #1 Launched!
Apr  6 18:55:57 freenas Timecounter "TSC" frequency 2000048184 Hz quality 1000
Apr  6 18:55:57 freenas random: entropy device external interface
Apr  6 18:55:57 freenas random: registering fast source Intel Secure Key RNG
Apr  6 18:55:57 freenas random: fast provider: "Intel Secure Key RNG"
Apr  6 18:55:57 freenas kbd1 at kbdmux0
Apr  6 18:55:57 freenas nexus0
Apr  6 18:55:57 freenas cryptosoft0: <software crypto> on motherboard
Apr  6 18:55:57 freenas aesni0: No AESNI support.
Apr  6 18:55:57 freenas padlock0: No ACE support.
Apr  6 18:55:57 freenas acpi0: <ALASKA A M I > on motherboard
Apr  6 18:55:57 freenas acpi0: Power Button (fixed)
Apr  6 18:55:57 freenas unknown: I/O range not supported
Apr  6 18:55:57 freenas cpu0: <ACPI CPU> on acpi0
Apr  6 18:55:57 freenas cpu1: <ACPI CPU> on acpi0
Apr  6 18:55:57 freenas cpu2: <ACPI CPU> on acpi0
Apr  6 18:55:57 freenas cpu3: <ACPI CPU> on acpi0
Apr  6 18:55:57 freenas atrtc0: <AT realtime clock> port 0x70-0x77 on acpi0
Apr  6 18:55:57 freenas atrtc0: Warning: Couldn't map I/O.
Apr  6 18:55:57 freenas atrtc0: registered as a time-of-day clock, resolution 1.000000s
Apr  6 18:55:57 freenas Event timer "RTC" frequency 32768 Hz quality 0
Apr  6 18:55:57 freenas hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 8 on acpi0
Apr  6 18:55:57 freenas Timecounter "HPET" frequency 14318180 Hz quality 950
Apr  6 18:55:57 freenas Event timer "HPET" frequency 14318180 Hz quality 450
Apr  6 18:55:57 freenas Event timer "HPET1" frequency 14318180 Hz quality 440
Apr  6 18:55:57 freenas Event timer "HPET2" frequency 14318180 Hz quality 440
Apr  6 18:55:57 freenas attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
Apr  6 18:55:57 freenas Timecounter "i8254" frequency 1193182 Hz quality 0
Apr  6 18:55:57 freenas Event timer "i8254" frequency 1193182 Hz quality 100
Apr  6 18:55:57 freenas Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
Apr  6 18:55:57 freenas acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
Apr  6 18:55:57 freenas pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
Apr  6 18:55:57 freenas pcib0: _OSC returned error 0x4
Apr  6 18:55:57 freenas pcib0: Length mismatch for 3 range: 10916fff vs 10917000
Apr  6 18:55:57 freenas pci0: <ACPI PCI bus> on pcib0
Apr  6 18:55:57 freenas vgapci0: <VGA-compatible display> port 0xf080-0xf087 mem 0x90000000-0x903fffff,0x80000000-0x8fffffff irq 16 at device 2.0 on pci0
Apr  6 18:55:57 freenas vgapci0: Boot video device
Apr  6 18:55:57 freenas ahci0: <AHCI SATA controller> port 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf020-0xf03f mem 0x90916000-0x909167ff irq 19 at device 19.0 on pci0
Apr  6 18:55:57 freenas ahci0: AHCI v1.30 with 2 3Gbps ports, Port Multiplier not supported
Apr  6 18:55:57 freenas ahcich0: <AHCI channel> at channel 0 on ahci0
Apr  6 18:55:57 freenas ahcich1: <AHCI channel> at channel 1 on ahci0
Apr  6 18:55:57 freenas xhci0: <Intel BayTrail USB 3.0 controller> mem 0x90900000-0x9090ffff irq 20 at device 20.0 on pci0
Apr  6 18:55:57 freenas xhci0: 32 bytes context size, 64-bit DMA
Apr  6 18:55:57 freenas xhci0: Port routing mask set to 0xffffffff
Apr  6 18:55:57 freenas usbus0 on xhci0
Apr  6 18:55:57 freenas usbus0: 5.0Gbps Super Speed USB v3.0
Apr  6 18:55:57 freenas pci0: <encrypt/decrypt> at device 26.0 (no driver attached)
Apr  6 18:55:57 freenas pci0: <multimedia, HDA> at device 27.0 (no driver attached)
Apr  6 18:55:57 freenas pcib1: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
Apr  6 18:55:57 freenas pcib1: [GIANT-LOCKED]
Apr  6 18:55:57 freenas pci1: <ACPI PCI bus> on pcib1
Apr  6 18:55:57 freenas em0: <Intel(R) PRO/1000 Network Connection 7.6.1-k> port 0xe000-0xe01f mem 0x908c0000-0x908dffff,0x90800000-0x9087ffff,0x908e0000-0x908e3fff irq 16 at device 0.0 on pci1
Apr  6 18:55:57 freenas em0: Using MSIX interrupts with 3 vectors
Apr  6 18:55:57 freenas em0: Ethernet address: 68:05:ca:28:b2:da
Apr  6 18:55:57 freenas pcib2: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0
Apr  6 18:55:57 freenas pcib2: [GIANT-LOCKED]
Apr  6 18:55:57 freenas pcib3: <ACPI PCI-PCI bridge> irq 18 at device 28.2 on pci0
Apr  6 18:55:57 freenas pcib3: [GIANT-LOCKED]
Apr  6 18:55:57 freenas pci2: <ACPI PCI bus> on pcib3
Apr  6 18:55:57 freenas re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 0xd000-0xd0ff mem 0x90704000-0x90704fff,0x90700000-0x90703fff irq 18 at device 0.0 on pci2
Apr  6 18:55:57 freenas re0: Using 1 MSI-X message
Apr  6 18:55:57 freenas re0: Chip rev. 0x4c000000
Apr  6 18:55:57 freenas re0: MAC rev. 0x00000000
Apr  6 18:55:57 freenas miibus0: <MII bus> on re0
Apr  6 18:55:57 freenas rgephy0: <RTL8251 1000BASE-T media interface> PHY 1 on miibus0
Apr  6 18:55:57 freenas rgephy0:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
Apr  6 18:55:57 freenas re0: Using defaults for TSO: 65518/35/2048
Apr  6 18:55:57 freenas re0: Ethernet address: d0:50:99:2b:a9:01
Apr  6 18:55:57 freenas pcib4: <ACPI PCI-PCI bridge> irq 19 at device 28.3 on pci0
Apr  6 18:55:57 freenas pcib4: [GIANT-LOCKED]
Apr  6 18:55:57 freenas pci3: <ACPI PCI bus> on pcib4
Apr  6 18:55:57 freenas ahci1: <ASMedia ASM1062 AHCI SATA controller> port 0xc050-0xc057,0xc040-0xc043,0xc030-0xc037,0xc020-0xc023,0xc000-0xc01f mem 0x90600000-0x906001ff irq 19 at device 0.0 on pci3
Apr  6 18:55:57 freenas ahci1: AHCI v1.20 with 2 6Gbps ports, Port Multiplier supported
Apr  6 18:55:57 freenas ahci1: quirks=0xc00000<NOCCS,NOAUX>
Apr  6 18:55:57 freenas ahcich2: <AHCI channel> at channel 0 on ahci1
Apr  6 18:55:57 freenas ahcich3: <AHCI channel> at channel 1 on ahci1
Apr  6 18:55:57 freenas isab0: <PCI-ISA bridge> at device 31.0 on pci0
Apr  6 18:55:57 freenas isa0: <ISA bus> on isab0
Apr  6 18:55:57 freenas acpi_button0: <Power Button> on acpi0
Apr  6 18:55:57 freenas acpi_button1: <Sleep Button> on acpi0
Apr  6 18:55:57 freenas acpi_tz0: <Thermal Zone> on acpi0
Apr  6 18:55:57 freenas uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
Apr  6 18:55:57 freenas ichwd0: <Intel Bay Trail SoC watchdog timer> on isa0
Apr  6 18:55:57 freenas wbwd0: <Nuvoton NCT6776 (0xc3/0x33) Watchdog Timer> at port 0x2e-0x2f on isa0
Apr  6 18:55:57 freenas orm0: <ISA Option ROM> at iomem 0xce800-0xcf7ff on isa0
Apr  6 18:55:57 freenas atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
Apr  6 18:55:57 freenas atkbd0: <AT Keyboard> irq 1 on atkbdc0
Apr  6 18:55:57 freenas kbd0 at atkbd0
Apr  6 18:55:57 freenas atkbd0: [GIANT-LOCKED]
Apr  6 18:55:57 freenas coretemp0: <CPU On-Die Thermal Sensors> on cpu0
Apr  6 18:55:57 freenas est0: <Enhanced SpeedStep Frequency Control> on cpu0
Apr  6 18:55:57 freenas coretemp1: <CPU On-Die Thermal Sensors> on cpu1
Apr  6 18:55:57 freenas est1: <Enhanced SpeedStep Frequency Control> on cpu1
Apr  6 18:55:57 freenas coretemp2: <CPU On-Die Thermal Sensors> on cpu2
Apr  6 18:55:57 freenas est2: <Enhanced SpeedStep Frequency Control> on cpu2
Apr  6 18:55:57 freenas coretemp3: <CPU On-Die Thermal Sensors> on cpu3
Apr  6 18:55:57 freenas est3: <Enhanced SpeedStep Frequency Control> on cpu3
Apr  6 18:55:57 freenas ZFS filesystem version: 5
Apr  6 18:55:57 freenas ZFS storage pool version: features support (5000)
Apr  6 18:55:57 freenas Timecounters tick every 1.000 msec
Apr  6 18:55:57 freenas freenas_sysctl: adding account.
Apr  6 18:55:57 freenas freenas_sysctl: adding directoryservice.
Apr  6 18:55:57 freenas freenas_sysctl: adding middlewared.
Apr  6 18:55:57 freenas freenas_sysctl: adding network.
Apr  6 18:55:57 freenas freenas_sysctl: adding services.
Apr  6 18:55:57 freenas ipfw2 (+ipv6) initialized, divert enabled, nat enabled, default to accept, logging disabled
Apr  6 18:55:57 freenas ugen0.1: <0x8086 XHCI root HUB> at usbus0
Apr  6 18:55:57 freenas uhub0: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
Apr  6 18:55:57 freenas uhub0: 7 ports with 7 removable, self powered
Apr  6 18:55:57 freenas ugen0.2: <ASRock ASM107x> at usbus0
Apr  6 18:55:57 freenas uhub1 on uhub0
Apr  6 18:55:57 freenas uhub1: <ASRock ASM107x, class 9/0, rev 2.10/1.00, addr 1> on usbus0
Apr  6 18:55:57 freenas uhub1: MTT enabled
Apr  6 18:55:57 freenas uhub1: 4 ports with 4 removable, self powered
Apr  6 18:55:57 freenas ugen0.3: <CHICONY HP Basic USB Keyboard> at usbus0
Apr  6 18:55:57 freenas ukbd0 on uhub1
Apr  6 18:55:57 freenas ukbd0: <CHICONY HP Basic USB Keyboard, class 0/0, rev 2.00/1.30, addr 2> on usbus0
Apr  6 18:55:57 freenas kbd2 at ukbd0
Apr  6 18:55:57 freenas ugen0.4: <vendor 0x05e3 USB2.0 Hub> at usbus0
Apr  6 18:55:57 freenas uhub2 on uhub0
Apr  6 18:55:57 freenas uhub2: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/85.37, addr 3> on usbus0
Apr  6 18:55:57 freenas uhub2: 3 ports with 0 removable, self powered
Apr  6 18:55:57 freenas ugen0.5: <ASRock ASM107x> at usbus0
Apr  6 18:55:57 freenas uhub3 on uhub0
Apr  6 18:55:57 freenas uhub3: <ASRock ASM107x, class 9/0, rev 3.00/1.00, addr 4> on usbus0
Apr  6 18:55:57 freenas uhub3: 4 ports with 4 removable, self powered
Apr  6 18:55:57 freenas ugen0.6: <Inateck NS1066> at usbus0
Apr  6 18:55:57 freenas umass0 on uhub3
Apr  6 18:55:57 freenas umass0: <Inateck NS1066, class 0/0, rev 3.00/1.00, addr 5> on usbus0
Apr  6 18:55:57 freenas umass0:  SCSI over Bulk-Only; quirks = 0x8100
Apr  6 18:55:57 freenas umass0:5:0: Attached to scbus5
Apr  6 18:55:57 freenas (probe0:umass-sim0:0:0:0): REPORT LUNS. CDB: a0 00 00 00 00 00 00 00 00 10 00 00
Apr  6 18:55:57 freenas (probe0:umass-sim0:0:0:0): CAM status: SCSI Status Error
Apr  6 18:55:57 freenas (probe0:umass-sim0:0:0:0): SCSI status: Check Condition
Apr  6 18:55:57 freenas (probe0:umass-sim0:0:0:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command operation code)
Apr  6 18:55:57 freenas (probe0:umass-sim0:0:0:0): Error 22, Unretryable error
Apr  6 18:55:57 freenas ugen0.7: <SanDisk Ultra> at usbus0
Apr  6 18:55:57 freenas umass1 on uhub3
Apr  6 18:55:57 freenas umass1: <SanDisk Ultra, class 0/0, rev 3.00/1.00, addr 6> on usbus0
Apr  6 18:55:57 freenas umass1:  SCSI over Bulk-Only; quirks = 0x8100
Apr  6 18:55:57 freenas umass1:6:1: Attached to scbus6
Apr  6 18:55:57 freenas ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
Apr  6 18:55:57 freenas ada0: <WDC WD40EFRX-68WT0N0 80.00A80> ACS-2 ATA SATA 3.x device
Apr  6 18:55:57 freenas ada0: Serial Number WD-WCC4EDYVY00K
Apr  6 18:55:57 freenas ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Apr  6 18:55:57 freenas ada0: Command Queueing enabled
Apr  6 18:55:57 freenas ada0: 3815447MB (7814037168 512 byte sectors)
Apr  6 18:55:57 freenas ada0: quirks=0x1<4K>
Apr  6 18:55:57 freenas ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
Apr  6 18:55:57 freenas ada1: <WDC WD40EFRX-68WT0N0 80.00A80> ACS-2 ATA SATA 3.x device
Apr  6 18:55:57 freenas ada1: Serial Number WD-WCC4ER84E2KP
Apr  6 18:55:57 freenas ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Apr  6 18:55:57 freenas ada1: Command Queueing enabled
Apr  6 18:55:57 freenas ada1: 3815447MB (7814037168 512 byte sectors)
Apr  6 18:55:57 freenas ada1: quirks=0x1<4K>
Apr  6 18:55:57 freenas ada2 at ahcich2 bus 0 scbus2 target 0 lun 0
Apr  6 18:55:57 freenas ada2: <WDC WD40EFRX-68WT0N0 80.00A80> ACS-2 ATA SATA 3.x device
Apr  6 18:55:57 freenas ada2: Serial Number WD-WCC4E48Y5930
Apr  6 18:55:57 freenas ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
Apr  6 18:55:57 freenas ada2: Command Queueing enabled
Apr  6 18:55:57 freenas ada2: 3815447MB (7814037168 512 byte sectors)
Apr  6 18:55:57 freenas ada2: quirks=0x1<4K>
Apr  6 18:55:57 freenas ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
Apr  6 18:55:57 freenas ada3: <WDC WD40EFRX-68WT0N0 80.00A80> ACS-2 ATA SATA 3.x device
Apr  6 18:55:57 freenas ada3: Serial Number WD-WCC4EF87VY7J
Apr  6 18:55:57 freenas ada3: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
Apr  6 18:55:57 freenas ada3: Command Queueing enabled
Apr  6 18:55:57 freenas ada3: 3815447MB (7814037168 512 byte sectors)
Apr  6 18:55:57 freenas ada3: quirks=0x1<4K>
Apr  6 18:55:57 freenas da0 at umass-sim0 bus 0 scbus5 target 0 lun 0
Apr  6 18:55:57 freenas da0: <Inateck  0> Fixed Direct Access SPC-4 SCSI device
Apr  6 18:55:57 freenas da0: Serial Number 0123456789ABDFE
Apr  6 18:55:57 freenas da0: 400.000MB/s transfers
Apr  6 18:55:57 freenas da0: 61057MB (125045424 512 byte sectors)
Apr  6 18:55:57 freenas da0: quirks=0x2<NO_6_BYTE>
Apr  6 18:55:57 freenas da1 at umass-sim1 bus 1 scbus6 target 0 lun 0
Apr  6 18:55:57 freenas da1: <SanDisk Ultra 1.00> Removable Direct Access SPC-4 SCSI device
Apr  6 18:55:57 freenas da1: Serial Number 4C531001351213105583
Apr  6 18:55:57 freenas da1: 400.000MB/s transfers
Apr  6 18:55:57 freenas da1: 118464MB (242614272 512 byte sectors)
Apr  6 18:55:57 freenas da1: quirks=0x2<NO_6_BYTE>
Apr  6 18:55:57 freenas random: unblocking device.
Apr  6 18:55:57 freenas Trying to mount root from zfs:freenas-boot/ROOT/default []...
Apr  6 18:55:57 freenas kernel: em0: link state changed to UP
Apr  6 18:55:57 freenas kernel: em0: link state changed to UP
Apr  6 18:55:57 freenas kernel: re0: link state changed to DOWN
Apr  6 18:55:57 freenas kernel: re0: link state changed to DOWN
Apr  6 18:55:57 freenas GEOM_MIRROR: Device mirror/swap0 launched (2/2).
Apr  6 18:55:57 freenas GEOM_MIRROR: Device mirror/swap1 launched (2/2).
Apr  6 18:55:57 freenas GEOM_ELI: Device mirror/swap0.eli created.
Apr  6 18:55:57 freenas GEOM_ELI: Encryption: AES-XTS 128
Apr  6 18:55:57 freenas GEOM_ELI:     Crypto: software
Apr  6 18:55:57 freenas GEOM_ELI: Device mirror/swap1.eli created.
Apr  6 18:55:57 freenas GEOM_ELI: Encryption: AES-XTS 128
Apr  6 18:55:57 freenas GEOM_ELI:     Crypto: software
Apr  6 18:55:57 freenas pmc: Unknown Intel CPU.
Apr  6 18:55:57 freenas hwpmc: SOFT/16/64/0x67<INT,USR,SYS,REA,WRI>
Apr  6 18:55:57 freenas kernel: em0: link state changed to DOWN
Apr  6 18:55:57 freenas kernel: em0: link state changed to DOWN
Apr  6 18:55:57 freenas kernel: em0: link state changed to UP
Apr  6 18:55:57 freenas kernel: em0: link state changed to UP



Regards
lnix
 

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
I'm not super strong in syslog, but I'm pretty sure page faults are memory-related, and that tells me that your non-recommended RAM, unsurprisingly, is killing your pools, which is why it's not recommended.
Memtest time. Figure out which sticks are bad and replace them. Or better yet, replace the mobo/CPU/RAM.
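If you want to see how often this has happened, a minimal sketch like the following pulls every "Fatal trap" line out of a saved log. The function name is hypothetical and the pattern is based only on the line format shown in the log above; other FreeBSD versions may word it differently.

```python
import re
from typing import List, Tuple

# Matches lines like:
#   Apr  6 18:55:57 freenas Fatal trap 12: page fault while in kernel mode
TRAP = re.compile(r"Fatal trap (\d+): (.+?)\s*$")


def find_fatal_traps(log_text: str) -> List[Tuple[int, str]]:
    """Return (trap number, description) for every 'Fatal trap' line in log_text."""
    traps = []
    for line in log_text.splitlines():
        match = TRAP.search(line)
        if match:
            traps.append((int(match.group(1)), match.group(2)))
    return traps
```

You would feed it the contents of /var/log/messages (the path from the post above). Keep in mind that a trap 12 only says the kernel touched an address it should not have; it does not prove bad RAM by itself, so memtest is still the actual test.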
 

lnix

Dabbler
Joined
Aug 16, 2014
Messages
29
I did a first memtest and there are a lot of errors... I will test each module. Tomorrow I will get my new memory and I will test again with it. If there are any errors, I will buy a server mobo/server CPU/ECC RAM in mini-ITX.
 

Attachments

  • mem.jpg

artlessknave

Wizard
Joined
Oct 29, 2016
Messages
1,506
excellent, glad we found the problem.
 