websmith (Dabbler)
- Joined: Sep 20, 2018
- Messages: 38

Hi,
Today I got two emails from one of my TrueNAS servers.
The first one stated:
Code:
New alerts:
* Pool tank2 state is OFFLINE: None
Current alerts:
* Pool tank2 state is OFFLINE: None
Then, one minute later, I received:
Code:
The following alert has been cleared:
* Pool tank2 state is OFFLINE: None
I have looked through the log files on the server and also run zpool status:
Code:
 pool: tank2
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(5) for details.
  scan: scrub repaired 0B in 03:14:35 with 0 errors on Sun Dec  5 03:14:40 2021
config:
        NAME                                            STATE     READ WRITE CKSUM
        tank2                                           ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/0d932ec4-62c9-11e9-8f21-a0369f09f4ea  ONLINE       0     0     0
            gptid/0e519d79-62c9-11e9-8f21-a0369f09f4ea  ONLINE       0     0     0
        logs
          gpt/tank2_log                                 ONLINE       0     0     0
errors: No known data errors
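To double-check the output above without reading it by eye, here is a minimal sketch that scans `zpool status` text for any device row that is not ONLINE or has a non-zero error counter. The function name `unhealthy_devices` is made up for illustration, and the parsing assumes the five-column NAME/STATE/READ/WRITE/CKSUM layout shown above; other ZFS versions may print extra columns.

```python
# Shortened copy of the zpool status output pasted above.
SAMPLE = """\
  pool: tank2
 state: ONLINE
config:
        NAME                                            STATE     READ WRITE CKSUM
        tank2                                           ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/0d932ec4-62c9-11e9-8f21-a0369f09f4ea  ONLINE       0     0     0
            gptid/0e519d79-62c9-11e9-8f21-a0369f09f4ea  ONLINE       0     0     0
        logs
          gpt/tank2_log                                 ONLINE       0     0     0
errors: No known data errors
"""


def unhealthy_devices(status_text):
    """Return (name, state, [read, write, cksum]) for every suspicious row.

    Hypothetical helper: a row is suspicious when its state is not ONLINE
    or when any of its three error counters is not the string "0".
    """
    problems = []
    for line in status_text.splitlines():
        # Device rows are indented; "config:" and "errors:" lines are not.
        if not line[:1].isspace():
            continue
        parts = line.split()
        # Device rows have exactly five columns; skip the header row.
        if len(parts) != 5 or parts[0] == "NAME":
            continue
        name, state, *counts = parts
        # States are upper-case words such as ONLINE, DEGRADED, OFFLINE.
        if not state.isupper():
            continue
        if state != "ONLINE" or any(count != "0" for count in counts):
            problems.append((name, state, counts))
    return problems


print(unhealthy_devices(SAMPLE))  # prints [] for the output above
```

An empty list here only confirms what the pool looks like now; it says nothing about the state at the moment the alert was sent.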
Code:
root@nas:/var/log # smartctl -a /dev/ada2
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD60EFRX-68L0BN1
Serial Number:    WD-WX11DA8DHFV5
LU WWN Device Id: 5 0014ee 2bb432b67
Firmware Version: 82.00A82
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Dec 11 00:25:02 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                ( 2804) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 682) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   234   197   021    Pre-fail  Always       -       7266
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       188
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   069   069   000    Old_age   Always       -       22997
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       147
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       108
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       5772
194 Temperature_Celsius     0x0022   114   106   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     22777         -
# 2  Extended offline    Completed without error       00%     22058         -
# 3  Extended offline    Completed without error       00%     21313         -
# 4  Extended offline    Completed without error       00%     20594         -
# 5  Extended offline    Completed without error       00%     19851         -
# 6  Extended offline    Completed without error       00%     19108         -
# 7  Extended offline    Completed without error       00%     18389         -
# 8  Extended offline    Completed without error       00%     17645         -
# 9  Extended offline    Completed without error       00%     16926         -
#10  Extended offline    Completed without error       00%     16184         -
#11  Extended offline    Completed without error       00%     15514         -
#12  Extended offline    Completed without error       00%     14771         -
#13  Extended offline    Completed without error       00%     14027         -
#14  Extended offline    Completed without error       00%     13310         -
#15  Extended offline    Completed without error       00%     12568         -
#16  Extended offline    Completed without error       00%     11848         -
#17  Extended offline    Completed without error       00%     11105         -
#18  Extended offline    Completed without error       00%      9643         -
#19  Extended offline    Completed without error       00%      8900         -
#20  Extended offline    Completed without error       00%      8181         -
#21  Extended offline    Completed without error       00%      7439         -
SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Code:
root@nas:/var/log # smartctl -a /dev/ada4
smartctl 7.2 2020-12-30 r5155 [FreeBSD 12.2-RELEASE-p9 amd64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD60EFRX-68L0BN1
Serial Number:    WD-WX11DA8DHSD2
LU WWN Device Id: 5 0014ee 265ed8ac1
Firmware Version: 82.00A82
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Dec 11 00:26:09 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                ( 4604) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 700) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   234   197   021    Pre-fail  Always       -       7300
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       188
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   069   069   000    Old_age   Always       -       22980
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       147
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       111
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       5793
194 Temperature_Celsius     0x0022   111   103   000    Old_age   Always       -       41
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     22760         -
# 2  Extended offline    Completed without error       00%     22041         -
# 3  Extended offline    Completed without error       00%     21297         -
# 4  Extended offline    Completed without error       00%     20578         -
# 5  Extended offline    Completed without error       00%     19834         -
# 6  Extended offline    Completed without error       00%     19091         -
# 7  Extended offline    Completed without error       00%     18372         -
# 8  Extended offline    Completed without error       00%     17629         -
# 9  Extended offline    Completed without error       00%     16909         -
#10  Extended offline    Completed without error       00%     16168         -
#11  Extended offline    Completed without error       00%     15497         -
#12  Extended offline    Completed without error       00%     14754         -
#13  Extended offline    Completed without error       00%     14011         -
#14  Extended offline    Completed without error       00%     13293         -
#15  Extended offline    Completed without error       00%     12551         -
#16  Extended offline    Completed without error       00%     11832         -
#17  Extended offline    Completed without error       00%     11089         -
#18  Extended offline    Completed without error       00%      9627         -
#19  Extended offline    Completed without error       00%      8884         -
#20  Extended offline    Completed without error       00%      8164         -
#21  Extended offline    Completed without error       00%      7422         -
SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
So everything seems to be in order, or at least it looks "normal". I am aware that running with just one drive for the ZIL is probably not a good idea, but it's good enough for me.
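The same kind of check can be sketched for the SMART output above. This is only an illustration: the helper name `nonzero_watched` and the choice of attributes to watch are my own assumptions, and it expects the ten-column attribute table that `smartctl -a` printed for these drives.

```python
# Attributes chosen for illustration; a non-zero raw value on any of them
# usually deserves a closer look (media problems or a bad cable).
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)

# A few rows copied from the /dev/ada2 output above.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   069   069   000    Old_age   Always       -       22997
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""


def nonzero_watched(smart_text):
    """Return {attribute_name: raw_value} for watched attributes whose
    raw value is not the string "0" (hypothetical helper)."""
    found = {}
    for line in smart_text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least ten
        # columns; the raw value is the tenth.
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        name, raw = parts[1], parts[9]
        if name in WATCHED and raw != "0":
            found[name] = raw
    return found


print(nonzero_watched(SAMPLE))  # prints {} for the rows above
```

For both drives above, all four watched attributes have a raw value of 0, which matches the "No Errors Logged" lines.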
Is there anything else I can look into to find out why TrueNAS suddenly decided that my pool was offline, or whether it sent the emails in error?
I am a bit scared now, since I tend to take emails like this very seriously, but since everything seems to be okay, I don't know what to do.
Thanks in advance.