ZFS - pool I/O is currently suspended

lenard2000

Cadet
Joined
Oct 18, 2023
Messages
8
Hi All,

Out of nowhere, my drive was pushed out of rotation and the pool was marked as DEGRADED. To be fair, this pool only has one 16TB drive. I know that this is not the best setup, but I was waiting for my second drive to be shipped.

At this point after a lot of research, I have managed to force my drive to go back online with the following steps:

This got the drive and the pool back online, but I had a lot of ZFS errors. When I looked deeper (zpool status -v Storage1), I found that the errors were tied to some files on the pool. When I tried to resolve them by deleting those files, it seemed to succeed. However, I then noticed that this triggered the error 'pool I/O is currently suspended'. The only way to get past the error is to restart the system, but then the same files that I deleted come back again.
Screenshot 2023-10-17 234554.png


So I guess my question is the following. How can I get rid of these corrupted files and get my system back in order?

Also, just to rule out potential causes of the issue:
  • I have changed the SATA and power cables of the disk.
  • I have other disks in other pools, all working fine.
  • A short SMART test was successful and no issues were found with the disk.
Hardware:
Question | Hardware
Motherboard make and model | Not Sure - If this is a requirement I can open the server as it is not in my docs.
CPU make and model | Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz
RAM quantity | 32GB DDR3
Hard drives, quantity, model numbers, and RAID configuration, including boot drives | See the below table
Hard disk controllers | No External Hard Disk Controllers
Network Cards | No External Network Card.

Hard Disk Configuration:
Brand | Model | Type | Size | Speed | Pool | Deploy Date
Seagate | Exos X18 Enterprise Class | HDD (CMR) | 16TB | 7200 RPM | Storage 1 | 06/09/23
Seagate | Barracuda | HDD (SMR) | 500GB | 7200 RPM | Media Pool |
WD | Blue (Big) | HDD (SMR) | 500GB | Unknown | Media Pool |
WD | Blue (Small) | HDD (SMR) | 500GB | 7200 RPM | Media Pool |
Crucial | BX500 | SSD | 1TB | 6GB/s | Apps | 04/09/23
GoodRam | | SSD | 120GB | | Boot |
Crucial | BX500 | SSD | 240GB | 6GB/s | Boot | Not Yet Deployed


At this point I am truly lost, and any help is much appreciated.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
You have 40 checksum errors (bad). That means data that was read did not pass its checksum test, so something that was written either didn't make it to disk correctly or was corrupted after writing. That your disk passed a short SMART test doesn't mean it is OK. One useful item is to post the (long) output of smartctl -a /dev/sd?; you can always run zpool status -L Storage1 to get drive letters like sda, etc.

So, all drives are hooked to the motherboard connectors? So, the affected drive is the Seagate. Was it new or used? Motherboard info can be useful, yes. What is the power supply?

Replacing the cables was good; however, if the data was already corrupted by bad cables, new cables don't fix the existing corruption.
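
To make the "which device carries the errors" step more concrete, here is a minimal Python sketch that scans the text of zpool status -L for rows with nonzero READ/WRITE/CKSUM counters. The sample text, the counts in it, and the function name devices_with_errors are illustrative assumptions, not output from this system; only the NAME/STATE/READ/WRITE/CKSUM column layout follows what zpool status prints.

```python
# Hypothetical sample in the layout printed by `zpool status -L`.
SAMPLE_STATUS = """\
  pool: Storage1
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        Storage1    ONLINE       0     0     0
          sdd2      ONLINE       0     0    40

errors: 3 data errors, use '-v' for a list
"""


def devices_with_errors(status_text):
    """Return {name: (read, write, cksum)} for config rows with a nonzero counter.

    Counters are kept as strings because zpool may abbreviate large
    values (for example "1.2K").
    """
    flagged = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True          # header row of the config table
            continue
        if not in_config:
            continue
        if not fields:
            break                     # first blank line ends the table
        if len(fields) < 5:
            continue                  # rows without counters (e.g. spares)
        name, _state, read, write, cksum = fields[:5]
        if (read, write, cksum) != ("0", "0", "0"):
            flagged[name] = (read, write, cksum)
    return flagged


print(devices_with_errors(SAMPLE_STATUS))
```

The name reported here (sdd2 in the made-up sample) is what you would then pass to smartctl, usually as the whole disk (for example /dev/sdd) rather than the partition.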
 

lenard2000

Cadet
Joined
Oct 18, 2023
Messages
8
Hi Sfatula,

The checksum errors went over 2000 at one point. It's just that I cleared them and kept running scrubs in hopes that the problem would go away. I am aware that changing the SATA cable won't fix the corruption; however, I eliminated a potential root cause.

The drive was brand new, and it was installed on 06/09/23.

I am currently running the test that you recommended, and I will update soon.
 

lenard2000

Cadet
Joined
Oct 18, 2023
Messages
8
Ran the command that you recommended and I got the following:

Code:
admin@truenas[~]$ sudo smartctl -a /dev/sdd2   
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.107+truenas] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     ST16000NM000J-2TW103
Serial Number:    ZRS0EQZC
LU WWN Device Id: 5 000c50 0e38a5942
Firmware Version: SN02
User Capacity:    16,000,900,661,248 bytes [16.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-4 (minor revision not indicated)
SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Oct 19 11:44:23 2023 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  567) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (1360) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x70bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   082   064   044    Pre-fail  Always       -       150305248
  3 Spin_Up_Time            0x0003   094   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       96
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   074   060   045    Pre-fail  Always       -       27190013
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1008
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       96
 18 Unknown_Attribute       0x000b   100   100   050    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   064   049   000    Old_age   Always       -       36 (Min/Max 31/39)
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       83
193 Load_Cycle_Count        0x0032   095   095   000    Old_age   Always       -       10275
194 Temperature_Celsius     0x0022   036   042   000    Old_age   Always       -       36 (0 26 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0023   100   100   001    Pre-fail  Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       367 (49 148 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       6211762320
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       288648998641

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      90%       960         -
# 2  Short offline       Completed without error       00%       937         -
# 3  Extended offline    Interrupted (host reset)      70%       934         -
# 4  Short offline       Completed without error       00%       761         -
# 5  Short offline       Completed without error       00%       593         -
# 6  Short offline       Completed without error       00%       425         -
# 7  Short offline       Completed without error       00%       268         -
# 8  Short offline       Completed without error       00%        99         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


I will check the motherboard name and power supply name later tonight. In the meantime, I hope that this gives you more insight into my problem.
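
One detail in the self-test log above is easy to miss: every short test completed, but both extended tests (at 960 and 934 power-on hours) were interrupted by a host reset, and the drive reports a recommended polling time of 1360 minutes for an extended test. The following is a minimal Python sketch that pulls this out of the log text. The regular expression and the function names are my own assumptions about the layout shown above, not part of smartmontools.

```python
import re

# First three rows of the self-test log posted above.
SELF_TEST_LOG = """\
# 1  Extended offline    Interrupted (host reset)      90%       960         -
# 2  Short offline       Completed without error       00%       937         -
# 3  Extended offline    Interrupted (host reset)      70%       934         -
"""

# num, test type, status text, remaining %, lifetime hours, first-error LBA
ROW = re.compile(
    r"^#\s*(\d+)\s+(\w+ (?:offline|captive))\s+(.*?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$"
)


def parse_self_tests(text):
    """Turn self-test log rows into dictionaries; non-matching lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "num": int(m.group(1)),
                "type": m.group(2),
                "status": m.group(3),
                "remaining_pct": int(m.group(4)),
                "hours": int(m.group(5)),
            })
    return rows


def finished_extended_tests(rows):
    """Extended tests whose status says they completed."""
    return [r for r in rows
            if r["type"].startswith("Extended") and r["status"].startswith("Completed")]


rows = parse_self_tests(SELF_TEST_LOG)
print(len(rows), "tests parsed;", len(finished_extended_tests(rows)), "extended tests finished")
```

For this log the sketch reports no finished extended test, so the "passed" results so far cover only the short tests.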
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
Yes, so the drive does indeed look good; that's normal for Seagate.
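
For anyone wondering about "normal for Seagate": presumably this refers to the large raw values for Raw_Read_Error_Rate (150305248) and Seek_Error_Rate (27190013) in the output above. A commonly cited community interpretation, which I am treating as an assumption here rather than a documented fact, is that Seagate packs two numbers into these raw values: the low 32 bits count operations and the bits above them count errors. The drive is also listed as "Not in smartctl database", so smartctl prints the raw number unsplit. A minimal Python sketch of that split (split_seagate_raw is a hypothetical helper name):

```python
def split_seagate_raw(raw_value):
    """Split a raw attribute value into (error_count, operation_count).

    Assumption: low 32 bits = number of operations, higher bits = errors.
    """
    errors = raw_value >> 32                 # everything above the low 32 bits
    operations = raw_value & 0xFFFFFFFF      # low 32 bits
    return errors, operations


# Raw values taken from the smartctl output posted above.
print(split_seagate_raw(150305248))   # Raw_Read_Error_Rate -> (0, 150305248)
print(split_seagate_raw(27190013))    # Seek_Error_Rate     -> (0, 27190013)
```

Under that assumption both attributes decode to zero errors, which matches the zero counts for Reallocated_Sector_Ct, Current_Pending_Sector and Offline_Uncorrectable.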

The metadata for the pool may be corrupted. If that is the cause, your only likely resolution is a restore. My questions here are meant to prevent future problems. I've seen firmware corrupt drives too, which is why I want to know what your motherboard is, and its firmware version too.
 

PhilD13

Patron
Joined
Sep 18, 2020
Messages
203
You could run sudo dmidecode | more from the command line, and it will show which motherboard and BIOS version you have.
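
The full dmidecode dump is long, and paging it through more can hide lines. dmidecode also accepts -t baseboard and -t bios to limit the output to those sections. If you already have the full text, the following minimal Python sketch picks out the board and BIOS fields. The sample values and the function name dmi_sections are placeholders I made up for illustration; only the "section title, then indented Key: value lines" layout follows dmidecode's output.

```python
# Hypothetical excerpt in dmidecode's layout; the values are placeholders.
SAMPLE_DMI = """\
Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: Example BIOS Vendor
        Version: 1234
        Release Date: 01/01/2010

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
        Manufacturer: Example Board Maker
        Product Name: EXAMPLE-BOARD
"""


def dmi_sections(text):
    """Map section title -> {key: value} for simple 'Key: value' lines.

    Sections that share a title (e.g. several port connectors) are merged,
    which is fine for the one-off BIOS and Base Board sections used here.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None                      # blank line ends a section
            continue
        if not line[0].isspace():
            if line.startswith("Handle "):
                continue                        # handle line precedes the title
            current = sections.setdefault(line.strip(), {})
        elif current is not None and ":" in line:
            key, _, value = line.strip().partition(":")
            if value.strip():                   # skip list headers like "Features:"
                current[key.strip()] = value.strip()
    return sections


info = dmi_sections(SAMPLE_DMI)
print(info["Base Board Information"]["Product Name"], "/ BIOS", info["BIOS Information"]["Version"])
```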
 

lenard2000

Cadet
Joined
Oct 18, 2023
Messages
8
I ran the command and the following is the result:

Code:
BIOS Information
        Vendor: American Megatrends Inc.
        Version: 1301
        Release Date: 08/27/2010
        Address: 0xF0000
        Runtime Size: 64 kB
        ROM Size: 2 MB
        Characteristics:
                ISA is supported
                PCI is supported
                PNP is supported
                APM is supported
                BIOS is upgradeable
                BIOS shadowing is allowed
                ESCD support is available
                Boot from CD is supported
                Selectable boot is supported
                BIOS ROM is socketed
                EDD is supported
                5.25"/1.2 MB floppy services are supported (int 13h)
                3.5"/720 kB floppy services are supported (int 13h)
                3.5"/2.88 MB floppy services are supported (int 13h)
                Print screen service is supported (int 5h)
                8042 keyboard services are supported (int 9h)
                Serial services are supported (int 14h)
                Printer services are supported (int 17h)
                CGA/mono video services are supported (int 10h)
                ACPI is supported
                USB legacy is supported
                LS-120 boot is supported
                ATAPI Zip drive boot is supported
                BIOS boot specification is supported
                Targeted content distribution is supported
        BIOS Revision: 8.15

Handle 0x0001, DMI type 1, 27 bytes
System Information
        Manufacturer: System manufacturer
        Product Name: System Product Name
        Version: System Version
        Serial Number: System Serial Number
        UUID: a2884a40-fe8d-11d5-b2c9-20cf30e7b9c5
        Wake-up Type: Power Switch
        SKU Number: To Be Filled By O.E.M.
        Family: To Be Filled By O.E.M.


...skipping 1 line
Base Board Information
        Manufacturer: ASUSTeK Computer INC.
        Product Name: P7P55D-E LX
        Version: Rev 1.xx
        Serial Number: MT7009K47100198
        Asset Tag: To Be Filled By O.E.M.
        Features:
                Board is a hosting board
                Board is replaceable
        Location In Chassis: To Be Filled By O.E.M.
        Chassis Handle: 0x0003
        Type: Motherboard
        Contained Object Handles: 0

Handle 0x0003, DMI type 3, 21 bytes
Chassis Information
        Manufacturer: Chassis Manufacture
        Type: Desktop
        Lock: Not Present
        Version: Chassis Version
        Serial Number: Chassis Serial Number
        Asset Tag: Asset-1234567890
        Boot-up State: Safe
        Power Supply State: Safe
        Thermal State: Safe
        Security Status: None
        OEM Information: 0x00000001
        Height: Unspecified
        Number Of Power Cords: 1
        Contained Elements: 0

Handle 0x0004, DMI type 4, 42 bytes
Processor Information
        Socket Designation: LGA1156
        Type: Central Processor
        Family: Core i5
        Manufacturer: Intel
        ID: E5 06 01 00 FF FB EB BF
        Signature: Type 0, Family 6, Model 30, Stepping 5
        Flags:
                FPU (Floating-point unit on-chip)
                VME (Virtual mode extension)
                DE (Debugging extension)
                PSE (Page size extension)
                TSC (Time stamp counter)
                MSR (Model specific registers)
                PAE (Physical address extension)
                MCE (Machine check exception)
                CX8 (CMPXCHG8 instruction supported)
                APIC (On-chip APIC hardware supported)
                SEP (Fast system call)

...skipping 1 line
                PGE (Page global enable)
                MCA (Machine check architecture)
                CMOV (Conditional move instruction supported)
                PAT (Page attribute table)
                PSE-36 (36-bit page size extension)
                CLFSH (CLFLUSH instruction supported)
                DS (Debug store)
                ACPI (ACPI supported)
                MMX (MMX technology supported)
                FXSR (FXSAVE and FXSTOR instructions supported)
                SSE (Streaming SIMD extensions)
                SSE2 (Streaming SIMD extensions 2)
                SS (Self-snoop)
                HTT (Multi-threading)
                TM (Thermal monitor supported)
                PBE (Pending break enabled)
        Version: Intel(R) Core(TM) i5 CPU 760 @ 2.80GHz
        Voltage: 1.0 V
        External Clock: 133 MHz
        Max Speed: 3800 MHz
        Current Speed: 2800 MHz
        Status: Populated, Enabled
        Upgrade: Other
        L1 Cache Handle: 0x0005
        L2 Cache Handle: 0x0006
        L3 Cache Handle: 0x0007
        Serial Number: To Be Filled By O.E.M.
        Asset Tag: To Be Filled By O.E.M.
        Part Number: To Be Filled By O.E.M.
        Core Count: 4
        Core Enabled: 4
        Thread Count: 4
        Characteristics:
                64-bit capable

Handle 0x0005, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L1-Cache
        Configuration: Enabled, Not Socketed, Level 1
        Operational Mode: Write Through
        Location: Internal
        Installed Size: 256 kB
        Maximum Size: 256 kB
        Supported SRAM Types:
                Other
        Installed SRAM Type: Other
        Speed: Unknown
        Error Correction Type: Parity
        System Type: Instruction
        Associativity: 4-way Set-associative


...skipping 1 line
Cache Information
        Socket Designation: L2-Cache
        Configuration: Enabled, Not Socketed, Level 2
        Operational Mode: Write Through
        Location: Internal
        Installed Size: 1 MB
        Maximum Size: 1 MB
        Supported SRAM Types:
                Other
        Installed SRAM Type: Other
        Speed: Unknown
        Error Correction Type: Single-bit ECC
        System Type: Unified
        Associativity: 8-way Set-associative

Handle 0x0007, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L3-Cache
        Configuration: Enabled, Not Socketed, Level 3
        Operational Mode: Write Back
        Location: Internal
        Installed Size: 8 MB
        Maximum Size: 8 MB
        Supported SRAM Types:
                Other
        Installed SRAM Type: Other
        Speed: Unknown
        Error Correction Type: Single-bit ECC
        System Type: Unified
        Associativity: 16-way Set-associative

Handle 0x0008, DMI type 5, 24 bytes
Memory Controller Information
        Error Detecting Method: 64-bit ECC
        Error Correcting Capabilities:
                None
        Supported Interleave: One-way Interleave
        Current Interleave: One-way Interleave
        Maximum Memory Module Size: 2048 MB
        Maximum Total Memory Size: 8192 MB
        Supported Speeds:
                Other
        Supported Memory Types:
                DIMM
                SDRAM
        Memory Module Voltage: 3.3 V
        Associated Memory Slots: 4
                0x0009
                0x000A
                0x000B
                0x000C

...skipping 1 line
                None

Handle 0x0009, DMI type 6, 12 bytes
Memory Module Information
        Socket Designation: DIMM0
        Bank Connections: 0 1
        Current Speed: Unknown
        Type: DIMM SDRAM
        Installed Size: 8192 MB (Double-bank Connection)
        Enabled Size: 8192 MB (Double-bank Connection)
        Error Status: OK

Handle 0x000A, DMI type 6, 12 bytes
Memory Module Information
        Socket Designation: DIMM1
        Bank Connections: 2 3
        Current Speed: Unknown
        Type: DIMM SDRAM
        Installed Size: 8192 MB (Double-bank Connection)
        Enabled Size: 8192 MB (Double-bank Connection)
        Error Status: OK

Handle 0x000B, DMI type 6, 12 bytes
Memory Module Information
        Socket Designation: DIMM2
        Bank Connections: 4 5
        Current Speed: Unknown
        Type: DIMM SDRAM
        Installed Size: 8192 MB (Double-bank Connection)
        Enabled Size: 8192 MB (Double-bank Connection)
        Error Status: OK

Handle 0x000C, DMI type 6, 12 bytes
Memory Module Information
        Socket Designation: DIMM3
        Bank Connections: 6 7
        Current Speed: Unknown
        Type: DIMM SDRAM
        Installed Size: 8192 MB (Double-bank Connection)
        Enabled Size: 8192 MB (Double-bank Connection)
        Error Status: OK

Handle 0x000D, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: PS/2 Keyboard
        Internal Connector Type: None
        External Reference Designator: PS/2 Keyboard
        External Connector Type: PS/2
        Port Type: Keyboard Port

Handle 0x000E, DMI type 8, 9 bytes

...skipping 1 line
        Internal Reference Designator: USB9_10
        Internal Connector Type: None
        External Reference Designator: USB9_10
        External Connector Type: Access Bus (USB)
        Port Type: USB

Handle 0x000F, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: USB11_12
        Internal Connector Type: None
        External Reference Designator: USB11_12
        External Connector Type: Access Bus (USB)
        Port Type: USB

Handle 0x0010, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: USB13_14
        Internal Connector Type: None
        External Reference Designator: USB13_14
        External Connector Type: Access Bus (USB)
        Port Type: USB

Handle 0x0011, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: GbE LAN
        Internal Connector Type: None
        External Reference Designator: GbE LAN
        External Connector Type: RJ-45
        Port Type: Network Port

Handle 0x0012, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: AUDIO
        Internal Connector Type: None
        External Reference Designator: AUDIO
        External Connector Type: Other
        Port Type: Audio Port

Handle 0x0013, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: PS/2 Mouse
        Internal Connector Type: None
        External Reference Designator: PS/2 Mouse
        External Connector Type: PS/2
        Port Type: Mouse Port

Handle 0x0014, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: SATA1
        Internal Connector Type: SAS/SATA Plug Receptacle
        External Reference Designator: Not Specified

...skipping 1 line
        Port Type: SATA

Handle 0x0015, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: SATA2
        Internal Connector Type: SAS/SATA Plug Receptacle
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: SATA

Handle 0x0016, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: SATA3
        Internal Connector Type: SAS/SATA Plug Receptacle
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: SATA

Handle 0x0017, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: SATA4
        Internal Connector Type: SAS/SATA Plug Receptacle
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: SATA

Handle 0x0018, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: SATA5
        Internal Connector Type: SAS/SATA Plug Receptacle
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: SATA

Handle 0x0019, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: SATA6
        Internal Connector Type: SAS/SATA Plug Receptacle
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: SATA

Handle 0x001A, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: SATA_6G_12
        Internal Connector Type: SAS/SATA Plug Receptacle
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: SATA

Handle 0x001B, DMI type 8, 9 bytes

...skipping 1 line
        Internal Reference Designator: USB1_2
        Internal Connector Type: Access Bus (USB)
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: USB

Handle 0x001C, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: USB3_4
        Internal Connector Type: Access Bus (USB)
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: USB

Handle 0x001D, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: USB5_6
        Internal Connector Type: Access Bus (USB)
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: USB

Handle 0x001E, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: USB7_8
        Internal Connector Type: Access Bus (USB)
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: USB

Handle 0x001F, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: USB_3_12
        Internal Connector Type: Access Bus (USB)
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: USB

Handle 0x0020, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: CD
        Internal Connector Type: On Board Sound Input From CD-ROM
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: Audio Port

Handle 0x0021, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: AAFP
        Internal Connector Type: Mini Jack (headphones)
        External Reference Designator: Not Specified

...skipping 1 line
        Port Type: Audio Port

Handle 0x0022, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: CPU_FAN
        Internal Connector Type: Other
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: Other

Handle 0x0023, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: CHA_FAN1
        Internal Connector Type: Other
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: Other

Handle 0x0024, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: PWR_FAN
        Internal Connector Type: Other
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: Other

Handle 0x0025, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: CHA_FAN2
        Internal Connector Type: Other
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: Other

Handle 0x0026, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: PATA_IDE
        Internal Connector Type: On Board IDE
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: Other

Handle 0x0027, DMI type 8, 9 bytes
Port Connector Information
        Internal Reference Designator: F_ESATA
        Internal Connector Type: SAS/SATA Plug Receptacle
        External Reference Designator: Not Specified
        External Connector Type: None
        Port Type: SATA

Handle 0x0028, DMI type 9, 17 bytes

...skipping 1 line
        Designation: PCIEX1_1
        Type: 32-bit PCI Express
        Current Usage: In Use
        Length: Short
        ID: 1
        Characteristics:
                3.3 V is provided
                Opening is shared
                PME signal is supported
        Bus Address: ffff:01:00.0

Handle 0x0029, DMI type 9, 17 bytes
System Slot Information
        Designation: PCIEX16_1
        Type: 32-bit PCI Express
        Current Usage: Available
        Length: Short
        ID: 2
        Characteristics:
                3.3 V is provided
                Opening is shared
                PME signal is supported

Handle 0x002A, DMI type 9, 17 bytes
System Slot Information
        Designation: PCIEX1_2
        Type: 32-bit PCI Express
        Current Usage: Available
        Length: Short
        ID: 3
        Characteristics:
                3.3 V is provided
                Opening is shared
                PME signal is supported

Handle 0x002B, DMI type 9, 17 bytes
System Slot Information
        Designation: PCIEX1_3
        Type: 32-bit PCI Express
        Current Usage: Available
        Length: Short
        ID: 4
        Characteristics:
                3.3 V is provided
                Opening is shared
                PME signal is supported

Handle 0x002C, DMI type 9, 17 bytes
System Slot Information
        Designation: PCI_1
        Type: 32-bit PCI

...skipping 1 line
        Length: Short
        ID: 5
        Characteristics:
                3.3 V is provided
                Opening is shared
                PME signal is supported

Handle 0x002D, DMI type 9, 17 bytes
System Slot Information
        Designation: PCI_2
        Type: 32-bit PCI
        Current Usage: Available
        Length: Short
        ID: 6
        Characteristics:
                3.3 V is provided
                Opening is shared
                PME signal is supported

Handle 0x002E, DMI type 9, 17 bytes
System Slot Information
        Designation: PCIEX1_4
        Type: 32-bit PCI Express
        Current Usage: Available
        Length: Short
        ID: 7
        Characteristics:
                3.3 V is provided
                Opening is shared
                PME signal is supported

Handle 0x002F, DMI type 10, 6 bytes
On Board Device Information
        Type: Ethernet
        Status: Enabled
        Description: Onboard Ethernet

Handle 0x0030, DMI type 10, 6 bytes
On Board Device Information
        Type: Sound
        Status: Enabled
        Description: Onboard Audio

Handle 0x0031, DMI type 11, 5 bytes
OEM Strings
        String 1: 20CF30E7B9C5
        String 2: To Be Filled By O.E.M.
        String 3: To Be Filled By O.E.M.
        String 4: To Be Filled By O.E.M.

Handle 0x0032, DMI type 13, 22 bytes

...skipping 1 line
        Language Description Format: Long
        Installable Languages: 6
                en|US|iso8859-1
                zh|ZH|iso8859-1
                de|DE|iso8859-1
                cn|CN|iso8859-1
                fr|FR|iso8859-1
                ja|JP|unicode-1
        Currently Installed Language: en|US|iso8859-1

Handle 0x0033, DMI type 15, 55 bytes
System Event Log
        Area Length: 1008 bytes
        Header Start Offset: 0x2010
        Data Start Offset: 0x2010
        Access Method: OEM-specific
        Access Address: Unknown
        Status: Valid, Not Full
        Change Token: 0x00000000
        Header Format: No Header
        Supported Log Type Descriptors: 1
        Descriptor 1: OEM-specific
        Data Format 1: POST results bitmap

Handle 0x0034, DMI type 16, 15 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: None
        Maximum Capacity: 4 GB
        Error Information Handle: Not Provided
        Number Of Devices: 4

Handle 0x0035, DMI type 19, 15 bytes
Memory Array Mapped Address
        Starting Address: 0x00000000000
        Ending Address: 0x007FFFFFFFF
        Range Size: 32 GB
        Physical Array Handle: 0x0034
        Partition Width: 4

Handle 0x0036, DMI type 17, 28 bytes
Memory Device
        Array Handle: 0x0034
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 8 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM0

...skipping 1 line
        Type: DDR
        Type Detail: Synchronous
        Speed: 1333 MT/s
        Manufacturer: Manufacturer0
        Serial Number: SerNum0
        Asset Tag: AssetTagNum0
        Part Number: PartNum0
        Rank: Unknown

Handle 0x0037, DMI type 20, 19 bytes
Memory Device Mapped Address
        Starting Address: 0x00000000000
        Ending Address: 0x001FFFFFFFF
        Range Size: 8 GB
        Physical Device Handle: 0x0036
        Memory Array Mapped Address Handle: 0x0035
        Partition Row Position: 1
        Interleaved Data Depth: 1

Handle 0x0038, DMI type 17, 28 bytes
Memory Device
        Array Handle: 0x0034
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 8 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM1
        Bank Locator: BANK1
        Type: DDR
        Type Detail: Synchronous
        Speed: 1333 MT/s
        Manufacturer: Manufacturer1
        Serial Number: SerNum1
        Asset Tag: AssetTagNum1
        Part Number: PartNum1
        Rank: Unknown

Handle 0x0039, DMI type 20, 19 bytes
Memory Device Mapped Address
        Starting Address: 0x00200000000
        Ending Address: 0x003FFFFFFFF
        Range Size: 8 GB
        Physical Device Handle: 0x0038
        Memory Array Mapped Address Handle: 0x0035
        Partition Row Position: 1
        Interleaved Data Depth: 1

Handle 0x003A, DMI type 17, 28 bytes
Memory Device

...skipping 1 line
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 8 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM2
        Bank Locator: BANK2
        Type: DDR
        Type Detail: Synchronous
        Speed: 1333 MT/s
        Manufacturer: Manufacturer2
        Serial Number: SerNum2
        Asset Tag: AssetTagNum2
        Part Number: PartNum2
        Rank: Unknown

Handle 0x003B, DMI type 20, 19 bytes
Memory Device Mapped Address
        Starting Address: 0x00000000000
        Ending Address: 0x001FFFFFFFF
        Range Size: 8 GB
        Physical Device Handle: 0x003A
        Memory Array Mapped Address Handle: 0x0035
        Partition Row Position: 1
        Interleaved Data Depth: 1

Handle 0x003C, DMI type 17, 28 bytes
Memory Device
        Array Handle: 0x0034
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 8 GB
        Form Factor: DIMM
        Set: None
        Locator: DIMM3
        Bank Locator: BANK3
        Type: DDR
        Type Detail: Synchronous
        Speed: 1333 MT/s
        Manufacturer: Manufacturer3
        Serial Number: SerNum3
        Asset Tag: AssetTagNum3
        Part Number: PartNum3
        Rank: Unknown

Handle 0x003D, DMI type 20, 19 bytes
Memory Device Mapped Address
        Starting Address: 0x00200000000
        Ending Address: 0x003FFFFFFFF

...skipping 1 line
        Physical Device Handle: 0x003C
        Memory Array Mapped Address Handle: 0x0035
        Partition Row Position: 1
        Interleaved Data Depth: 1

Handle 0x003E, DMI type 32, 20 bytes
System Boot Information
        Status: No errors detected

Handle 0x003F, DMI type 41, 11 bytes
Onboard Device
        Reference Designation: Onboard Ethernet
        Type: Ethernet
        Status: Enabled
        Type Instance: 0

Handle 0x0040, DMI type 41, 11 bytes
Onboard Device
        Reference Designation: Onboard Audio
        Type: Sound
        Status: Enabled
        Type Instance: 0

Handle 0x0041, DMI type 127, 4 bytes
End Of Table


My motherboard is an ASUSTeK P7P55D-E LX. If the only way to get the disk back to its prior state is to rebuild the pool, I will back up the data, delete the pool, and copy the data back. What do you guys think?

Also, any ideas on what could have led to the issue, so that I can avoid it in the future? And if I had raid and ran a scrub, would I have avoided the issue?
 

lenard2000

Cadet
Joined
Oct 18, 2023
Messages
8
I am ready to proceed with restoring the data from the backup I just made. Do you guys have any idea what caused it before I restore?
 

PhilD13

Patron
Joined
Sep 18, 2020
Messages
203
No, I don't really have any concrete ideas.

I looked up the motherboard model from above, and the specs show that the P7P55D-E LX supports a maximum of 16GB of memory, while you show 32GB installed. That will cause issues.
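For anyone who wants to check this on their own box, here is a minimal sketch that sums the installed DIMM sizes from `dmidecode` output and compares the total against the board maximum. The excerpt is trimmed from the output posted earlier in the thread; the function name and the `BOARD_MAX_GB` constant are made up for illustration, and the 16 GB figure is just the spec value mentioned above, not something read from the BIOS.

```python
import re

# Trimmed excerpt of the dmidecode output posted earlier in this thread
# (only the lines this sketch needs).
DMIDECODE_EXCERPT = """\
Handle 0x0036, DMI type 17, 28 bytes
Memory Device
        Size: 8 GB
        Locator: DIMM0

Handle 0x0038, DMI type 17, 28 bytes
Memory Device
        Size: 8 GB
        Locator: DIMM1

Handle 0x003A, DMI type 17, 28 bytes
Memory Device
        Size: 8 GB
        Locator: DIMM2

Handle 0x003C, DMI type 17, 28 bytes
Memory Device
        Size: 8 GB
        Locator: DIMM3
"""

# Assumption: the board maximum quoted in this post, not a value read from the BIOS.
BOARD_MAX_GB = 16


def installed_memory_gb(dmidecode_text: str) -> int:
    """Sum the 'Size: N GB' line of every Memory Device (DMI type 17) record."""
    total = 0
    for record in dmidecode_text.split("\n\n"):
        if "DMI type 17" not in record:
            continue  # skip mapped-address and other record types
        match = re.search(r"^\s*Size:\s*(\d+)\s*GB\s*$", record, re.MULTILINE)
        if match:  # empty slots report "No Module Installed" and are skipped
            total += int(match.group(1))
    return total


total_gb = installed_memory_gb(DMIDECODE_EXCERPT)
print(f"installed: {total_gb} GB, quoted board maximum: {BOARD_MAX_GB} GB")
if total_gb > BOARD_MAX_GB:
    print("installed memory exceeds the quoted board maximum")
```

On a live system you could feed it the text from `dmidecode --type 17` instead of the pasted excerpt.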

I am also curious, though, whether the use of 3 SMR-type drives, which are a big no-no with TrueNAS, with ZFS filesystems in general, or with any type of RAID, overloaded the SATA bus with the SMR drives thrashing about. This caused lots of people to corrupt and lose data until it was figured out what the issue was.

It could be that the combination of the 16TB CMR drive sitting on the 3 Gb/s SATA bus on the motherboard (SATA Version is: SATA 3.3, 6.0 Gb/s (current: 3.0 Gb/s)) and the SMR drives caused an issue whereby data was corrupted.
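If you want to spot a downgraded link without reading the whole report, a rough sketch like this pulls the two speeds out of that `SATA Version is:` line. The line itself is the one quoted above; the function name is hypothetical.

```python
import re

# The line quoted above, taken from the drive's smartctl report.
SATA_LINE = "SATA Version is: SATA 3.3, 6.0 Gb/s (current: 3.0 Gb/s)"


def sata_link_speeds(line: str):
    """Return (max_gbps, current_gbps) from a 'SATA Version is:' line, or None."""
    match = re.search(r"([\d.]+) Gb/s \(current: ([\d.]+) Gb/s\)", line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


speeds = sata_link_speeds(SATA_LINE)
if speeds is not None:
    max_gbps, current_gbps = speeds
    if current_gbps < max_gbps:
        print(f"link negotiated at {current_gbps} Gb/s, drive supports {max_gbps} Gb/s")
```

A mismatch only tells you which port speed was negotiated; it does not by itself prove that the slower port is what corrupted the data.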

Think about replacing the SMR drives with CMR-type drives.
Reduce the memory to the motherboard's specs.
Try plugging the 16TB drive into one of the grey SATA (6 Gb/s) connectors.
Verify that the BIOS is set to AHCI and not RAID.
 

Davvo

MVP
Joined
Jul 12, 2022
Messages
3,222
You should complete a long SMART test; without it, there is no way to know the actual state of the drive.

To run it, go to the terminal and create a new tmux session with tmux new, then run smartctl -t long /dev/sdd2. You can then press CTRL+B followed by D to detach from the session. Wait a day, then use smartctl -a /dev/sdd2 to view the results.
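If you would rather script those two steps, here is a small sketch. It only wraps the two `smartctl` invocations mentioned above (`-t long` and `-a`); the helper names are made up, the device path is simply the one from this post, and `run()` assumes smartmontools is installed and that you are root.

```python
import shutil
import subprocess


def start_long_test_cmd(device: str) -> list:
    """argv that starts an extended (long) SMART self-test on the device."""
    return ["smartctl", "-t", "long", device]


def show_results_cmd(device: str) -> list:
    """argv that prints all SMART information, including the self-test log."""
    return ["smartctl", "-a", device]


def run(cmd: list) -> str:
    """Run a command and return its stdout (needs root and smartmontools)."""
    if shutil.which(cmd[0]) is None:
        raise RuntimeError(f"{cmd[0]} not found on PATH")
    # smartctl uses its exit status as a bitmask, so a non-zero code is not
    # treated as a failure here.
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout


device = "/dev/sdd2"  # device name taken from this post; adjust for your system
print("start the test: ", " ".join(start_long_test_cmd(device)))
print("view the result:", " ".join(show_results_cmd(device)))
```

As written, it only prints the commands; call `run(start_long_test_cmd(device))` yourself when you actually want to kick off the test.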

Also, ST16000NM000J is a CMR drive. Do not use SMR drives, especially WD ones.
 

sfatula

Guru
Joined
Jul 5, 2022
Messages
608
And if I had raid and ran a scrub, would I have avoided the issue?
If you were using a RAIDZ# pool or mirrors, you most likely would have avoided the issues, yes. There are no guarantees without ECC RAM, but you very likely would not have had them. You would still, however, have to figure out why you are getting checksum errors. ANY ZFS error needs to be looked into.
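To illustrate why redundancy matters here, below is a toy model, not how ZFS is actually implemented: a block is stored on one or two "disks" together with a checksum kept apart from the data. On read, a copy that fails the checksum is repaired from an intact copy if one exists; with a single disk there is nothing to repair from, so the read just fails. `ToyPool` and its methods are invented for this sketch.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class ToyPool:
    """Stores one block on N copies (1 = single disk, 2 = two-way mirror)."""

    def __init__(self, data: bytes, copies: int):
        self.expected = checksum(data)  # checksum kept apart from the data
        self.disks = [bytearray(data) for _ in range(copies)]

    def corrupt(self, disk: int) -> None:
        """Simulate silent corruption by flipping the bits of the first byte."""
        self.disks[disk][0] ^= 0xFF

    def read(self) -> bytes:
        """Return an intact copy and overwrite any copy that fails the checksum."""
        good = next(
            (bytes(d) for d in self.disks if checksum(bytes(d)) == self.expected),
            None,
        )
        if good is None:
            raise IOError("checksum error: no intact copy left")
        for i, d in enumerate(self.disks):
            if checksum(bytes(d)) != self.expected:
                self.disks[i] = bytearray(good)  # repair from the good copy
        return good


single = ToyPool(b"some file contents", copies=1)
single.corrupt(0)
try:
    single.read()
except IOError as err:
    print("single disk:", err)

mirror = ToyPool(b"some file contents", copies=2)
mirror.corrupt(0)
print("mirror:", mirror.read())
```

A scrub is, roughly speaking, this read-and-verify pass applied to every block in the pool, which is why it can fix things on a mirror or RAIDZ but can only report errors on a single-disk pool.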
 