Hi,
I have two TrueNAS SCALE builds with identical hardware (except the disks): ASRockRack X570D4U, Ryzen 9 5900X, Kingston KSM32ED8 ECC RAM, 750W power supply. On both systems I occasionally see hardware errors in the logs.
Right now I am rsyncing data from one system to the other. On the receiving system, the following hardware error shows up in the logs roughly once every 3-4 hours:
Mar 30 03:24:19 TrueNAS kernel: mce: [Hardware Error]: Machine check events logged
Mar 30 03:24:19 TrueNAS kernel: [Hardware Error]: Corrected error, no action required.
Mar 30 03:24:19 TrueNAS kernel: [Hardware Error]: CPU:0 (19:21:2) MC17_STATUS[-|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0x9c2040000000011b
Mar 30 03:24:19 TrueNAS kernel: [Hardware Error]: Error Addr: 0x0000000469e32d80
Mar 30 03:24:19 TrueNAS kernel: [Hardware Error]: IPID: 0x0000009600050f00, Syndrome: 0x447000200a800201
Mar 30 03:24:19 TrueNAS kernel: [Hardware Error]: Unified Memory Controller Ext. Error Code: 0, DRAM ECC error.
Mar 30 03:24:19 TrueNAS kernel: EDAC MC0: 1 CE on mc#0csrow#1channel#0 (csrow:1 channel:0 page:0x11e78cb offset:0x480 grain:64 syndrome:0x20)
Mar 30 03:24:19 TrueNAS kernel: [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD
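As a cross-check of the flags the kernel printed for MC17_STATUS, here is a minimal Python sketch that decodes the raw value from the log. The bit positions are an assumption based on how the Linux kernel decodes AMD MCA status registers, and the names `STATUS_BITS` and `decode_status` are only illustrative, so please verify against your kernel version before relying on it.

```python
# Assumed bit positions for AMD MCA status registers (taken from how the
# Linux kernel decodes them; not confirmed for every CPU family).
STATUS_BITS = {
    "Val": 63, "Overflow": 62, "UC": 61, "En": 60,
    "MiscV": 59, "AddrV": 58, "PCC": 57, "TCC": 55,
    "SyndV": 53, "CECC": 46, "UECC": 45, "Deferred": 44, "Poison": 43,
}

def decode_status(status):
    """Return the set of flag names whose bit is set in the status value."""
    return {name for name, bit in STATUS_BITS.items() if (status >> bit) & 1}

# Raw value copied from the MC17_STATUS line in the log above.
flags = decode_status(0x9c2040000000011b)
print(sorted(flags))
```

Under these assumptions, CECC (corrected ECC) is set while UC and UECC are clear, which agrees with the "Corrected error" line in the log.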
To me this looks like a corrected RAM ECC error (the log says "Corrected error, no action required" and EDAC reports 1 CE). On both systems Memtest86+ has run for several days without a single error.
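To see whether the corrected errors always land on the same csrow/channel, something like the following Python sketch could tally the EDAC lines. The regular expression is written only against the single EDAC line format shown in the log above, and `tally_corrected_errors` is a made-up helper name, so treat it as a starting point rather than a finished tool.

```python
import re
from collections import Counter

# Matches EDAC corrected-error lines in the format seen in the log above.
EDAC_CE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE on .*?"
    r"\(csrow:(?P<csrow>\d+) channel:(?P<channel>\d+)"
)

def tally_corrected_errors(lines):
    """Sum corrected-error counts per (memory controller, csrow, channel)."""
    totals = Counter()
    for line in lines:
        m = EDAC_CE.search(line)
        if m:
            key = (int(m["mc"]), int(m["csrow"]), int(m["channel"]))
            totals[key] += int(m["count"])
    return totals

# The one EDAC line from the log, used here as sample input.
sample = [
    "Mar 30 03:24:19 TrueNAS kernel: EDAC MC0: 1 CE on mc#0csrow#1channel#0 "
    "(csrow:1 channel:0 page:0x11e78cb offset:0x480 grain:64 syndrome:0x20)",
]
totals = tally_corrected_errors(sample)
print(totals)
```

In practice the input lines could come from saved kernel log output (for example from `dmesg` or `journalctl -k`) instead of the sample list.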
Can I safely ignore these errors, or is there anything I should do?
Best regards,
AMiGAmann