TrueNAS SCALE 23.10.2: ZFS CKSUM errors combined with a non-clearable pool

Krautmaster

Dear folks, I have some very odd problems with my TrueNAS SCALE setup.

All HDDs are accumulating CKSUM errors in their pools, which gives me the message that the pool is unhealthy. All SSDs are fine.
The error counts are always identical for all disks in one pool. See:

1710337691093.png
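
To compare the per-disk counts as text rather than from a screenshot, something like the following could be used. It is only a minimal sketch: the function name `cksum_by_device` is made up for illustration, and it assumes the usual `zpool status` layout with a `NAME STATE READ WRITE CKSUM` header followed by one line per device.

```sh
# Hypothetical helper: print "<device> <CKSUM count>" for every line of the
# config table in `zpool status` output read from stdin.
cksum_by_device() {
  awk '
    # The table starts right after the column header line.
    $1 == "NAME" && $5 == "CKSUM" { in_cfg = 1; next }
    # A blank line ends the table.
    in_cfg && NF == 0 { in_cfg = 0 }
    # Table rows have at least: NAME STATE READ WRITE CKSUM.
    in_cfg && NF >= 5 { print $1, $5 }
  '
}

# Usage (not run here):
#   zpool status RaidZ | cksum_by_device
```

The output includes the pool and vdev rows as well as the individual disks, so identical counts across the disks of one pool are easy to spot.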


What I did:
  • switched from the 120 W Pico PSU to a 300 W Gold server PSU
  • new power cables
  • new / different sata cables
  • I have 3 ASM1116 controllers here; all of them show the issue, and so do the onboard connectors
  • tried disabling SMART and the disks' power management settings

Historical:
  • I don't think I had the issue with my old setup and an LSI 2116 controller, but it could also be that some HDD feature was not active with it
  • I wondered whether the HDD cages are causing this, but I have a single-disk stripe attached via a SATA cable, and even this disk has the same problem

HW is an Intel N100 with 32 GB of memory. But a general stability issue would result in checksum errors on the SSDs as well, I guess. Memtest is also fine.

This leads to my second odd problem:

I can scrub the pool with 0 errors (see screenshot). The data itself is fine.

zpool status -x RaidZ => tells me there are 2 errors on the pool
zpool status -v RaidZ => does not work at all (wtf?)
zpool clear RaidZ => has no effect
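
The checks above can be strung together roughly like this. It is a sketch under assumptions: the function name `recheck_pool` is made up, the order of the steps is my guess at a sensible sequence, and `zpool scrub -w` (wait until the scrub finishes) assumes an OpenZFS version that supports the `-w` flag.

```sh
# Hypothetical helper: clear the error counters, scrub and wait for the
# scrub to finish, then print the health summary and the verbose status.
recheck_pool() {
  pool="$1"
  zpool clear "$pool" || return 1      # reset READ/WRITE/CKSUM counters
  zpool scrub -w "$pool" || return 1   # -w: block until the scrub is done
  zpool status -x "$pool"              # short health summary
  zpool status -v "$pool"              # verbose status incl. affected files
}

# Usage (needs root, not run here):
#   recheck_pool RaidZ
```

If the counters come back after the clear and a clean scrub, that at least separates old, stale counts from errors that are still being produced.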

1710338356380.png


I'd set the pool up again from scratch, but I don't have the space to back up all the data.

Thanks for your ideas and help
 