Hi guys.
Configuration:
ASRock X570D4U-2L2T
8 SATA SSDs: 4 TB Samsung 860 PRO (no RAID controller)
This TrueNAS Core server has been the shared storage for 2 XCP-ng servers (connected via NFS) since October 2019.
By default the SMART check is enabled and runs at 0:00 on Sunday.
Today the SMART test on one of the SSDs (ada7) couldn't be completed and the ZFS pool (RAIDZ2) went degraded.
Device: /dev/ada7, Read SMART Error Log Failed.
Device: /dev/ada7, Read SMART Self-Test Log Failed.
Device: /dev/ada7, failed to read SMART Attribute Data.
Device: /dev/ada7, not capable of SMART self-check.
Pool ztank state is DEGRADED: One or more devices has been removed by the administrator. Sufficient replicas exist for the pool to continue functioning in a degraded state.
The following devices are not healthy:
Disk Samsung SSD 860 PRO 4TB S5G9NC0******* is REMOVED
The worst part is that all VMs on XCP-ng couldn't perform I/O operations (I/O errors, and the filesystem went read-only in the VM console).
So I rebooted the TrueNAS server and checked SMART on the faulty disk. SMART is OK: no errors, and all the stats are the same as on the other disks.
I cleared the errors with zpool clear and the pool is online again. Everything is working.
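In case it helps anyone following along, the check and recovery can be sketched roughly as below. This is a minimal sketch, not an exact record of what I typed: the pool name ztank and the disk ada7 are taken from the alerts above, and the `run` helper and `DRY_RUN` switch are hypothetical additions so the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only. POOL and DISK come from the alert text; adjust for your system.
# DRY_RUN is a hypothetical safety switch: by default commands are printed, not run.
POOL="${POOL:-ztank}"
DISK="${DISK:-ada7}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command, and execute it only when DRY_RUN is not "1".
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

run smartctl -a "/dev/$DISK"   # full SMART report for the suspect disk
run zpool status -v "$POOL"    # pool state and per-device error counters
run zpool clear "$POOL"        # clear the error counters once the disk is back
run zpool status -v "$POOL"    # check that the pool reports ONLINE again
```

Run it as root with `DRY_RUN=0` to actually execute the commands.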
I have to mention that 1 year ago the exact same thing happened to another disk (ada6). That was the first time it happened, and we replaced the disk (but after bringing the disk home and checking it, SMART was OK). The VMs hit the same I/O errors then too.
So my guess is that on RAIDZ2 (which can lose 2 disks and still remain online), something like our situation with NFS shouldn't happen.
For now I have just deleted the SMART test under the Tasks menu.