I have a strange situation with my storage array.
Latest TrueNAS stable build running on a Dell server with a SAS HBA connected to a 24 x 2.5" JBOD enclosure.
It is populated with 22 x 1 TB SSDs across two pools. The SSDs are all Samsung 850 / 860 / 870 in one pool and Kingston in the other pool.
Both pools are made up of 2-drive mirrors.
The "Samsung pool" has started to experience regular drive failures, which present as read, write or checksum errors, and eventually the drive goes offline.
Initially I replaced the failed disks, but then I realised that removing and reinserting the drive, or rebooting TrueNAS, would bring them all back online, until another drive displayed the same symptoms.
I now lose a drive about 4 times a week. Which drive fails is completely random. At first I thought it was because they were a few years old, but in the past few weeks I have replaced some with new 2 TB 870 SSDs, and some of those are failing in the same way too.
There is no pattern to it, other than that the failing drives are all in the same pool; the other pool, which is in the same JBOD enclosure, is unaffected. So I don't think it's a connectivity problem with the enclosure.
SMART tests come back clean on the failed drives.
I guess I'm looking for some hints on next troubleshooting steps?
Thanks for any suggestions.