Hi All,
I'm running the latest version of TrueNAS on a Dell R720xd with 96GB RAM, an E5-2690 CPU, twelve 6TB SAS drives, and one 1TB NVMe for cache. The HBA is an H310 Mini, flashed.
I keep having drives "fail" or become degraded. When I run zpool status I see lots of read/write errors on a particular drive. If I clear the errors on just that drive, it's fine for a while, then the errors show back up on the same drive (this happens with any drive). If I replace the drive instead, every other drive throws errors while it's resilvering, and once the resilver is done another drive will "fail". If I leave it like that, the whole pool starts to throw read/write errors. Any ideas? I'm at a loss; there's no way the drives are failing at that rate (one every day). I have an identical NAS, and the drives I ordered for both were all from the same batch.
(Drives were used btw... I know it's not recommended but they're 5x the price new)
Could it be the H310 Mini that's failing? That's the only thing I can think of that would cause so many drives to "fail".
What should I do? I can't keep ordering more drives; it's very expensive.
Thanks!
Below is the error from this morning, after replacing a drive yesterday and waiting for the resilver to finish.
Code:
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 10.4G in 00:43:48 with 0 errors on Wed Jul 21 21:13:01 2021
config:
    NAME                                            STATE     READ WRITE CKSUM
    Tank2                                           DEGRADED     0     0     0
      raidz3-0                                      DEGRADED     0     0     0
        gptid/e9380b23-e630-11eb-8b55-ecf4bbc0e684  ONLINE     337 27.9K     0
        gptid/ea1e756a-e630-11eb-8b55-ecf4bbc0e684  ONLINE      50 3.40K     0
        gptid/eac53908-e630-11eb-8b55-ecf4bbc0e684  ONLINE      12 39.2K     0
        gptid/eab4836c-e630-11eb-8b55-ecf4bbc0e684  ONLINE     460 17.8K     0
        gptid/ed614cef-e960-11eb-b636-ecf4bbc0e684  ONLINE     105 31.1K     0
        gptid/ebf41b72-e630-11eb-8b55-ecf4bbc0e684  ONLINE      25 6.14K     0
        gptid/eb760390-e630-11eb-8b55-ecf4bbc0e684  ONLINE      80 44.6K     0
        gptid/ebbab4d7-e630-11eb-8b55-ecf4bbc0e684  ONLINE     391 28.5K     0
        gptid/ec7d62a3-e630-11eb-8b55-ecf4bbc0e684  DEGRADED    38 36.8K   265  too many errors
        gptid/ddf08d31-ea29-11eb-b636-ecf4bbc0e684  ONLINE       0     0    43
        gptid/eca02a58-e630-11eb-8b55-ecf4bbc0e684  ONLINE      21 9.13K     0
        gptid/aa75b32a-e665-11eb-8b55-ecf4bbc0e684  ONLINE      98 48.4K     0
    cache
      gptid/e9369611-e630-11eb-8b55-ecf4bbc0e684    ONLINE       0     0     0
errors: No known data errors
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:01:59 with 0 errors on Fri Jul 16 03:47:00 2021
config:
    NAME        STATE     READ WRITE CKSUM
    boot-pool   ONLINE       0     0     0
      da12p2    ONLINE       0     0     0
This is after running zpool clear Pool
Code:
 state: ONLINE
  scan: resilvered 10.4G in 00:43:48 with 0 errors on Wed Jul 21 21:13:01 2021
config:
    NAME                                            STATE     READ WRITE CKSUM
    Tank2                                           ONLINE       0     0     0
      raidz3-0                                      ONLINE       0     0     0
        gptid/e9380b23-e630-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
        gptid/ea1e756a-e630-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
        gptid/eac53908-e630-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
        gptid/eab4836c-e630-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
        gptid/ed614cef-e960-11eb-b636-ecf4bbc0e684  ONLINE       0     0     0
        gptid/ebf41b72-e630-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
        gptid/eb760390-e630-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
        gptid/ebbab4d7-e630-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
        gptid/ec7d62a3-e630-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
        gptid/ddf08d31-ea29-11eb-b636-ecf4bbc0e684  ONLINE       0     0     0
        gptid/eca02a58-e630-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
        gptid/aa75b32a-e665-11eb-8b55-ecf4bbc0e684  ONLINE       0     0     0
    cache
      gptid/e9369611-e630-11eb-8b55-ecf4bbc0e684    ONLINE       0     0     0
errors: No known data errors
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:01:59 with 0 errors on Fri Jul 16 03:47:00 2021
config:
    NAME        STATE     READ WRITE CKSUM
    boot-pool   ONLINE       0     0     0
      da12p2    ONLINE       0     0     0
errors: No known data errors
In about 20 minutes another drive will start to show errors...
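To make the first status output easier to read at a glance, here is a rough Python sketch that lists every device with non-zero READ/WRITE/CKSUM counters. It is only an illustration: the names `parse_count`, `devices_with_errors`, and `SAMPLE` are made up for this post and are not part of TrueNAS or ZFS, the sample is a few rows copied from the output above, and treating the `K` suffix as 1024 is an assumption (the abbreviated counters are rounded anyway, so large values are approximate).

```python
import re

# Assumed multipliers for abbreviated counters such as "27.9K".
_SUFFIX = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}

# A device row looks like: NAME STATE READ WRITE CKSUM [optional note]
_ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+([\d.]+[KMG]?)\s+([\d.]+[KMG]?)\s+([\d.]+[KMG]?)"
)


def parse_count(text):
    """Turn a counter like '337' or '27.9K' into an (approximate) integer."""
    m = re.fullmatch(r"([\d.]+)([KMG]?)", text)
    return round(float(m.group(1)) * _SUFFIX[m.group(2)])


def devices_with_errors(status_text):
    """Map device name -> (read, write, cksum) for rows with any errors."""
    result = {}
    for line in status_text.splitlines():
        m = _ROW.match(line)
        if not m:
            continue  # header lines, 'cache', 'errors:' and so on
        counts = tuple(parse_count(m.group(i)) for i in (3, 4, 5))
        if any(counts):
            result[m.group(1)] = counts
    return result


# A few rows copied from the DEGRADED output in this post.
SAMPLE = """\
    NAME                                            STATE     READ WRITE CKSUM
    Tank2                                           DEGRADED     0     0     0
      raidz3-0                                      DEGRADED     0     0     0
        gptid/e9380b23-e630-11eb-8b55-ecf4bbc0e684  ONLINE     337 27.9K     0
        gptid/ec7d62a3-e630-11eb-8b55-ecf4bbc0e684  DEGRADED    38 36.8K   265  too many errors
        gptid/ddf08d31-ea29-11eb-b636-ecf4bbc0e684  ONLINE       0     0    43
    cache
      gptid/e9369611-e630-11eb-8b55-ecf4bbc0e684    ONLINE       0     0     0
"""

if __name__ == "__main__":
    for name, (read, write, cksum) in devices_with_errors(SAMPLE).items():
        print(f"{name}: read={read} write={write} cksum={cksum}")
```

Feeding it the full output instead of the sample should show the same picture as the table: nearly every data drive has read/write errors at once, while the cache device has none.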