I had an existing group-of-mirrors array with 2 vdevs, each of 2x8T drives, which I recently moved into my new TrueNAS box. While I had some issues, the array moved over OK and I can see the filesystem. I then added a pair of new 8T drives as a third vdev, but during the process one of the old drives got inadvertently disconnected. The resilver of that mirror is what I'm asking about.
Running
watch zpool status
shows the pool alternating between:

    scan: resilvered 1.93M in 00:00:01 with 0 errors on Mon Dec 18 20:55:23 2023

and:

    status: One or more devices is currently being resilvered. The pool will
            continue to function, possibly in a degraded state.
    action: Wait for the resilver to complete.
      scan: resilver in progress since Mon Dec 18 20:55:28 2023
            337M / 11.3T scanned, 1.88M / 11.3T issued at 1.88M/s
            1.93M resilvered, 0.00% done, no estimated completion time
    scan warning: skipping blocks that are only referenced by the checkpoint.
and the date (here Mon Dec 18 20:55...) is constantly updated to almost-now.
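As an aside, the "scan warning" line in the output above mentions blocks referenced only by the checkpoint, which suggests a pool checkpoint may exist. A minimal sketch of how one might check (the pool name "tank" is a placeholder, not from my system):

```shell
# On the live system, the checkpoint property can be queried with:
#
#   zpool get checkpoint tank
#
# which reports "-" in the VALUE column when no checkpoint exists.
# As a self-contained illustration, detect the warning in captured
# status text:
status_text='scan warning: skipping blocks that are only referenced by the checkpoint.'
if printf '%s\n' "$status_text" | grep -q 'referenced by the checkpoint'; then
    echo "pool checkpoint likely present"
fi
```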
I am currently interpreting this as the resilver starting and then more-or-less immediately stopping for unexplained reasons. Does that make sense? Following the process with

    zpool iostat

I can see a burst of activity (e.g. a few kilobytes or a megabyte), then nothing, repeated every 5 seconds or so.

I have run a smartctl long self-test on all the disks in the system and all report themselves healthy with no errors. I also ran a scrub on the pool before this happened and everything was reported fine.
I can see no messages (at all) in the kernel log or elsewhere, and am a bit stumped. Any ideas?
Would it be a good idea