24555 on the second take !
I still have a fairly recurrent read/write noise (maybe every 30-45s)
While I don't know this for certain, I suspect it is the SMR drive cleaning house. When the drive must change data on an inner track, it needs to rewrite at least 3 tracks of data, and odds are it is much more. This is why SMR drives are great for archiving (write once, read many) but not for lots of writing. The drive will store the new data on the drive and, when it has the opportunity, reorganize it. That is a slow process, and even slower if the drive holds a lot of data. So if you have the System Dataset on the HDD pool, and it is written to every 5 minutes (I'm not sure how often for SCALE), then the drive is constantly rewriting data. Anyway, I could be completely wrong; it could be something else going on with the drive.
A test if you desire.
1) Power down the computer.
2) Disconnect the Data Cables to the drives, leaving only the power cable.
3) Power up.
4) Listen. Is that same noise still there? If yes, and you can leave your system in that state for a few hours to let the drives clean up, the issue might go away. But the problem is likely to return once you reconnect the data cables and start TrueNAS again.
5) Power off.
6) Reconnect the data cables.
7) Power on and you are done.
For S.M.A.R.T. tests, I have a short one once a day, I'll set up an extended one once a week :)
(But they didn't show up any errors here)
You are running a Conveyance Test, not a Short Test. Not the same thing although close. As for lack of errors, it means the drives do not recognize any physical errors with the hard drive. A good thing.
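If you want to check which test type is actually being run, the drive's self-test log will tell you. Here is a rough sketch using smartctl. The device name /dev/sda is only a placeholder (it is not from this thread), and the two function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: DISK is a placeholder device name, not taken from the thread.
DISK="${DISK:-/dev/sda}"

# Print the drive's self-test log. The Test_Description column shows
# whether each logged test was Short, Extended, or Conveyance.
show_selftest_log() {
    smartctl -l selftest "$DISK"
}

# Start a self-test of the given type: short, long (extended), or conveyance.
# Anything else is rejected with exit status 2 and no command is sent.
start_selftest() {
    case "$1" in
        short|long|conveyance)
            smartctl -t "$1" "$DISK"
            ;;
        *)
            echo "usage: start_selftest short|long|conveyance" >&2
            return 2
            ;;
    esac
}
```

For example, `DISK=/dev/sdb show_selftest_log` would list the logged tests for that drive, and `start_selftest short` would kick off a Short test.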
But my pool is always tagged as degraded on the dashboard, but I don't see anything when i go to the pool.
Odds are you have some ZFS errors. Run the commands:
1)
zpool status -v
Odds are you will see some errors.
2) If you have errors then run
zpool scrub -w Hector_v2
and wait for the scrub to complete. The command line will be unusable until the scrub is completed. Remove the "-w" if you want the command prompt back immediately and scrub to continue in the background. I'm saying to use the "-w" so you will know when the scrub is complete, the prompt will return.
3) Run another status check (Step 1).
4) Do you still have errors? If the only errors are non-zero READ/WRITE/CKSUM values, with no files listed as corrupt, go to step 5. If files are listed as corrupt, you will need to delete those files first. Post the output of the status command if you have any questions about what you are doing.
5) We should be able to clear the errors now by running
zpool clear Hector_v2
and then perform step 1 again. All should be good. If not, post the results of the above steps.
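If you would rather run steps 1 through 5 in one go, something like the sketch below should be close. Treat it as a starting point, not a tested tool: the function names are mine, the pool name is the one from this thread, and the check for corrupt files simply looks for the "Permanent errors have been detected" line that zpool status -v prints when it lists damaged files.

```shell
#!/bin/sh
# Sketch of steps 1-5 above. POOL is the pool name used in this thread.
POOL="${POOL:-Hector_v2}"

# Reads "zpool status -v" text on stdin.
# Succeeds (exit 0) if the text reports files with permanent errors.
has_permanent_errors() {
    grep -q "Permanent errors have been detected"
}

run_steps() {
    # Step 1: show the current state of the pool.
    zpool status -v "$POOL"

    # Step 2: scrub and wait (-w) until the scrub has finished.
    zpool scrub -w "$POOL" || return 1

    # Steps 3 and 4: check again, and stop if corrupt files are listed.
    if zpool status -v "$POOL" | has_permanent_errors; then
        echo "Corrupt files are listed. Deal with those before clearing." >&2
        return 1
    fi

    # Step 5: clear the error counters and show the result.
    zpool clear "$POOL"
    zpool status -v "$POOL"
}
```

Nothing runs until you call `run_steps` yourself (as root), so you can read it over first. It stops before the clear if any files are listed as corrupt, since those need to be handled by hand.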
Let's say we clear the errors. Odds are these will return as long as you are using the SMR drives and writing data.
Why does this happen? The drive is asked for data, and if it fails to return it fast enough, an error is recorded. It does not mean your data is corrupt in this situation.
Please understand that I generalized some of the stuff I said above, enough that a person can understand what is basically going on.
Best of luck,
-Joe