Aggressive disk activity each 30-60s

joeschmuck · Feb 6, 2024

Well, the problem looks to be a configuration error (I hope). Edit your Adv. Power Management setting and change the "1", which spins the drive down all the time, to either "Disabled" or "Level 128". Either should fix your problem. As for the amount of power saved with Level 128, it's not much, but if you had a laptop it would make a difference.

Do this for all your drives. If the problem doesn't stop immediately, power down your system and power back on. Report your results.
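
For reference, the APM level behind that setting follows the convention documented for hdparm's -B option: levels 1 through 127 permit spin-down, levels 128 through 254 do not, and 255 turns APM off. Here is a minimal Python sketch of that rule. The function name is made up for illustration, and treating 255 as "no APM-driven spin-down" is my reading, not something TrueNAS states.

```python
def apm_allows_spin_down(level: int) -> bool:
    """Return True if an APM level permits the drive to spin down."""
    if not 1 <= level <= 255:
        raise ValueError("APM level must be between 1 and 255")
    # 1-127: power saving with standby (spin-down) allowed.
    # 128-254: power saving without standby.
    # 255: APM disabled, so APM itself never requests a spin-down (assumption).
    return level <= 127


for level in (1, 127, 128, 254):
    print(level, apm_allows_spin_down(level))
```

So a drive left at level 1 is allowed to park and spin down as aggressively as it likes, while level 128 keeps it spinning.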

Zareon · Feb 6, 2024

joeschmuck said:
Well, the problem looks to be a configuration error (I hope). Edit your Adv. Power Management setting and change the "1", which spins the drive down all the time, to either "Disabled" or "Level 128". Either should fix your problem. As for the amount of power saved with Level 128, it's not much, but if you had a laptop it would make a difference.

Do this for all your drives. If the problem doesn't stop immediately, power down your system and power back on. Report your results.

OK, that seems to fix my problem of big bursts. But that leaves me with my first problem of small bursts of reads/writes. I'm gonna do the same thing as today over a one-hour interval, to see if there's any interesting information to retrieve.
It's very late here right now, so I'm gonna sleep and make these logs tomorrow.

And thank you so much, I'm (finally) gonna work in a quieter environment, haha

joeschmuck · Feb 6, 2024

What setting did you use? If it was not "Disabled" then I would set it to "Disabled" to see if that fixes it. Disabled is the default setting for TrueNAS.

Enjoy the quiet.

Redcoat · Feb 6, 2024

joeschmuck said:
The results: Over 65 minutes the drive spin down and back up 64 times. That is at a rate of one spinup every .9846 seconds (slightly faster than once a second). That is crazy!

Joe, isn't that 1 per minute, not one per second? Matching the OP's reported frequency?

Zareon · Feb 7, 2024

joeschmuck said:
What setting did you use? If it was not "Disabled" then I would set it to "Disabled" to see if that fixes it. Disabled is the default setting for TrueNAS.

I used Disabled!
I'm gonna post the results of my new tests in a few hours :)

Redcoat said:
Joe, isn't that 1 per minute, not one per second? Matching the OP's reported frequency?

According to the first part of his sentence, I think he meant .9846 minutes, but actually I think it's 1.0156 minutes per spin-up. That doesn't change much, haha (he only swapped 65 and 64).
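
To make the two numbers concrete, here is the arithmetic as a short Python sketch. The 65 minutes and 64 spin-ups come from the quoted post, and the variable names are just for illustration.

```python
minutes = 65    # length of the logging window, from the quoted post
spin_ups = 64   # spin-ups counted in that window

minutes_per_spin_up = minutes / spin_ups   # interval between spin-ups
spin_ups_per_minute = spin_ups / minutes   # rate of spin-ups

print(f"{minutes_per_spin_up:.4f} minutes per spin-up")   # 1.0156
print(f"{spin_ups_per_minute:.4f} spin-ups per minute")   # 0.9846
```

Either way it works out to roughly one spin-up per minute.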

joeschmuck · Feb 7, 2024

Redcoat said:
Joe, isn't that 1 per minute, not one per second? Matching the OP's reported frequency?

I was tired at the time, but looking at it after sleeping, absolutely. Once per minute. It looks so obvious right now.

Zareon said:
According to the first part of his sentence, I think he meant .9846 minutes, but actually I think it's 1.0156 minutes per spin-up. That doesn't change much, haha (he only swapped 65 and 64).

Same comment.

Zareon said:
I'm gonna post the results of my new tests in a few hours :)

Hope it looks good. I depart for work in an hour from this posting, hopefully the results are in and all looks 100% better. The Start_Stop_Count hopefully does not increase over that hour period.
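
If you want to compare the two logs without eyeballing them, something like this Python sketch would do. It assumes the logs are the attribute table printed by smartctl -A, with the raw value in the tenth column. The function names are hypothetical.

```python
def start_stop_count(smart_attributes: str) -> int:
    """Extract the raw Start_Stop_Count value from `smartctl -A` text."""
    for line in smart_attributes.splitlines():
        fields = line.split()
        # Assumed row layout:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[1] == "Start_Stop_Count":
            return int(fields[9])
    raise ValueError("Start_Stop_Count not found")


def spin_cycles_between(before: str, after: str) -> int:
    """Return how many start/stop cycles happened between two logs."""
    return start_stop_count(after) - start_stop_count(before)
```

Feeding it the log from the start of the hour and the log from the end should give 0 if the drive never spun down in between.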

Zareon · Feb 7, 2024

OK, so:
The first log said 24555 for Start_Stop_Count.
The second one is coming in an hour, in this same message. (I misclicked the send button, whoops.)

joeschmuck · Feb 7, 2024

You had me hanging, LOL. Okay, well hopefully the count is the same but if the drive spins down a few times, it's not the end of the world, you are replacing them.

Last thing, when your new drives come in, I highly recommend you set up SMART tests: a daily Short test and a weekly Long/Extended test. Maybe set up the short test on all drives to start at 1 AM every day, and the long test on all drives to start at 1:15 AM on Sunday or Monday (once a week).
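
As a rough illustration of that schedule, here is a Python sketch that says which test would start at a given time. It picks Sunday for the long test, the function name is made up, and it is not how TrueNAS schedules tests internally.

```python
from datetime import datetime


def smart_tests_due(now: datetime) -> list[str]:
    """Return the SMART tests that would start at this exact minute."""
    due = []
    # Short test: every day at 1:00 AM.
    if now.hour == 1 and now.minute == 0:
        due.append("short")
    # Long test: Sunday at 1:15 AM (weekday() is 0 for Monday, 6 for Sunday).
    if now.weekday() == 6 and now.hour == 1 and now.minute == 15:
        due.append("long")
    return due
```

The 15-minute offset simply keeps the long test from starting at the same moment as the short one.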

Cheers,
-Joe

Zareon · Feb 7, 2024

joeschmuck said:
You had me hanging, LOL. Okay, well hopefully the count is the same but if the drive spins down a few times, it's not the end of the world, you are replacing them.

Last thing, when your new drives come in, I highly recommend you set up SMART tests: a daily Short test and a weekly Long/Extended test. Maybe set up the short test on all drives to start at 1 AM every day, and the long test on all drives to start at 1:15 AM on Sunday or Monday (once a week).

Cheers,
-Joe

AAAAAND, that's it, I have the answer...
****drums*****
24555 on the second take!
I still have a fairly recurrent read/write noise (maybe every 30-45s), much lower this time though. It was already there before I upgraded to Cobia.

But maybe it's due to the SMR disks, maybe it'll be sorted out with the change to CMRs!

For S.M.A.R.T. tests, I have a short one once a day, and I'll set up an extended one once a week :)
(But they didn't show any errors here)

Once again (for the 25th time), a very big thank you. At last I can work more peacefully, and that's quite pleasant! It's hard to concentrate with a lawnmower next to your ears!

Zareon · Feb 7, 2024

My two new disks are now up and ready!
But my pool is still tagged as degraded on the dashboard, and I don't see anything wrong when I go to the pool.

Do you have any idea why? I ran a manual Short S.M.A.R.T. test after I installed the new disks, but the pool still shows as degraded :(

joeschmuck · Feb 7, 2024

Zareon said:
24555 on the second take !
I still have a fairly recurrent read/write noise (maybe every 30-45s)

While I don't know this for certain, I suspect it is the SMR drive cleaning house. When the drive must change data on an inner track, it needs to rewrite at least 3 tracks of data, and odds are it is much more. This is why SMR drives are great for archiving (write once, read many) but not for lots of writing. The drive will store the new data on the drive and, when it has the opportunity, reorganize it. That is a slow process, even slower if the drive holds a lot of data. So, if you have the System Dataset on the HDD pool, and it is written to every 5 minutes (I'm not sure how often for SCALE), then the drive is constantly rewriting data. Anyway, I could be completely wrong; it could be something else going on with the drive.

A test if you desire.
1) Power down the computer.
2) Disconnect the Data Cables to the drives, leaving only the power cable.
3) Power up.
4) Listen: is there still that same noise? If yes, and you can leave your system in that state for a few hours to let it clean up, the issue might go away. But the problem is likely to return once you reconnect the data cables and start TrueNAS again.
5) Power off.
6) Reconnect the data cables.
7) Power on and you are done.

Zareon said:
For S.M.A.R.T. tests, I have a short one once a day, and I'll set up an extended one once a week :)
(But they didn't show any errors here)

You are running a Conveyance Test, not a Short Test. Not the same thing although close. As for lack of errors, it means the drives do not recognize any physical errors with the hard drive. A good thing.

Zareon said:
But my pool is still tagged as degraded on the dashboard, and I don't see anything wrong when I go to the pool.

Odds are you have some ZFS errors. Follow these steps:
1) Run zpool status -v. Odds are you will see some errors.
2) If you have errors, run zpool scrub Hector_v2 -w and wait for the scrub to complete. The command line will be unusable until the scrub is completed. Remove the "-w" if you want the command prompt back immediately while the scrub continues in the background. I'm saying to use the "-w" so you will know when the scrub is complete: the prompt will return.
3) Run another status check (step 1).
4) Do you still have errors? If the only errors are non-zero READ/WRITE/CKSUM values, with no files listed as corrupt, go to step 5. Otherwise you will need to delete the corrupt files first. Post the output of the status command if you have any questions about what you are doing.
5) We should be able to clear the errors now by running zpool clear Hector_v2 and then performing step 1 again. All should be good. If not, post the results of the above steps.
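
The check in step 4 can be sketched in Python, if you would rather not read the table by eye. This assumes the usual zpool status layout (a NAME STATE READ WRITE CKSUM header followed by one row per device) and the usual "Permanent errors have been detected" wording when files are damaged. The function names are hypothetical.

```python
def devices_with_errors(status_text: str) -> dict[str, tuple[str, str, str]]:
    """Map device name to (READ, WRITE, CKSUM) for rows with non-zero counters."""
    errors = {}
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True
            continue
        if in_config:
            if len(fields) < 5:
                in_config = False  # blank or short line ends the device table
                continue
            name, _state, read, write, cksum = fields[:5]
            # Counters are kept as text because large values may be abbreviated.
            if (read, write, cksum) != ("0", "0", "0"):
                errors[name] = (read, write, cksum)
    return errors


def lists_corrupt_files(status_text: str) -> bool:
    """Return True if the status output reports permanently damaged files."""
    return "Permanent errors have been detected" in status_text
```

If devices_with_errors returns something but lists_corrupt_files is False, you are in the "go to step 5" case.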

Let's say we clear the errors. Odds are these will return as long as you are using the SMR drives and writing data.

Why does this happen? The drive is asked for data, and if it fails to return it fast enough, an error is logged. It does not mean your data is corrupt in this situation.

Please understand that I generalized some of the stuff I said above, enough that a person can understand what is basically going on.

Best of luck,
-Joe

Zareon · Feb 7, 2024

joeschmuck said:
1) Power down the computer.
2) Disconnect the Data Cables to the drives, leaving only the power cable.
3) Power up.
4) Listen: is there still that same noise? If yes, and you can leave your system in that state for a few hours to let it clean up, the issue might go away. But the problem is likely to return once you reconnect the data cables and start TrueNAS again.
5) Power off.
6) Reconnect the data cables.
7) Power on and you are done.

Maybe I'll try this if I have the time, but for now I think I'll leave it as it is until I buy a new disk to replace the last SMR in my pool!

joeschmuck said:
1) Run zpool status -v. Odds are you will see some errors.
2) If you have errors, run zpool scrub Hector_v2 -w and wait for the scrub to complete. The command line will be unusable until the scrub is completed. Remove the "-w" if you want the command prompt back immediately while the scrub continues in the background. I'm saying to use the "-w" so you will know when the scrub is complete: the prompt will return.
3) Run another status check (step 1).
4) Do you still have errors? If the only errors are non-zero READ/WRITE/CKSUM values, with no files listed as corrupt, go to step 5. Otherwise you will need to delete the corrupt files first. Post the output of the status command if you have any questions about what you are doing.
5) We should be able to clear the errors now by running zpool clear Hector_v2 and then performing step 1 again. All should be good. If not, post the results of the above steps.

I was going to TrueNAS to do what you said here, and I saw that the error had gone. Maybe it just needed to quietly do a few scans or something.

Thanks for everything, now that my main problem is solved, I'm going to leave the hardware alone for a while until I change my next disk.

In a few days, I'll be able to upgrade the CPU, Motherboard and RAM, which will be great!

Thank you all for your help, this forum is really superb.
