FreeNAS running slow, high CPU & HDD usage and I have no idea why

Status
Not open for further replies.

jeffmnelson

Dabbler
Joined
May 8, 2018
Messages
11
I'm very new to FreeNAS. I have a server setup and running configured with Plex Media Server & SMB sharing. Every so often the server runs very slowly for a few days and then it's fine. When it's running slow the HDD & CPU usage are pretty much constantly at 100% but I have no idea how to check what's chewing through the resources. Also, the HDD light on the machine is just lit constantly as well when it's running slowly. I'm sorry for my ignorance and I'd be open to anything anyone has to suggest.
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338

jeffmnelson

Dabbler
Joined
May 8, 2018
Messages
11
Exact parallel, good call. I'm scrubbing my main pool monthly, right around the time of the slowdown. The pool is 5 x 6TB in ZFS. The scrub seems to take days, though. Is that normal? During this time the server is pretty much unusable. Is there anything I can do to help this? Do I need to run the scrub monthly?
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
Which FreeNAS version are you running? In FreeNAS 11.1 ZFS scrubbing/resilvering was improved substantially. Which other hardware components (motherboard, CPU, memory, HBA, ...)?

The improvements in 11.1, together with a scrub schedule that runs it over the weekend (or whichever time might be convenient), might help. Running a scrub twice per month seems to be fairly common among forum participants.
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
One more question: Are you seeing some swap usage during/after each scrub?
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
How full is your storage pool? The scrub only goes over the data, not the free space, so the more data you have, the longer the scrub will take.
Is the pool RAIDz2? More vdevs in the pool will make the overall performance of the pool faster. I have a pool that completes a scrub in about 2.5 hours and another with the same amount of data that takes about 5 hours. The difference is that the faster pool has 2 vdevs where the slower pool has only 1... In very general terms, more vdevs = more performance.

Here are some guides that might help you understand the terminology and enhance your understanding:

Slideshow explaining VDev, zpool, ZIL and L2ARC
https://forums.freenas.org/index.ph...ning-vdev-zpool-zil-and-l2arc-for-noobs.7775/

Terminology and Abbreviations Primer
https://forums.freenas.org/index.php?threads/terminology-and-abbreviations-primer.28174/
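The scaling described above (scrub time roughly proportional to data divided by vdev count) can be sketched as back-of-the-envelope shell arithmetic. The data size and per-vdev throughput below are illustrative assumptions, not measurements from any real pool:

```shell
#!/bin/sh
# Illustrative only: scrub time ~ data / (vdevs * per-vdev throughput).
data_gib=4096          # assumed 4 TiB of data in the pool
per_vdev_mibs=200      # assumed per-vdev scrub throughput in MiB/s

# Hours to scrub with 1 vdev vs 2 vdevs (integer arithmetic, rough)
one_vdev=$(( data_gib * 1024 / per_vdev_mibs / 3600 ))
two_vdevs=$(( data_gib * 1024 / (2 * per_vdev_mibs) / 3600 ))
echo "1 vdev: ~${one_vdev} h, 2 vdevs: ~${two_vdevs} h"
```

With these made-up numbers the pool scrubs in about 5 hours with one vdev and roughly half that with two, which is the pattern described above.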
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Is there anything I can do to help this?
It depends on a few factors. Can you give a full rundown on the hardware you are using?
Also, will you share with us the output of zpool list and zpool status and enclose those in code blocks?
 

jeffmnelson

Dabbler
Joined
May 8, 2018
Messages
11
Which FreeNAS version are you running? In FreeNAS 11.1 ZFS scrubbing/resilvering was improved substantially. Which other hardware components (motherboard, CPU, memory, HBA, ...)?

The improvements in 11.1, together with a scrub schedule that runs it over the weekend (or whichever time might be convenient), might help. Running a scrub twice per month seems to be fairly common among forum participants.

The hardware is just a smattering of parts I had laying around, so nothing high end: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz, 8GB RAM, Asus motherboard (not sure of the model). I am running FreeNAS 11.1-U1. Rescheduling the scrub is tough, as it's a home server, so there's really no time when it's not in use (the weekend is probably its busiest time). I currently have it starting its run at 2am on the first Monday of every month, but since it runs for like 3 days, there is some time it's running when I'd rather it didn't. Good suggestions though.
 

jeffmnelson

Dabbler
Joined
May 8, 2018
Messages
11
It depends on a few factors. Can you give a full rundown on the hardware you are using?
Also, will you share with us the output of zpool list and zpool status and enclose those in code blocks?
I only have the one zpool so I won't bother posting the list, but below is the status. The storage space used is about 60%.
Code:
  pool: Main
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0 in 1 days 13:20:27 with 0 errors on Tue May  8 13:24:28 2018
config:

        NAME                                            STATE     READ WRITE CKSUM
        Main                                            ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            gptid/121d2c15-dded-11e6-a75e-002618a367c9  ONLINE       0     0     0
            gptid/12cda8b4-dded-11e6-a75e-002618a367c9  ONLINE       0     0     0
            gptid/13843963-dded-11e6-a75e-002618a367c9  ONLINE       0     0     0
            gptid/1434dca4-dded-11e6-a75e-002618a367c9  ONLINE       0     0     0
            gptid/14e0c276-dded-11e6-a75e-002618a367c9  ONLINE       0     0     0

errors: No known data errors
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
I only have the one zpool so I won't bother posting the list but below is the status.
The zpool list shows total space and the percentage of free and used space in the pool. It is good information to have when troubleshooting, if it isn't too much trouble. The duration of the scrub is affected by the amount of data, after all.
 

MrToddsFriends

Documentation Browser
Joined
Jan 12, 2015
Messages
1,338
Not a clue. How would I go about checking that?

When looking at the Reporting/Memory graphs are you able to see some swap usage? Is a temporal correlation apparent between starting a scrub (in particular the first one after a reboot) and start of swap usage?
 

jeffmnelson

Dabbler
Joined
May 8, 2018
Messages
11
When looking at the Reporting/Memory graphs are you able to see some swap usage? Is a temporal correlation apparent between starting a scrub (in particular the first one after a reboot) and start of swap usage?
Looking at the graph, the Swap is never used.
 

jeffmnelson

Dabbler
Joined
May 8, 2018
Messages
11
The zpool list shows total space and the percentage of free and used space in the pool. It is good information to have when troubleshooting, if it isn't too much trouble. The duration of the scrub is affected by the amount of data, after all.
Code:
Main		  27.2T  18.1T  9.17T		 -	23%	66%  1.00x  ONLINE  /mnt
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
You might want to move to the latest version of FreeNAS (FreeNAS-11.1-U4) and upgrade your pool:
Code:
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.

One of the new features in FreeNAS is that you can set a "Resilver Priority":
http://doc.freenas.org/11/storage.html#resilver-priority
Since scrub uses the same code as resilver, I would hope there is some relationship between setting this priority and the speed of a scrub, but that is purely theoretical... Might be worth a try. If you lower the priority, it might make the scrub take longer, but leave the system more usable during the scrub. Scrubs are needed to check for, and potentially correct, corruption in the data.
This right here:
Code:
Main		 27.2T  18.1T  9.17T		 -	23%	66%  1.00x  ONLINE  /mnt

It shows that your pool is 66% full, so 3 days is probably about right for that amount of data (18TB). The only thing that would definitely make it faster is to build a new pool with more vdevs and move the data over to it... I imagine that (for the amount of data) this is actually performing pretty well. I have a server at work that is using 6TB drives at only 39% capacity and it also takes around 3 days. It is not configured for speed, though. It is configured for maximum capacity and uses gzip compression to pack the maximum data into the space available.
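As a sanity check, the figures already posted in this thread (18.1T allocated, scrub finished in 1 day 13:20:27) imply an effective pool-wide scrub rate. The shell arithmetic below just restates those numbers; treating 18.1T as TiB is an assumption about zpool's units:

```shell
#!/bin/sh
# Effective scrub rate from the zpool output above (assumes 18.1T = TiB).
alloc_gib=18534                        # ~18.1 TiB expressed in GiB
secs=$(( 37 * 3600 + 20 * 60 + 27 ))   # 1 day 13:20:27 = 134427 s
rate=$(( alloc_gib * 1024 / secs ))    # pool-wide MiB/s, ~141
echo "pool-wide: ~${rate} MiB/s, per drive (5 disks): ~$(( rate / 5 )) MiB/s"
```

Roughly 141 MiB/s across the pool is under 30 MiB/s per disk, which gives a feel for why a single RAIDZ1 vdev takes days to scrub this much data.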
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz, 8GB RAM
PS. I am not telling you that you must upgrade, but you would very likely get better results with more modern hardware here.
For the amount of disk space you have, the rule of thumb would put you at 32GB of system memory, and with your pool as full as it is, you should also be considering adding more storage.
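The rule of thumb referenced above (roughly 1GB of RAM per TB of raw pool capacity, a common forum guideline rather than a hard requirement) works out like this for a 27.2T pool:

```shell
#!/bin/sh
# Guideline only: ~1 GB RAM per TB of raw capacity, not a hard requirement.
pool_tb=28                  # 27.2 TB pool, rounded up
suggested_gb=$pool_tb       # 1 GB per TB -> ~28 GB; 32 GB is the nearest
                            # practical memory configuration
echo "suggested RAM: ~${suggested_gb} GB (installed: 8 GB)"
```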
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
Running a scrub twice per month seems to be fairly common among forum participants.
If I recall correctly, I have my scrub (which completes in just a few hours) scheduled to run once a week, while I am at work. Each pool runs the scrub on a different day so I get an email telling me it happened but it never interferes with my use of the NAS.
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
PPS. This pool is made with 6TB drives and it is RAIDz1. It is highly recommended to use RAIDz2 (or better) when using drives larger than 1TB, and that has been the guidance for as long as I can remember, due to the possibility of a second drive failing during the replacement of a failed drive. However long it takes to run a scrub is roughly how long it would take to rebuild the pool if you have to replace a drive, and during the rebuild your pool would be completely unprotected from an additional drive fault. This isn't as much of a concern (perhaps) when drives are new, but when they start to fail from old age, they tend to go around the same time.
In the latter part of last year (2017) I replaced all 12 drives (one at a time) in my primary storage pool because the drives in it had gotten to be over 5 years old and were starting to go at regular intervals. It was one a month, from natural causes, for a while, and then I had two go at the same time and decided to bite the bullet and replace the rest all at once. Just something to think about.
 

jeffmnelson

Dabbler
Joined
May 8, 2018
Messages
11
PPS. This pool is made with 6TB drives and it is RAIDz1. It is highly recommended to use RAIDz2 (or better) when using drives larger than 1TB, and that has been the guidance for as long as I can remember, due to the possibility of a second drive failing during the replacement of a failed drive. However long it takes to run a scrub is roughly how long it would take to rebuild the pool if you have to replace a drive, and during the rebuild your pool would be completely unprotected from an additional drive fault. This isn't as much of a concern (perhaps) when drives are new, but when they start to fail from old age, they tend to go around the same time.
In the latter part of last year (2017) I replaced all 12 drives (one at a time) in my primary storage pool because the drives in it had gotten to be over 5 years old and were starting to go at regular intervals. It was one a month, from natural causes, for a while, and then I had two go at the same time and decided to bite the bullet and replace the rest all at once. Just something to think about.

Ah, the constant battle between performance, redundancy & budget. Thanks so much for all of your feedback. I totally agree with your assessment, and in a perfect world the server would have the extra drive and RAM, but I had to choose between that and staying married, so it was a judgement call. Instead, I have the important, irreplaceable stuff from the server backed up elsewhere, and if the array craps out and I lose the rest, it'll be a pain but not catastrophic.

Thanks again for the feedback. It was so helpful.
 

diskdiddler

Wizard
Joined
Jul 9, 2014
Messages
2,377
Which FreeNAS version are you running? In FreeNAS 11.1 ZFS scrubbing/resilvering was improved substantially. Which other hardware components (motherboard, CPU, memory, HBA, ...)?

The improvements in 11.1, together with a scrub schedule that runs it over the weekend (or whichever time might be convenient), might help. Running a scrub twice per month seems to be fairly common among forum participants.


I giddily thought this was the case the first time I scrubbed under 11, but nah, it's still atrociously slow:

" scan: scrub repaired 0 in 1 days 04:19:39 with 0 errors on Thu May 3 03:20:44 2018"

.....
 