Periodically high cpu cost

flymin

Dabbler
Joined
Oct 8, 2022
Messages
10
Hi,
I am seeing periodic spikes in CPU usage on my TrueNAS SCALE system (22.12-BETA.1). Is this normal? I checked with `htop`, as shown below:
1665283311596.png

Does anybody know what this process (`dockerd -H fd://`) is doing, and is it possible to disable it?
 

flymin

Dabbler
Joined
Oct 8, 2022
Messages
10
Update:
I have noticed that this is related to the docker/docker-socket service. If I stop Docker completely, the spikes go away. However, without Docker I cannot start my TrueCharts apps. I thought this was a Kubernetes-based system, so why do all apps actually run in Docker?
I also see some errors in the Docker status, which I think I can get rid of with TrueTools. However, there are still some periodic jobs running in Docker.
1665325173933.png

Why do these jobs need to start and stop all the time?
 

shimian5

Dabbler
Joined
Dec 16, 2014
Messages
43
I am having the exact same issue as you. Were you able to pinpoint how to get this under control?
 

shimian5

Dabbler
Joined
Dec 16, 2014
Messages
43
I opened a ticket for this, as I cannot figure out how to resolve it. I don't see any obvious problems, and the issue occurs even with very few containers running.

 

c77dk

Patron
Joined
Nov 27, 2019
Messages
468
As you said, the system uses k8s, but that also means it needs a container runtime (CRI), which in this case is Docker. So if you kill Docker, you kill the CRI.
Take a closer look at the pods that are running and see whether you can apply resource limits to them, if you think they use too much CPU.
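For anyone who wants to see what a resource limit looks like, here is a minimal sketch in Python. It assumes the standard Kubernetes container spec layout (`resources.limits.cpu`) and CPU quantities such as "500m" (half a core). The helper names `parse_cpu_quantity` and `with_cpu_limit`, and the container name and image, are made up for illustration. Note that a limit like this throttles the containers in a pod; it would not necessarily cap the CPU used by the `dockerd` process itself.

```python
def parse_cpu_quantity(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as "500m" or "2" into cores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        # "m" means millicores: 1000m equals one full core.
        return int(quantity[:-1]) / 1000.0
    return float(quantity)


def with_cpu_limit(container: dict, cpu_limit: str) -> dict:
    """Return a copy of a container spec with resources.limits.cpu set."""
    updated = dict(container)
    resources = dict(updated.get("resources", {}))
    limits = dict(resources.get("limits", {}))
    limits["cpu"] = cpu_limit
    resources["limits"] = limits
    updated["resources"] = resources
    return updated


if __name__ == "__main__":
    # Placeholder container entry; name and image are hypothetical.
    container = {"name": "example-app", "image": "example/image:latest"}
    limited = with_cpu_limit(container, "500m")
    print(limited["resources"])
    print(parse_cpu_quantity("500m"), "cores")
```

The original dictionary is left untouched, so the limited spec can be compared with the unlimited one before anything is applied.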
 

shimian5

Dabbler
Joined
Dec 16, 2014
Messages
43
This issue occurs even with no or few pods running. Just starting one pod causes dockerd to consume 600-1000% of CPU in short bursts.
 

sstruke

Dabbler
Joined
Feb 2, 2017
Messages
37
TrueNAS SCALE Bluefin is not even possible to test because of such high CPU consumption. They messed something up!
All the changes I tried didn't help at all.
 

shimian5

Dabbler
Joined
Dec 16, 2014
Messages
43
In my Jira ticket I was told it is a known issue, but I was not given an explanation of why it's so vastly different from 22.02. They told me it had to do with the Docker zfs driver, but to my knowledge 22.02 uses the same setup... so yeah, I don't know.
 

espenfjo

Cadet
Joined
Aug 15, 2022
Messages
9
22.12 is impacted because of an updated k3s. Kubernetes and k3s have decided to move away from dockershim as their main container backend, which you can start learning about by heading here.

This has required several changes to how k3s/k8s and zfs interact with each other, and it seems like they (iX) have landed on using overlayfs as a base instead of zfs datasets and snapshots, as was the case on 22.02. It is a bumpy road right now, but as you can see from the Jira and GitHub issues linked in my previous post, they are on top of it and it will likely (hopefully) be solved by RC1.

If you want to dabble with the BETAs as they are right now, you could probably download k3s v1.23.something from the k3s GitHub and use that instead of the bundled version (1.24). This works, kind of; at least it's better than nothing right now.
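If it helps when picking a release, here is a rough Python sketch that checks whether a k3s version string is older than 1.24, the release in which upstream Kubernetes removed dockershim. The function names are made up for illustration, and the check only looks at the version number; it says nothing about whether a given binary will actually run on SCALE.

```python
import re


def parse_k3s_version(version: str) -> tuple:
    """Split a k3s version such as "v1.23.12+k3s1" into (major, minor, patch)."""
    match = re.match(r"^v?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def predates_dockershim_removal(version: str) -> bool:
    """True for releases older than 1.24, where upstream dropped dockershim."""
    major, minor, _patch = parse_k3s_version(version)
    return (major, minor) < (1, 24)


if __name__ == "__main__":
    # Example version strings in the format k3s uses for its releases.
    for candidate in ("v1.23.12+k3s1", "v1.24.0+k3s1"):
        print(candidate, predates_dockershim_removal(candidate))
```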

It is a bit weird seeing how angry you guys are while using a BETA version. It's marked as BETA for a reason...
 

flymin

Dabbler
Joined
Oct 8, 2022
Messages
10
May I ask whether changing the k3s version breaks system updates? I do not mind swapping it again after an update (if needed); I only need the update itself not to fail.

Update:
So I just replace /usr/local/bin/k3s with the pre-built k3s executable from https://github.com/k3s-io/k3s/releases? I tried several versions, but the service won't start.
 

jenksta

Dabbler
Joined
Sep 4, 2022
Messages
10
The TrueCharts catalog refresh thrashes the CPU. Could it be that?
 

flymin

Dabbler
Joined
Oct 8, 2022
Messages
10
That is not it. The TrueCharts catalog refresh is multi-threaded, so its high CPU usage seems reasonable. The problem above not only causes a spike every few seconds but also eats more of my CPU day by day (after three days without rebooting, CPU consumption stays at 100%).
1666340781021.png
 

flymin

Dabbler
Joined
Oct 8, 2022
Messages
10
Following espenfjo's suggestion to download k3s v1.23 from the k3s GitHub and use it instead of the bundled version (1.24):
I am now on k3s version "v1.23.12+k3s1 (66309a8e) go version go1.17.13" and everything seems fine. The CPU usage across one day is shown below:
1666601655453.png
The only issue is that I do not know how to hot-swap the binary. As I posted above, I changed the version, but my applications did not start. I then tried to change back, but had no luck with that either. Eventually, I reset everything related to Kubernetes (removing all images and starting over from choosing a new pool), and it just works now.

One more question: I do not understand how TrueNAS SCALE handles HTTP proxies in applications. I would expect a clear rule on whether apps use the system-wide proxy or not. I have had some network (proxy) issues in apps since changing to v1.23.12 and am not sure of the cause.
 

danielromney

Cadet
Joined
Oct 26, 2022
Messages
2
I am having the same issue, and I have this installed on 3 separate systems.
I have a Zimaboard that I am using purely for test purposes and 2 PowerEdge R230 systems.
All three exhibit the same periodic spikes in CPU usage, which also affects operating temperature. After updating to Bluefin, my systems are all running 5-10 degrees hotter (depending on the system).
I have also seen that it is the dockerd service that is causing the spikes. I see one about every 8-10 seconds.
This was not happening on Angelfish.
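If anyone wants to put a number on the spike interval instead of eyeballing it, here is a small Python sketch. It assumes you sample the CPU ticks of dockerd yourself (fields 14 and 15 of /proc/<pid>/stat, utime and stime) and that there are 100 clock ticks per second, which is typical but should be checked with `os.sysconf("SC_CLK_TCK")`. The function names, the threshold and the sample values are made up for illustration.

```python
from statistics import mean


def cpu_percent(ticks_before: int, ticks_after: int, seconds: float,
                ticks_per_second: int = 100) -> float:
    """CPU usage, in percent of one core, between two utime+stime samples."""
    return (ticks_after - ticks_before) / ticks_per_second / seconds * 100.0


def spike_starts(samples, threshold):
    """Timestamps where usage rises from below the threshold to at or above it."""
    starts = []
    previous = 0.0
    for timestamp, usage in samples:
        if usage >= threshold and previous < threshold:
            starts.append(timestamp)
        previous = usage
    return starts


def mean_spike_interval(samples, threshold):
    """Average gap in seconds between spike starts, or None with fewer than two."""
    starts = spike_starts(samples, threshold)
    if len(starts) < 2:
        return None
    return mean(b - a for a, b in zip(starts, starts[1:]))


if __name__ == "__main__":
    # Made-up (second, percent) samples, not measurements from a real system.
    samples = [(0, 5), (1, 6), (2, 650), (3, 700), (4, 8),
               (11, 640), (12, 7), (20, 800)]
    print(spike_starts(samples, threshold=300))
    print(mean_spike_interval(samples, threshold=300))
```

A spike is counted only once, at the moment usage crosses the threshold, so a burst that lasts several samples does not shorten the measured interval.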
 

HootleTootle

Cadet
Joined
Nov 12, 2022
Messages
3
Same issue here too, on a Ryzen V1500B system. The CPU spikes to 100% and 80 °C or more. It seems to be something in Docker doing it.
 

Glowtape

Dabbler
Joined
Apr 8, 2017
Messages
45
Re: espenfjo's post above about k3s moving away from dockershim and iX switching from zfs datasets and snapshots to overlayfs:

Oh dear.
 

flymin

Dabbler
Joined
Oct 8, 2022
Messages
10
I think RC.1 solved the problem for me. Although the CPU usage is still a little higher than when I was using k3s v1.23.12+k3s1, I'd like to give credit to the developers.
1668821932282.png
 

danielromney

Cadet
Joined
Oct 26, 2022
Messages
2
Same for me as well. The high CPU usage seems to be solved, and my servers are all running considerably cooler again: a 10 °C difference, at times even more.
 