1 SMBD thread stuck at 100% CPU

truenasuserh

Cadet
Joined
Sep 29, 2020
Messages
9
I have a single SMBD thread continuously running at 100%. The CPU load is not related to the SMB share activity/load. Disabling and re-enabling the SMB service does not solve the issue.

So far the only way I have been able to solve the issue is a system reboot of TrueNAS. After a reboot, the issue recurs after approximately 1 week. I've had the same issue several times before on TrueNAS beta 1 and beta 2.

I am running TrueNAS-12.0-RC1 on a system with 16 CPU threads.

Can anybody help me out on how to debug this issue based on the PID of the specific SMBD thread?
 

mbender71

Cadet
Joined
Oct 5, 2020
Messages
1
I was happy (though sad) to see that someone else is experiencing the exact same problem that I've been seeing for a couple of weeks. I am running TrueNAS-12.0-RC1 as well, on a system with 8 CPU threads and 128 GB of RAM. Going to services and restarting SMB doesn't get rid of the smbd process that is running at 100%, but a good ole' kill command does. Hopefully someone will be able to shed some light on possible causes.

Thanks in advance!
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
If it happens again, `kill -6` the process (which will generate a core file under /var/db/system/samba4), then file a bug ticket with it.
 

truenasuserh

Cadet
Joined
Sep 29, 2020
Messages
9
I have reported this under NAS-107854.

`kill -6` did not seem to work (or the core file ended up somewhere else), so I ended up using gcore.
 

truenasuserh

Cadet
Joined
Sep 29, 2020
Messages
9
Okay, is that the same as gcore, or would it still add value to upload that one to the issue the next time it occurs?
 

truenasuserh

Cadet
Joined
Sep 29, 2020
Messages
9
It seems that `kill -6` does not create a core file there either. I'll try -3 next time I've got a process at 100% CPU.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
Samba was stuck in a tight loop in assert_no_pending_aio(). That loop is meant to decrement the counter of pending AIO requests until it reaches 0 by calling TALLOC_FREE(fsp->aio_requests[0]). The destructor for fsp->aio_requests[0] puts another request into fsp->aio_requests[0], but that entry was then being overwritten by TALLOC_FREE(). The end result was that Samba stopped decrementing the counter of pending AIO requests and got stuck in a tight loop.
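
The failure mode described above can be modeled in a few lines. This is only a simplified Python sketch, not the Samba source: `FakeFsp`, `free_request`, and `drain` are hypothetical names, and it assumes that the destructor moves the last pending request into the freed slot and that TALLOC_FREE() sets the pointer it was given to NULL after freeing it. The loop is capped at `max_iterations` so the stuck case can be observed without actually spinning forever.

```python
# Simplified model of the loop described above; this is NOT the Samba source.
# FakeFsp, free_request and drain are hypothetical names for illustration.

class FakeFsp:
    def __init__(self, n):
        self.aio_requests = [f"req{i}" for i in range(n)]
        self.num_aio_requests = n


def free_request(fsp, req):
    """Stand-in for the request destructor (assumed behavior): remove req
    and move the last pending request into the slot it occupied."""
    if req is None:
        return  # freeing NULL is a no-op, so the counter is not touched
    i = fsp.aio_requests.index(req)
    fsp.num_aio_requests -= 1
    fsp.aio_requests[i] = fsp.aio_requests[fsp.num_aio_requests]
    fsp.aio_requests[fsp.num_aio_requests] = None


def drain(fsp, clear_slot_after_free, max_iterations=1000):
    """Free slot 0 until the counter reaches 0. Return the number of
    iterations used, or None if still spinning at max_iterations."""
    for iteration in range(max_iterations):
        if fsp.num_aio_requests == 0:
            return iteration
        free_request(fsp, fsp.aio_requests[0])
        if clear_slot_after_free:
            # Models TALLOC_FREE() setting the pointer to NULL, which
            # overwrites the request the destructor just moved into slot 0.
            fsp.aio_requests[0] = None
    return None


print(drain(FakeFsp(3), clear_slot_after_free=True))   # stuck: never reaches 0
print(drain(FakeFsp(3), clear_slot_after_free=False))  # drains normally
```

In this model, once slot 0 has been cleared, every later pass frees nothing, so the counter never changes and the loop spins. With a single pending request the model drains fine, which would fit a bug that only shows up under particular load.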
 
Joined
Oct 27, 2020
Messages
2
Well, it seems that there is still an issue with the 12 release version. I had 9 smbd processes maxed out today. The VM with TrueNAS was not responsive at all, so I had to kill the VM. After the reboot I have seen one smbd process go rogue three times (still monitoring in the CLI using htop).
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
Okay. Maybe a different situation than the one I found. Can you `kill -6` one of the processes stuck at 100% CPU and send me a PM with a zipped-up core (/var/db/system/cores)?
 

Jon Moog

Dabbler
Joined
Apr 24, 2017
Messages
21
Tracked down issue. Should be fixed for 12.0 release.
I'll see your one smbd process stuck at 100 percent and raise you 31 more... On 12 Release I was seeing the 100 percent for an smbd process and ignored it as smb was working (mostly Time Machine backups) fine as far as I could tell. A couple of days later in rather short order 31 other smbd processes showed up at 100 percent (or as much as they could get) and the service went dead or slow enough to be mistaken for dead. The only way I was able to recover was killall -9 smbd as the GUI failed to shut down the service. I haven't seen it again since then fortunately. This is a machine with 32 threads FWIW.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
Can you PM me a debug please (system->advanced->save debug)?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
In the future, you can `kill -6` one of the processes and send me a core file. This will help me to determine where it's stuck (do note that this is a rather drastic step so only one will be needed).
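
For anyone who wants to script that step, here is a rough Python sketch that picks a single busy smbd process and sends it SIGABRT (signal 6, the same signal `kill -6` sends). The function names and the 90% threshold are assumptions made up for this example, and whether and where a core file is actually written depends on the system's core dump settings.

```python
import os
import signal
import subprocess


def busiest_smbd(ps_output, threshold=90.0):
    """Return the PID of the smbd process with the highest %CPU at or above
    threshold, or None. Expects lines of the form 'PID %CPU COMMAND'."""
    best = None
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # skip the header row and malformed lines
        try:
            cpu = float(parts[1])
        except ValueError:
            continue  # unparseable CPU column
        if parts[2].strip() != "smbd" or cpu < threshold:
            continue
        if best is None or cpu > best[1]:
            best = (int(parts[0]), cpu)
    return best[0] if best else None


def abort_busiest_smbd():
    """Send SIGABRT to one busy smbd process. Needs root; only one process
    is signalled, since a single core file is enough."""
    out = subprocess.run(["ps", "-axo", "pid,pcpu,comm"],
                         capture_output=True, text=True, check=True).stdout
    pid = busiest_smbd(out)
    if pid is not None:
        os.kill(pid, signal.SIGABRT)
    return pid
```

Nothing is signalled until `abort_busiest_smbd()` is called explicitly. Because the signal terminates that smbd process, any client session it is serving will be dropped.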
 

Grubster

Cadet
Joined
Nov 19, 2020
Messages
3
Did this get resolved?
I'm getting 100% CPU in smbd on the TrueNAS release version, installed from scratch.
Have to reboot to clear it.

X11SSH / 32gb ECC RAM / E3-1260L v5 / 8 x 2tb SSD RAIDZ2 / 2 x 256tb Mirror / H220 reflashed to IT Mode
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
Can you PM me a debug? Above case IIRC was a configuration issue in samba.
 