Samba using up most of the RAM

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
Here's a more targeted script. We can do a ustack() on getcwd calls to see if you're landing there:
Code:
#!/usr/sbin/dtrace -s
/* Count user-level stack traces for every getcwd() call made by the traced process. */
pid$1::getcwd:entry
{
    @[ustack()] = count();
}


It will take the PID of the smbd process as an argument like the other script, and should be less intrusive / less likely to kill the smbd process.
 

michael.samer

Dabbler
Joined
Feb 19, 2018
Messages
21
Restarting smbd every 24 hours doesn't fix the problem, and the suggested fixes/settings are already applied, so I tested 11.0U1/U2 against 11.1U4 (currently only one user is using the share), running the same uploads on both (same VM, different but identical SAN storage):
a) FNAS11.1U4:
last pid: 15733; load averages: 0.25, 0.45, 0.52 up 2+17:53:15 09:08:55
49 processes: 1 running, 48 sleeping
CPU: 1.4% user, 0.0% nice, 1.9% system, 0.8% interrupt, 95.8% idle
Mem: 2268M Active, 14G Inact, 455M Laundry, 14G Wired, 648M Free
ARC: 10G Total, 1420M MFU, 8188M MRU, 84M Anon, 80M Header, 767M Other
9196M Compressed, 27G Uncompressed, 3.03:1 Ratio
Swap: 32G Total, 198M Used, 32G Free, 8K In
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
19535 root 1 36 0 54885M 16772M select 5 155:24 23.78% smbd
19555 root 1 21 0 196M 165M select 3 37:37 2.35% smbd
4451 root 12 20 0 169M 29236K nanslp 3 5:26 0.81% collectd
240 root 22 21 0 196M 109M kqread 3 24:49 0.49% python3.6
19427 root 1 20 0 128M 101M select 1 2:31 0.35% smbd
4372 root 19 30 0 54548K 20192K uwait 6 11:59 0.12% consul

b) FNAS11.0U2:
last pid: 29909; load averages: 1.29, 1.46, 1.29 up 5+12:23:12 09:10:07
48 processes: 2 running, 46 sleeping
CPU: 6.8% user, 0.0% nice, 6.7% system, 0.5% interrupt, 86.0% idle
Mem: 5856K Active, 370M Inact, 30G Wired, 718M Free
ARC: 27G Total, 6319M MFU, 20G MRU, 18M Anon, 166M Header, 774M Other
Swap: 31G Total, 282M Used, 31G Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
2421 root 1 98 0 361M 32980K CPU3 3 31.4H 85.82% smbd
4745 root 15 20 0 512M 105M umtxn 11 15:16 16.71% uwsgi
3539 root 1 20 0 279M 4196K select 11 6:04 0.25% smbd
5431 www 1 20 0 55644K 2412K kqread 7 0:02 0.24% nginx

After 3 days of running, 11.0 still performs as planned, while 11.1 has already crashed its smbd (and django) twice, and the test died with it. If the test keeps running flawlessly until Friday, I'll switch back to 11.0, as its samba 3.6.4 works as intended, compared to the 3.7.0 of FN11.1. I wonder why they haven't switched back to 3.6.x instead of micro-fixing a leak?!
I'm happy so far to be able to stay with FreeNAS. An export of the 11.1 settings and import into 11.0 failed, as the AD+SMB settings seem to be quite different and the GUI couldn't digest them, so I did a manual import, and that seems to fix the problem :)
 

Chronic

Cadet
Joined
Oct 28, 2014
Messages
9
I'm hoping that U5 has more fixes for this problem than just setting wide links to 'yes' and disabling unix extensions, because those two changes have not made my smbd any better. It still eats RAM until it starts to swap, and when swap is full, it dies completely. :(
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
I found the memory leak in samba and patched it. I also sent the patch to samba-technical for review where it was incorporated into Samba 4.7.7. I plan to do some additional testing with U5 before its release. The patch fixes the memory leak that I was able to reproduce.

It was a simple fix. https://github.com/freenas/samba/commit/c66b8b67e828b22195a08194cbc71740e48a1f47
 

michael.samer

Dabbler
Joined
Feb 19, 2018
Messages
21
So today we switched our storage back to the new VM with the older 11.0-U2, as 14 days of testing under heavy workload worked flawlessly. I've also seen that 11.1-U5 is still pending, so I'm happy that I checked and switched to the older version instead of waiting.
So my case is closed. Next time we will have to test newer versions instead of blindly hoping everything works as before.
Cheers
Michael
 

tortue

Dabbler
Joined
Jan 13, 2018
Messages
11
I have 11.1-U4 and I have the same problem, the swap memory leak. Is there any solution?

Update to 11.1-U5; it includes the fix (28585) for the Samba memory leak.
 

Morpheus187

Explorer
Joined
Mar 11, 2016
Messages
61
I updated our production system a week ago and the issue went away completely. Thanks to all the people involved, especially anodos for his outstanding work!
 

NAK

Dabbler
Joined
Feb 5, 2020
Messages
16
I don't want to start a new thread, so: after upgrading to 11.3 I get a new smb memory leak. It seems to happen when I move data from one smb share to another using my Windows client.
Has anyone else seen this behavior?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
Can you describe the memory leak more specifically? Perhaps with ps or top output? Maybe PM me a debug.
 

NAK

Dabbler
Joined
Feb 5, 2020
Messages
16
Will do, when I am home again. What further information would you like to get? Setup/pool layout / anything else?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
Exact steps to reproduce the issue, estimation of how quickly resident memory for the smbd process is growing, description of contents of directory being copied, and anything else you can think of.
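One way to put a number on "how quickly resident memory is growing" is to note the smbd RES value from top at a few points in time and fit a straight line through them. The following is a minimal Python sketch under that assumption; the function name and the sample readings are hypothetical and not taken from this thread.

```python
from datetime import datetime


def rss_growth_mib_per_hour(samples):
    """Least-squares slope of resident memory (MiB) over time (hours).

    samples: list of (datetime, rss_kib) tuples taken for one smbd PID.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    t0 = samples[0][0]
    hours = [(t - t0).total_seconds() / 3600.0 for t, _ in samples]
    mib = [rss_kib / 1024.0 for _, rss_kib in samples]
    n = len(samples)
    mean_x = sum(hours) / n
    mean_y = sum(mib) / n
    var_x = sum((x - mean_x) ** 2 for x in hours)
    if var_x == 0:
        raise ValueError("samples must cover more than one point in time")
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, mib))
    return cov / var_x


# Hypothetical readings: 100 MiB, 150 MiB, 200 MiB at one-hour intervals.
samples = [
    (datetime(2020, 2, 10, 9, 0), 102400),
    (datetime(2020, 2, 10, 10, 0), 153600),
    (datetime(2020, 2, 10, 11, 0), 204800),
]
print(f"{rss_growth_mib_per_hour(samples):.1f} MiB/hour")
```

A fitted slope is less sensitive to a single odd reading than the difference between the first and last sample, which matters when smbd memory fluctuates with client activity.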
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
It would also be helpful if you could give me the talloc pool usage output for the smbd process:
smbcontrol <pid> pool-usage > /var/log/samba4/smbd.pool_usage.out
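For anyone who wants to skim a saved pool-usage file before sending it, here is a rough Python sketch that lists the largest talloc contexts. It assumes the file uses the usual talloc report layout, where each line reads "<name> contains <N> bytes in <M> blocks"; that layout may differ between Samba versions, so treat the pattern as an assumption. The context names in the sample are made up for illustration.

```python
import re

# Assumed line shape: "<indent><name>  contains  <N> bytes in  <M> blocks (ref ...) ..."
LINE_RE = re.compile(
    r"^\s*(?P<name>.+?)\s+contains\s+(?P<bytes>\d+) bytes in\s+(?P<blocks>\d+) blocks"
)


def largest_contexts(report_text, limit=5):
    """Return up to `limit` (bytes, blocks, name) tuples, largest first."""
    entries = []
    for line in report_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            entries.append(
                (int(match.group("bytes")), int(match.group("blocks")), match.group("name"))
            )
    entries.sort(key=lambda entry: entry[0], reverse=True)
    return entries[:limit]


# Hypothetical report text, not real smbd output.
sample = """\
full talloc report on 'null_context' (total 4200 bytes in 12 blocks)
    example_parent                 contains   4000 bytes in  10 blocks (ref 0) 0x1000
        example_child_a            contains   3000 bytes in   6 blocks (ref 0) 0x2000
        example_child_b            contains   1000 bytes in   3 blocks (ref 0) 0x3000
"""
for size, blocks, name in largest_contexts(sample, limit=2):
    print(size, blocks, name)
```

Note that a parent's byte count includes its children, so the top entries are usually the outer contexts; the interesting ones are the large entries deeper in the tree.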
 

NAK

Dabbler
Joined
Feb 5, 2020
Messages
16
My fault. It was some cp that filled the service bar. Thanks for your help so far!
 

fenixaz

Cadet
Joined
Jun 17, 2019
Messages
2
After upgrading from 11.3-U4.1 to 12.0, Samba started leaking memory until it takes up the entire swap. We have multiple shares for IIS (shared configuration, web farm), and smbd memory consumption increases over time, as shown in the following output.


# ps aux | grep smbd
DOMAIN\user 1850 1.0 19.4 5387576 2429756 - S 16Nov20 221:17.37 /usr/local/sbin/smbd --daemon
root 1603 0.0 0.9 177980 107984 - Ss 16Nov20 0:29.03 /usr/local/sbin/smbd --daemon
root 1681 0.0 0.8 177520 105780 - S 16Nov20 0:46.15 /usr/local/sbin/smbd --daemon
root 1687 0.0 0.8 174420 105404 - S 16Nov20 0:06.99 /usr/local/sbin/smbd --daemon
root 1856 0.0 1.2 238544 144600 - S 16Nov20 1:10.66 /usr/local/sbin/smbd --daemon
DOMAIN\user 1898 0.0 18.3 5064212 2291088 - S 16Nov20 138:25.44 /usr/local/sbin/smbd --daemon
DOMAIN\user 1899 0.0 17.1 4782608 2149152 - S 16Nov20 131:33.50 /usr/local/sbin/smbd --daemon
...

# top -oswap -w
last pid: 55349; load averages: 0.55, 0.51, 0.48 up 9+03:22:48 11:37:28
116 processes: 1 running, 115 sleeping
CPU: 2.1% user, 0.0% nice, 1.6% system, 0.0% interrupt, 96.4% idle
Mem: 586M Active, 586M Inact, 6526M Laundry, 3944M Wired, 259M Free
ARC: 1305M Total, 144M MFU, 557M MRU, 20M Anon, 24M Header, 560M Other
161M Compressed, 632M Uncompressed, 3.92:1 Ratio
Swap: 14G Total, 8634M Used, 5702M Free, 60% Inuse

PID USERNAME THR PRI NICE SIZE RES SWAP STATE C TIME WCPU COMMAND
1647 root 1 52 0 35M 0B 6252K pause 0 0:00 0.00% <nginx>
1330 root 1 52 0 19M 0B 5372K wait 2 0:00 0.00% <syslog-ng>
1674 root 11 20 0 98M 32M 0B nanslp 1 65:52 2.57% collectd
1651 root 1 20 0 46M 9644K 0B kqread 2 4:36 1.81% winbindd
1898 root 2 22 0 4951M 2243M 0B kqread 1 138:34 1.71% smbd
15248 root 1 20 0 222M 151M 0B kqread 0 2:51 1.16% smbd
...

For the other shares, everything is fine.


# cat /usr/local/etc/smb4.conf
#
# SMB.CONF(5) The configuration file for the Samba suite
# $FreeBSD$
#


[global]
dns proxy = No
aio max threads = 2
max log size = 51200
load printers = No
printing = bsd
disable spoolss = Yes
dos filemode = Yes
kernel change notify = No
directory name cache size = 0
nsupdate command = /usr/local/bin/samba-nsupdate -g
unix charset = UTF-8
log level = 1 auth_json_audit:3@/var/log/samba4/auth_audit.log
obey pam restrictions = True
enable web service discovery = True
logging = syslog@1 file
server min protocol = SMB2_02
unix extensions = No
restrict anonymous = 2
server string = FreeNAS Server
create mask = 0644
directory mask = 0755
bind interfaces only = Yes
netbios name = nas01
netbios aliases =
server role = member server
kerberos method = secrets and keytab
workgroup = <workgroup>
realm = <domain>
security = ADS
local master = No
domain master = No
preferred master = No
winbind cache time = 7200
winbind max domain connections = 10
client ldap sasl wrapping = seal
template shell = /bin/sh
template homedir = /mnt/smb_home/users_home/%D/%U
ads dns update = No
allow trusted domains = Yes
winbind enum users = Yes
winbind enum groups = Yes
idmap config <domain>: backend = rid
idmap config <domain>: range = 20000-90000000
idmap config *: backend = tdb
idmap config *: range = 90000001-100000000
smb2 leases = no
registry shares = yes
include = registry

The problems are with these shares: [web_production], [iis_production]

# net conf list
[homes]
path = /mnt/smb_home/users_home/%D/%U
browseable = no
access based share enum = yes
read only = no
guest ok = no
nfs4:chown = true
ea support = false
vfs objects = zfs_space zfsacl streams_xattr ixnas
ixnas:base_user_quota = 150G

[public]
path = /mnt/smb_share/public
access based share enum = yes
read only = no
guest ok = no
nfs4:chown = true
ea support = false
vfs objects = zfs_space zfsacl streams_xattr

[web_production]
path = /mnt/smb_share/web_production
access based share enum = yes
read only = no
guest ok = no
nfs4:chown = true
ea support = false
vfs objects = zfs_space zfsacl streams_xattr

[iis_production]
path = /mnt/smb_share/iis_production
access based share enum = yes
read only = no
guest ok = no
nfs4:chown = true
ea support = false
vfs objects = zfs_space zfsacl streams_xattr
....
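To see at a glance which smbd processes hold the most memory in a listing like the ps aux output above, something along these lines could be used. This is a small Python sketch that assumes the FreeBSD column order shown in that output (USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND) with VSZ and RSS in KiB; the function name is hypothetical, and the three sample rows are copied from the listing.

```python
def smbd_memory(ps_text):
    """Return (rss_kib, pid, user, vsz_kib) for each smbd row, largest RSS first."""
    rows = []
    for line in ps_text.splitlines():
        fields = line.split(None, 10)  # COMMAND (field 10) may contain spaces
        if len(fields) < 11 or "smbd" not in fields[10]:
            continue
        try:
            pid, vsz, rss = int(fields[1]), int(fields[4]), int(fields[5])
        except ValueError:
            continue  # header line or unexpected layout
        rows.append((rss, pid, fields[0], vsz))
    rows.sort(reverse=True)
    return rows


# Raw string so the backslash in DOMAIN\user is kept literally.
sample = r"""
DOMAIN\user 1850 1.0 19.4 5387576 2429756 - S 16Nov20 221:17.37 /usr/local/sbin/smbd --daemon
root 1603 0.0 0.9 177980 107984 - Ss 16Nov20 0:29.03 /usr/local/sbin/smbd --daemon
DOMAIN\user 1898 0.0 18.3 5064212 2291088 - S 16Nov20 138:25.44 /usr/local/sbin/smbd --daemon
"""
rows = smbd_memory(sample)
for rss, pid, user, vsz in rows:
    print(f"{pid:>6} {user:<12} RSS {rss / 1024:.0f} MiB, VSZ {vsz / 1024:.0f} MiB")
print(f"total RSS: {sum(r[0] for r in rows) / 1024:.0f} MiB")
```

Running it periodically and comparing the per-PID numbers makes it easier to tell whether the growth is confined to the sessions serving particular shares.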
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
Can you PM me a debug please? (System->Advanced->Save Debug)
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,545
If this process is still alive, send me its talloc memory usage as well:
1898 root 2 22 0 4951M 2243M 0B kqread 1 138:34 1.71% smbd
smbcontrol 1898 pool-usage
 

fenixaz

Cadet
Joined
Jun 17, 2019
Messages
2
After many errors like the following in the log file, Samba restarted the processes, partially freeing memory.

2020-11-26T11:50:50.146845+10:00 nas01 smbd 1850 - - [2020/11/26 11:50:50.146812, 1] ../../source3/lib/messages_dgm.c:704(messaging_dgm_out_sent_fragment)
2020-11-26T11:50:50.146875+10:00 nas01 smbd 1850 - - messaging_dgm_out_sent_fragment: messaging_out_queue_recv returned Operation timed out

At that time, smbcontrol pool-usage was empty for the remaining processes that were consuming a large amount of memory.
Now the pattern is starting to repeat. I've sent the pool-usage output by PM.
 