Scale ACL + Samba is just flat out broken?

guemi

Dabbler
Joined
Apr 16, 2020
Messages
48
Hello,


I recently reinstalled my home NAS from CORE to SCALE as we're in the process of switching to SCALE at work for our storage systems since Linux has better compatibility with our hardware.

I run a very simple setup at home. Two samba shares, one needs to be fairly locked down because I use a sync app on my phone to instantly upload any pictures I take via samba to my NAS as a backup.

It's worked for years and it's been smooth. Not so much with SCALE.


I installed Scale, imported the Pools, the datasets for previous shares are there - so far so good.


Created a new local user on the NAS simply called "cloudsvc".
Enabled it for Samba Authentication.
Created an SMB share, left S-1-1-0 at full control since I'll be doing permissions on the dataset level - just as I did in CORE.
The folder shared is /mnt/STRIPED01/NETWORKSHARES/Cloud and the NETWORKSHARES dataset ACL looks like the following:

1657451254685.png


Go to Windows, empty Credential Manager, and delete all previous network shares with the net use command so everything is as good as new:

1657451445874.png


1657451453646.png


Try to go to \\10.10.10.100 - gets prompted for username / password - enter cloudsvc and password:

1657451507480.png

Network share is visible, and the samba log shows the auth and confirms it's OK and that the local INTERSTELLAR (NAS HOSTNAME) user was used:


{"timestamp": "2022-07-10T13:11:40.246925+0200", "type": "Authentication", "Authentication": {"version": {"major": 1, "minor": 2}, "eventId": 4624, "logonId": "0", "logonType": 3, "status": "NT_STATUS_OK", "localAddress": "ipv4:10.10.10.100:445", "remoteAddress": "ipv4:10.10.10.101:64919", "serviceDescription": "SMB2", "authDescription": null, "clientDomain": ".", "clientAccount": "cloudsvc", "workstation": "TITAN", "becameAccount": "cloudsvc", "becameDomain": "INTERSTELLAR", "becameSid": "S-1-5-21-3666039284-1560195053-518253317-20066", "mappedAccount": "cloudsvc", "mappedDomain": ".", "netlogonComputer": null, "netlogonTrustAccount": null, "netlogonNegotiateFlags": "0x00000000", "netlogonSecureChannelType": 0, "netlogonTrustAccountSid": null, "passwordType": "NTLMv2", "duration": 4763}}

Try to enter the network share aaaaaand boom:
1657451578069.png




But if we added @Everyone with full permissions on the dataset?

1657451709530.png



Then it works great. And of course it does, it's never checking any permissions.

getfacl confirms the user has access, and the directory is even OWNED by the user I am trying to connect with:
root@INTERSTELLAR[~]# getfacl /mnt/STRIPED01/NETWORKSHARES
getfacl: Removing leading '/' from absolute path names
# file: mnt/STRIPED01/NETWORKSHARES
# owner: cloudsvc
# group: root
user::rwx
group::rwx
other::rwx

root@INTERSTELLAR[~]# ls -lh /mnt/STRIPED01/NETWORKSHARES
total 8.5K
drwxrwxrwx 12 cloudsvc root 17 May 7 12:05 CLOUD


This is the exact same process I've done on multiple CORE installs and it just works - are SCALE ACLs just not ready to be used with Samba properly?



P.S

When going through SMB and hitting "View Filesystem ACL" in the UI, I get to /mnt/STRIPED01/NETWORKSHARES/CLOUD which is the actual folder that's being shared - but I made sure to mirror the settings of the NETWORKSHARES ACL on to that one, so there isn't a conflict there either.

I also tried directly sharing the path /mnt/STRIPED01/NETWORKSHARES with SMB just in case - but no dice.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
getfacl shows POSIX1E ACLs; nfs4xdr_getfacl shows NFSv4-style ACLs. Does your share have auxiliary parameters in its configuration? Some users have had FreeBSD-specific parameters hard-coded that end up preventing share access on Linux.
 

guemi

Dabbler
Joined
Apr 16, 2020
Messages
48
I've never entered any in FreeBSD or in Linux so shouldn't be any, nope.

I had different usernames in FreeBSD but that's about it.

I'll try stripping all ACLs and redoing them - maybe something carried over from the FreeBSD install on the pools and datasets?
 

guemi

Dabbler
Joined
Apr 16, 2020
Messages
48
No dice in redoing them, so I figured OK, let's just make a new dataset and try - and when trying to set an ACL there it also won't let me proceed because:

[EINVAL] filesystem_acl.dacl: Presence of [USER_OBJ] entry is required.
[EINVAL] filesystem_acl.dacl: Presence of [GROUP_OBJ] entry is required.
[EINVAL] filesystem_acl.dacl: Presence of [OTHER] entry is required.


All 3 of which are clearly there.

1657458170924.png



I keep googling and find forum threads where anodos is talking about a UI rewrite from around spring/summer 2021 for 21.08. I'm running 22.02.2.1 - has this UI rewrite not happened yet?
 

guemi

Dabbler
Joined
Apr 16, 2020
Messages
48
Also tried using POSIX RESTRICTED preset ACL, and just add the user to it, and then share that dataset:

But still no permission. There seem to be massive issues with permissions in SCALE for sure.

That isn't the end of the world for my private use, and the benefits of SCALE's other additions over CORE certainly outweigh this little problem, since I can just temporarily throw a VPN on my phone instead of accessing Samba over the internet like previously. But my hope was to migrate our 4 storage machines at work to SCALE, and this just is nowhere near business ready; adding the mess that is Active Directory to this would probably cause even more trouble.


Which also isn't exactly a deal breaker, but we've had some trouble with AQC107 10 gbit/s nics on CORE and I do like trying out and using the latest and greatest and I really like the idea behind SCALE.

1657458687111.png
 

guemi

Dabbler
Joined
Apr 16, 2020
Messages
48
Furthermore, I tried to restrict access at the SMB share level.

I set filesystem permissions to 777 for everyone, then removed S-1-1-0 from the Share Permissions and tried to add INTERSTELLAR\cloudsvc with full control.


Which authenticates correctly according to samba:
{"timestamp": "2022-07-10T15:39:10.537669+0200", "type": "Authentication", "Authentication": {"version": {"major": 1, "minor": 2}, "eventId": 4624, "logonId": "0", "logonType": 3, "status": "NT_STATUS_OK", "localAddress": "ipv4:10.10.10.100:445", "remoteAddress": "ipv4:10.10.10.101:56339", "serviceDescription": "SMB2", "authDescription": null, "clientDomain": "TITAN", "clientAccount": "cloudsvc", "workstation": "TITAN", "becameAccount": "cloudsvc", "becameDomain": "INTERSTELLAR", "becameSid": "S-1-5-21-3666039284-1560195053-518253317-20066", "mappedAccount": "cloudsvc", "mappedDomain": "TITAN", "netlogonComputer": null, "netlogonTrustAccount": null, "netlogonNegotiateFlags": "0x00000000", "netlogonSecureChannelType": 0, "netlogonTrustAccountSid": null, "passwordType": "NTLMv2", "duration": 2634}}



But still does not grant access to the share.

Tried entering the wrong password just to make sure it actually authenticated the local TrueNAS user, and yup - the log then tells me wrong password:

{"timestamp": "2022-07-10T15:39:14.280395+0200", "type": "Authentication", "Authentication": {"version": {"major": 1, "minor": 2}, "eventId": 4625, "logonId": "0", "logonType": 3, "status": "NT_STATUS_WRONG_PASSWORD", "localAddress": "ipv4:10.10.10.100:445", "remoteAddress": "ipv4:10.10.10.101:56339", "serviceDescription": "SMB2", "authDescription": null, "clientDomain": "TITAN", "clientAccount": "cloudsvc", "workstation": "TITAN", "becameAccount": null, "becameDomain": null, "becameSid": null, "mappedAccount": "cloudsvc", "mappedDomain": "TITAN", "netlogonComputer": null, "netlogonTrustAccount": null, "netlogonNegotiateFlags": "0x00000000", "netlogonSecureChannelType": 0, "netlogonTrustAccountSid": null, "passwordType": "NTLMv2", "duration": 2398}}
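For anyone comparing log lines like the ones above, here is a minimal sketch (not part of the original post) of pulling out the fields that matter. It assumes each log line is a single JSON object shaped like the entries quoted here; the function name summarize_auth and the trimmed sample lines are illustrative only. The point it makes is that NT_STATUS_OK only proves authentication succeeded; it says nothing about whether the share or filesystem ACL then authorizes the user.

```python
import json

def summarize_auth(line):
    """Extract the troubleshooting-relevant fields from one JSON auth log line."""
    auth = json.loads(line)["Authentication"]
    return {
        "status": auth["status"],
        "client": auth["clientDomain"] + "\\" + auth["clientAccount"],
        "became": auth["becameAccount"],
        "sid": auth["becameSid"],
        # Authentication result only; share/filesystem ACLs are checked later.
        "authenticated": auth["status"] == "NT_STATUS_OK",
    }

# Trimmed copies of the two log lines above (only the fields used here).
ok_line = (
    '{"type": "Authentication", "Authentication": {"status": "NT_STATUS_OK", '
    '"clientDomain": "TITAN", "clientAccount": "cloudsvc", '
    '"becameAccount": "cloudsvc", '
    '"becameSid": "S-1-5-21-3666039284-1560195053-518253317-20066"}}'
)
bad_line = (
    '{"type": "Authentication", "Authentication": {"status": "NT_STATUS_WRONG_PASSWORD", '
    '"clientDomain": "TITAN", "clientAccount": "cloudsvc", '
    '"becameAccount": null, "becameSid": null}}'
)

for line in (ok_line, bad_line):
    print(summarize_auth(line))
```

The becameSid value is what the server mapped the login to, so it is the identity a share ACL entry would have to match.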


If I throw S-1-1-0 ALLOWED back into the SHARE ACL without touching the dataset permissions, as they're already set to allow everyone, I can then access the share with the cloudsvc account.


Very very strange.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Were you on an old CORE version?

There has been a move away from allowing root as an SMB user. Can you remove the root user and group and see if the problems persist?

 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
POSIX1E ACL is incorrect above. It grants de-facto 770 permissions with root:root on all paths below the root. No one other than root will be able to access files. POSIX1E acltype only exists in SCALE so this dataset didn't come from core. If you want to use POSIX1E acltype, then you should probably add a default explicit group entry for the default SMB group builtin_users to the path and apply recursively.

If you are using NFSv4 acltype (the default type in FreeBSD), then you can set an inheriting GROUP entry for builtin_users to do the same. One difference between Linux and FreeBSD is whether the gid for a new file or dir is inherited from the parent dir or set based on the gid of the creating process. The former is FreeBSD behavior and the latter is Linux behavior.

In the next 22.02 release I have changed ZFS so that with NFSv4 ACLs we keep the FreeBSD behavior, since this ensures permissions behave consistently across both platforms when NFSv4 ACLs are selected. This should ease migrations for users who rely on OS implementation details for permissions (via GROUP@ entries) rather than using the flexibility of ZFS ACLs to set explicit group permissions.
 

emsicz

Explorer
Joined
Aug 12, 2021
Messages
78
I'm trying to figure this out myself right now. I don't understand how this was even released, I can't get anything to work. As long as I keep using the traditional ACL, it works as expected. As soon as I try to define users and groups in non-traditional ACL, I can never get past the error dialog that keeps telling me some groups need to be added or whatever. The tooltips are useless:

2022-07-11_201323.png
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
I work on backend details, not the webui. I'll file a bug ticket for the webui team and see if they can improve this.

Big picture with POSIX (not NFSv4 ACLs), you have two separate lists:
1. ACCESS - applies to current directory / file
2. DEFAULT - applies to newly created files / directories in the current directory.

And there are six different qualifiers that each entry can have:
1. USER_OBJ (the owner of the file)
2. USER (a separate user)
3. GROUP_OBJ (the group of file)
4. GROUP (a separate group)
5. OTHER (anyone who isn't owner, member of group, or in another ACL entry)
6. MASK (sets the maximum permissions for additional group specified in the ACL)

From the standpoint of what you see in the GUI, the ACCESS acl gives you the permissions for the current path (dataset mountpoint). The DEFAULT ACL tells you what any existing file or newly created file will get. Basically you need to have entries for both. There are also mandatory entries in a POSIX1E ACL:

1. USER_OBJ
2. GROUP_OBJ
3. OTHER

These must exist in the ACL and there may only be one ACCESS and one DEFAULT entry for each of them. The backend also requires that a MASK be defined explicitly (because we don't want implicit permissions behavior).

Yes, this is quite complex. I think the NFSv4 ACL option is significantly simpler in that there's one list and you choose explicitly whether you want particular entries to inherit.

You can change acltype in the storage plugin.

That said, this is the way that POSIX1E ACLs work in general on Linux, and since we receive data rsynced from Linux servers (and potentially ZFS replication from other servers using ZoL), we have to support them in the webui.
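The rules above can be sketched as a small check. This is only an illustration under stated assumptions, not the TrueNAS middleware's actual validation code: the entry layout (dicts with "tag", "default", "perms") and the function name check_posix1e_acl are made up for the example, and it follows the description in this post that both the ACCESS and DEFAULT lists need the mandatory entries plus an explicit MASK.

```python
from collections import Counter

# Mandatory tags per the explanation above, plus the explicitly required MASK.
REQUIRED_TAGS = ("USER_OBJ", "GROUP_OBJ", "OTHER", "MASK")

def check_posix1e_acl(entries):
    """Return a list of problems found in a list of ACL entry dicts."""
    problems = []
    for is_default, name in ((False, "ACCESS"), (True, "DEFAULT")):
        counts = Counter(e["tag"] for e in entries if e["default"] == is_default)
        for tag in REQUIRED_TAGS:
            if counts[tag] == 0:
                problems.append(name + ": missing " + tag)
            elif counts[tag] > 1:
                problems.append(name + ": more than one " + tag)
    return problems

# Hypothetical ACL: complete ACCESS list, DEFAULT list without a MASK.
acl = [
    {"tag": "USER_OBJ", "default": False, "perms": "rwx"},
    {"tag": "GROUP_OBJ", "default": False, "perms": "rwx"},
    {"tag": "USER", "default": False, "perms": "rwx", "who": "cloudsvc"},
    {"tag": "MASK", "default": False, "perms": "rwx"},
    {"tag": "OTHER", "default": False, "perms": "---"},
    {"tag": "USER_OBJ", "default": True, "perms": "rwx"},
    {"tag": "GROUP_OBJ", "default": True, "perms": "rwx"},
    {"tag": "OTHER", "default": True, "perms": "---"},
]

print(check_posix1e_acl(acl))
```

Named USER and GROUP entries may appear any number of times, which is why only the four tags in REQUIRED_TAGS are counted.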
 

emsicz

Explorer
Joined
Aug 12, 2021
Messages
78
Thanks for the explanation, it does give some pointers, but it doesn't really solve anything. It's not that I may have all these entries, it's that I apparently must have these entries in the UI. I can't just contemplate to have an option to include like 9 different ACEs before pressing submit, it's that the UI will give me non human friendly errors if I don't include all of those. That being said, I tried and failed with that approach too.

Also, I still don't understand what the MASK is for. The explanation of "MASK (sets the maximum permissions for additional group specified in the ACL)" tells me nothing. Maybe I'm just dumb, or I'm googling for wrong keywords, but I don't get what "maximum permissions" you're on about for what "additional group". What if I have like 10 groups in there. And how do I even populate all these fields in the UI?
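For readers with the same question, here is a minimal sketch of what the MASK does in standard POSIX.1e semantics: it caps the effective permissions of every named USER entry, every named GROUP entry, and the owning group (GROUP_OBJ), however many of them there are. The owner (USER_OBJ) and OTHER are not affected. The function names and the example entries are hypothetical.

```python
# Entry types whose permissions are capped by the mask in POSIX.1e ACLs.
MASKED_TAGS = {"USER", "GROUP", "GROUP_OBJ"}

def apply_mask(perms, mask):
    """Keep a permission bit only if both the entry and the mask grant it."""
    return "".join(p if p == m and p != "-" else "-" for p, m in zip(perms, mask))

def effective_perms(tag, perms, mask):
    """Return the permissions that actually apply for one ACL entry."""
    if tag in MASKED_TAGS:
        return apply_mask(perms, mask)
    return perms  # USER_OBJ and OTHER ignore the mask

mask = "r-x"
for tag, who, perms in [
    ("USER_OBJ", "owner", "rwx"),
    ("USER", "cloudsvc", "rwx"),
    ("GROUP", "builtin_users", "rw-"),
    ("OTHER", "everyone else", "---"),
]:
    print(tag, who, perms, "->", effective_perms(tag, perms, mask))
```

With ten groups in the ACL, the same single mask would cap all ten, which is why a mask of rwx is the usual choice when the entries themselves are meant to be authoritative.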

Consider the following model scenario:
  1. TrueNAS root user sets up datasets. Those are then owned by root.
  2. TrueNAS root tries to set up SMB shares in the UI.
This super-basic use case comes with pretty non-basic requirements:
  1. Create individual TrueNAS users for people and apps that are expected to manipulate the data. Already this is getting convoluted because of the weird user/group concept where user "john" also creates a group named "john" which is weird and strange concept to grasp. Some groups and users are pre-created and sometimes are visible and sometimes are not visible. And then there's a group called "wheel" which is somehow normal but I don't understand why is it called wheel. I don't see any wheels in there, nothing is spinning in the UI, none of my new users are in the group wheel (because they are in their own groups called the same as the user itself). None of this makes any sense, so I'm trying to ignore it but it leaves me feeling insecure about my data.
  2. Then I need to set up SMB shares where I need to deal with the fact that SMB has its own ACLs that are separate from filesystem ACLs, and the UI doesn't make this clear at all. The SMB share has a checkbox "Enable ACL", so which ACLs is it referring to and what happens if it's disabled? How does it even make sense to have an SMB share with ACL disabled? Although this separation of ACLs is implemented, it is repeatedly recommended around here not to use it, and that ACEs should be entered and maintained for the filesystem ACL only.
To say this simply, it's 23:50 at night where I'm at and I'm stuck trying to create a simple share on TrueNAS for last 6 hours and I don't care about any of this crap. If I wanted this level of granularity, I would've used a pure distro and spent my youth configuring dozens of different plugins and configs and gloat about it on distro forums. The purpose of using TrueNAS is to be shielded from this methinks.
 

guemi

Dabbler
Joined
Apr 16, 2020
Messages
48
Hey guys and thanks for all the replies.

I've been hard at work with trial and error, migrating from CORE to SCALE both at home and on our 2 work storage systems.

Were you on an old CORE version?

There has been a move away from allowing root as an SMB user. Can you remove the root user and group and see if the problems persist?


I was on CORE 12.1-RELEASE.

I set cloudsvc as owner and group via chown/chgrp - but the issue still persisted.
I'm not sure why they were owned by root. I don't think that worked in FreeBSD - the reason I had different accounts was because of what you linked - I previously used the root account and it stopped working so I created a local user in FreeBSD.

Thanks for the elaboration.


FYI I ended up creating 2 new datasets, ticked them for share type "SMB" and reconfigured it.

I still cannot get the SHARE ACL to work with "cloudsvc" but opening it up to S-1-1-0 and limiting the filesystem to "cloudsvc" _DOES_ work.

I then used rsync via the terminal to just sync the data over from the old datasets into the new ones, and deleted the old datasets. I'm sure there's a way to do that in the GUI, but I'm used to Linux so it was faster for me this way.


I ended up doing the same at work with our 2 storage systems there.

I'm not gonna make a fool of myself pretending to understand a lot about ZFS and filesystems, but having done this migration 3 times now in ~72 hours I've kinda concluded that migrating from CORE to SCALE by importing the ZFS pools and keeping the old datasets / zvols is a _bad way_ of doing it - I'm guessing, as you touched on, that Unix and Linux handle ACLs differently.


So to anyone visiting this thread from the future: You're probably best off just importing the pool, creating new datasets/zvols and moving the data from your old ones to the new ones and then deleting the old ones.

If you must import pools at all, keeping your files on another system while completely wiping from CORE to SCALE is probably the absolute best way.



anodos / morganL - I still have old FreeBSD datasets if you want me to try something for you if it helps improving the product in any way.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
One of the key differences between FreeBSD and Linux regarding file creation is that FreeBSD gets a new file's group from the parent directory. Linux by default gets it from the creating process's GID. I've added a change in ZFS for the NFSv4 acltype (used in FreeBSD) to always use the FreeBSD behavior, but this change isn't in 22.02.2 (it will be in 22.02.3). This should help people who migrate from FreeBSD with permissions that work specifically because of the FreeBSD behavior. So migration issues should be eased somewhat in 22.02.3.
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
@anodos Shouldn't a setgid bit on the directory force the BSD behaviour on SysV style OSes?
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,554
Yes, but setting after-the-fact is problematic from a distro maintainer standpoint. Also certain ZFS dataset configurations prevent chmod(2) if an ACL is present. Since NFSv4 ACL support (acltype=nfsv4) is something we added ourselves on Linux, we have some flexibility in how we handle permissions to make it more BSD-like.
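The group-ownership rule discussed in this exchange can be sketched as a simplified model. It is only an illustration of the semantics described in the thread (BSD inherits the parent directory's group; Linux uses the creating process's gid unless the directory has the setgid bit), not how ZFS or either kernel implements it. The function name new_file_gid and the gid values are hypothetical.

```python
def new_file_gid(parent_gid, process_gid, parent_setgid=False, bsd_semantics=False):
    """Return the gid a newly created file would get in this simplified model."""
    # BSD behavior: always inherit the group of the parent directory.
    # Linux default: use the creating process's gid, unless the parent
    # directory has the setgid bit set, in which case inherit like BSD.
    if bsd_semantics or parent_setgid:
        return parent_gid
    return process_gid

SHARE_GID = 1000  # hypothetical group that owns the shared directory
USER_GID = 3000   # hypothetical primary group of the connecting user

print(new_file_gid(SHARE_GID, USER_GID))                      # Linux default
print(new_file_gid(SHARE_GID, USER_GID, parent_setgid=True))  # Linux, setgid directory
print(new_file_gid(SHARE_GID, USER_GID, bsd_semantics=True))  # FreeBSD-style
```

Permissions that depend on the owning group (GROUP@ or GROUP_OBJ entries) only keep working after a migration if new files end up with the same group as before, which is the case the first line of output breaks.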

 

emsicz

Explorer
Joined
Aug 12, 2021
Messages
78
Just came here to say that switching from POSIX to NFSv4 ACL in the dataset advanced settings was pretty painless and works as expected. Courtesy of @NugentS, who advised me to try that out. Once I set those up, I could follow the other advice I received here, which is to leave SMB share permissions open and only use dataset permissions to regulate access. With NFSv4 it's painless.
 