Introducing cloneacl.py


anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
I got annoyed that I didn't have a tool that could clone an ACL on a directory, push it down to all subdirectories / files, and apply the proper inheritance flags... so I wrote a script to do it for me. I only did minimal testing (I wrote it this morning before going to work)... so definitely try it out on your production data. :D

https://github.com/anodos325/cloneacl.py/blob/master/cloneacl.py

Feel free to criticize it. It's basically the first real python script I've ever written.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Work must've sucked, judging by the amount of drinking that would have been required to arrive at a title like "So I made a wrote my a python script".
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I dunno, the drunk title was more readable than Jar Jar-speak. And certainly less infuriating. Unless Jar Jar is Snoke, in which case I will stand up in the middle of the movie and yell "I'm done" and throw my popcorn and drink at the screen.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
Now that I think about it, this is a significant portion of what's needed for a barebones permissions manager (set them for a share and propagate them down - that's all that's needed to significantly improve the usability of FreeNAS with SMB and non-Windows clients).
 

Stux

MVP
Joined
Jun 2, 2016
Messages
4,419
Should add script to your sig ;)
 

gpsguy

Active Member
Joined
Jan 22, 2012
Messages
4,472
Maybe we need a mod from iXsystems to rewrite the subject line. ;-)
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
Now that I think about it, this is a significant portion of what's needed for a barebones permissions manager (set them for a share and propagate them down - that's all that's needed to significantly improve the usability of FreeNAS with SMB and non-Windows clients).

That is more or less what I had in mind. Basically, use setfacl to fine-tune permissions at the root of a share, then slap it with cloneacl.py.

The script lacks the middleware goo to fit into the FreeNAS UI. FreeNAS already has a couple of scripts and a C program to handle permissions, so it makes more sense to add a 'clone' function to them and do a pull request. Big picture, I wanted to write something super-simple (no need to compile, no need to pip-install crap) that someone can just download and immediately use to fix ACLs.

The features this script has that I couldn't find in other tools (the itch I was scratching):
  • It removes the existing ACL (setfacl -b) before merging the new one. This has the same effect as removing explicit permissions and leaving only inherited ones in the Windows world. It's required because the "Inherited" flag (I) wasn't introduced until around FreeBSD 11, and I haven't seen any ACL-setting tools that actually respect it.
  • It throws the entire ACL at the file / folder in a single setfacl command rather than iterating through the ACEs and executing multiple setfacl commands. Consider a case where the ACL consists of:
Code:
owner@:full_set:fd:allow
group@:full_set:fd:allow
g:smb_users:modify_set:fd:allow
g:smb_guests:read_set:fd:allow

If you do something like
Code:
for ace in ACL:
	subprocess.call([SETFACL_PATH, '-m', ace, 'Dir'])

then you have to execute four separate setfacl commands, when you could do it all in one go: setfacl -m owner@:full_set:fd:allow,group@:full_set:fd:allow,g:smb_users:modify_set:fd:allow,g:smb_guests:read_set:fd:allow dir. This basically pawns the iteration off to setfacl, which is good because it should be much faster than a "for loop" in python.
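
From Python, that single command looks roughly like this (a sketch rather than the actual script; SETFACL_PATH and the target directory are placeholders):
Code:
import subprocess

SETFACL_PATH = '/bin/setfacl'  # placeholder path to setfacl

acl = [
    'owner@:full_set:fd:allow',
    'group@:full_set:fd:allow',
    'g:smb_users:modify_set:fd:allow',
    'g:smb_guests:read_set:fd:allow',
]

# one exec: setfacl does the per-ACE iteration internally
subprocess.check_call([SETFACL_PATH, '-m', ','.join(acl), '/mnt/Tank/Dir'])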

I'll do some performance testing today. Ultimately, that's the point where these sorts of scripts fall down. I prefer to avoid ctypes / interacting directly with posix1e libraries. It feels too much like re-implementing setfacl.

It'd be nice to eliminate some of the loops in the code. For instance:
Code:
for (dirpath, dirnames, filenames) in os.walk(path):
	for name in dirnames:
		# join against dirpath, not the walk root, so nested dirs resolve correctly
		folder_path = os.path.join(dirpath, name)
		# both setfacl processes are fired off without waiting for them to finish
		nuke_acl = subprocess.Popen([SETFACL_PATH, '-b', folder_path])
		clone_acl = subprocess.Popen([SETFACL_PATH, '-m', d_acl, folder_path])
	for name in filenames:
		file_path = os.path.join(dirpath, name)
		nuke_acl = subprocess.Popen([SETFACL_PATH, '-b', file_path])
		clone_acl = subprocess.Popen([SETFACL_PATH, '-m', f_acl, file_path])

In the CLI you can run setfacl against multiple files at once: setfacl -m owner@:full_set:fd:allow file1 file2 file3. I had problems getting subprocess.Popen to take a single large string of paths ('/mnt/Tank/file1' '/mnt/Tank/file2' '/mnt/Tank/file3') as an argument. Perhaps this is a good thing. I'm not sure what would happen if you tried to pass a 500,000 character string as an argument, but my hunch is vomiting code.
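
For reference, subprocess wants each path as its own list element rather than one big quoted string; roughly like this (a sketch, reusing the placeholder paths above):
Code:
import subprocess

SETFACL_PATH = '/bin/setfacl'  # placeholder
f_acl = 'owner@:full_set:fd:allow,group@:full_set:fd:allow'  # example ACL string
paths = ['/mnt/Tank/file1', '/mnt/Tank/file2', '/mnt/Tank/file3']

# each path is a separate argv element; the kernel's argument-size
# limit still applies, so very long lists have to be batched
subprocess.check_call([SETFACL_PATH, '-m', f_acl] + paths)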

What I'll probably end up doing today or tomorrow is turning the second layer of "for" loops into distinct functions and then using python's multiprocessing interface to spawn off separate processes for both of them. This should improve performance.
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I'm not sure what would happen if you tried to pass a 500,000 character string as an argument, but my hunch is vomiting code.
It should really be no different from passing a pointer to a really large object, but the string manipulation functions may not be prepared to handle such long strings.

What I'll probably end up doing today or tomorrow is turning the second layer of "for" loops into distinct functions and then using python's multiprocessing interface to spawn off separate processes for both of them. This should improve performance.
You'd be surprised. Last time I checked, CPython has a Global Interpreter Lock shared by all threads in an interpreter process, which makes multithreaded Python very inefficient for CPU-bound work.

In any case, I would definitely not try to spawn one process per folder. Not even one thread per folder. That should not cause a panic, but the kernel will not like what you're doing and things will fail (assuming large shares with many folders).
A better approach would be to spawn a small number of worker threads at the top level and have each of them iterate down.
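
Something like this is what I mean (a sketch, not anodos' actual script; assumes Python 3 for concurrent.futures and that SETFACL_PATH points at setfacl):
Code:
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from functools import partial

SETFACL_PATH = '/bin/setfacl'  # assumed path

def apply_acl(acl_string, top):
    # each worker walks one top-level subdirectory on its own
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            target = os.path.join(dirpath, name)
            subprocess.call([SETFACL_PATH, '-b', target])
            subprocess.call([SETFACL_PATH, '-m', acl_string, target])

def clone_down(root, acl_string, workers=4):
    subdirs = [os.path.join(root, d) for d in os.listdir(root)
               if os.path.isdir(os.path.join(root, d))]
    # a small, fixed pool of workers instead of one per folder
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(partial(apply_acl, acl_string), subdirs))

Threads are enough here because the heavy lifting happens in the setfacl child processes, not in the interpreter itself.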
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
It should really be no different from passing a pointer to a really large object, but the string manipulation functions may not be prepared to handle such long strings.


You'd be surprised. Last time I checked, CPython has a Global Interpreter Lock shared by all threads in an interpreter process, which makes multithreaded Python very inefficient for CPU-bound work.

In any case, I would definitely not try to spawn one process per folder. Not even one thread per folder. That should not cause a panic, but the kernel will not like what you're doing and things will fail (assuming large shares with many folders).
A better approach would be to spawn a small number of worker threads at the top level and have each of them iterate down.

Yeah. The script is rather slow right now. I don't think that multiprocessing is going to help it (at least to the extent that it needs help). I created a few test directories with a bunch of files (150000 files in total). The script took about 2.5 minutes to complete. winacl -a reset -r took 3 seconds.

I think I might need to revisit the idea of throwing the whole list of files / dirs at a single setfacl command.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
And it looks like I fixed that particular issue (I did hit the string limit). :D

And I'm done. Same data set now finishes in 10 seconds. @Ericloewe, I guess multithreading isn't magic pixie dust you can sprinkle on code to make it go faster. :/ The correct answer was to break the directory and file listing into manageable chunks and then run very few setfacl commands. It's still slower than winacl (10 sec vs 4 sec), but not by orders of magnitude. It's easier to use and gives more useful output, and so I'll consider this a win.
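
For the curious, the chunking boils down to something like this (simplified sketch, not the exact code in the repo; the chunk size is arbitrary as long as each command stays under the kernel's argument-size limit):
Code:
import subprocess

SETFACL_PATH = '/bin/setfacl'  # assumed path
CHUNK_SIZE = 1000              # big enough to amortize the exec overhead

def apply_in_chunks(acl_string, paths):
    # a handful of setfacl invocations instead of one per file
    for i in range(0, len(paths), CHUNK_SIZE):
        batch = paths[i:i + CHUNK_SIZE]
        subprocess.check_call([SETFACL_PATH, '-b'] + batch)
        subprocess.check_call([SETFACL_PATH, '-m', acl_string] + batch)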
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
And I just realized that I was testing on FreeBSD & python2.7, not FreeNAS & python3. :oops:

So, on FreeNAS you'll have to run python2.7 cloneacl.py TEST. The getfacl output comes back differently under python3 and will require more parsing / fixing.
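
I haven't pinned down exactly what changes, but my guess (and it's only a guess) is the usual bytes-vs-str difference in python3's subprocess output, so the getfacl text would need decoding before parsing, roughly:
Code:
import subprocess

GETFACL_PATH = '/bin/getfacl'  # assumed path

def get_acl_lines(path):
    out = subprocess.check_output([GETFACL_PATH, path])
    # python3 hands back bytes here; python2.7 already gives str
    if isinstance(out, bytes):
        out = out.decode('utf-8', 'replace')
    return [line.strip() for line in out.splitlines() if line.strip()]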
 

Ericloewe

Server Wrangler
Moderator
Joined
Feb 15, 2014
Messages
20,194
I guess multithreading isn't magic pixie dust you can sprinkle on code to make it go faster. :/
It's more like advanced magic that needs to be learned and studied but still doesn't make things magically faster, just somewhat faster.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
It should really be no different from passing a pointer to a really large object, but the string manipulation functions may not be prepared to handle such long strings.

In some tests with real data, it can easily handle 1,000 files simultaneously, but throws errors on 10,000 files. I was thinking of just having two worker processes: one to run setfacl on the list of files and the other on the list of directories. I'll read up more on multiprocessing / multithreading.
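
Roughly what I have in mind for the two workers (just a sketch; d_acl, f_acl, and the path lists stand in for whatever the script actually builds):
Code:
import subprocess
from multiprocessing import Process

SETFACL_PATH = '/bin/setfacl'  # assumed path

def apply_in_batches(acl_string, paths, batch=1000):
    # keep each setfacl invocation comfortably under the argv limit
    for i in range(0, len(paths), batch):
        subprocess.check_call([SETFACL_PATH, '-m', acl_string] + paths[i:i + batch])

def run_two_workers(d_acl, f_acl, dirs, files):
    workers = [Process(target=apply_in_batches, args=(d_acl, dirs)),
               Process(target=apply_in_batches, args=(f_acl, files))]
    for p in workers:
        p.start()
    for p in workers:
        p.join()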

Ultimately, this is a requirement because I want to make a "--with-calisthenics" flag for the script that will show a stick figure doing exercises on the screen while the script is running.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
Some stats comparing winacl to cloneacl
Code:
root@TESTERERER:/mnt # time python cloneacl-args.py DonkeyFarts/
5.208u 36.240s 0:10.84 382.2%   18+168k 4+0io 0pf+0w
root@TESTERERER:/mnt # time winacl -a reset -r -p DonkeyFarts/
2.510u 8.995s 0:11.58 99.3%	 15+168k 4+0io 0pf+0w
root@TESTERERER:/mnt # find DonkeyFarts/ | wc -l
  200044

Not surprisingly, the C program is much more efficient. Time to start looking at its source to see if I can just add a 'clone' flag.
 