samba: prohibit filenames containing combining diaeresis / mangle to utf-8

Status
Not open for further replies.

Metis IT

Dabbler
Joined
Oct 10, 2016
Messages
11
I stumbled across the problem that some files stored on my NAS were visible but not readable/writable from Macs, while Windows clients worked fine. Further investigation showed that the affected filenames contain decomposed characters, i.e. a base letter followed by a combining diacritic, instead of the corresponding precomposed character (e.g. the file name contains "o" followed by the combining diaeresis "\u0308" instead of the single character "ö", U+00F6). Both forms are valid UTF-8; they differ only in Unicode normalization (NFD vs. NFC).
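
For illustration, a minimal Python sketch of the two forms involved. The variable names are made up for this example; it only shows that the decomposed and precomposed spellings of "ö" are different strings with different UTF-8 bytes, and that NFC normalization turns one into the other.

```python
import unicodedata

decomposed = "o\u0308"   # 'o' followed by COMBINING DIAERESIS (decomposed form)
precomposed = "\u00f6"   # single code point LATIN SMALL LETTER O WITH DIAERESIS

# Both render as "ö", but they are different strings and different bytes.
print(decomposed == precomposed)      # False
print(decomposed.encode("utf-8"))     # b'o\xcc\x88'
print(precomposed.encode("utf-8"))    # b'\xc3\xb6'

# Normalizing to NFC composes the pair into the single code point.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
```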

I know I can normalize the existing file names to NFC using
Code:
# convmv only prints what it would do by default; append --notest to actually rename
convmv -f utf-8 -t utf-8 --nfc -r .

But this is only a temporary solution, as it fixes just the files already on the NAS.
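
To see which existing files are affected before renaming anything, something like the following Python sketch could be used. It is only an assumption-laden helper, not part of convmv or Samba: `find_non_nfc` is a hypothetical name, and it simply reports every file or directory name that changes when normalized to NFC.

```python
import os
import unicodedata

def find_non_nfc(root):
    """Yield paths under root whose last component is not NFC-normalized."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # A name that changes under NFC contains decomposed sequences.
            if unicodedata.normalize("NFC", name) != name:
                yield os.path.join(dirpath, name)

# Example use (read-only, nothing is renamed):
#   for path in find_non_nfc("."):
#       print(path)
```

Running it periodically on the share would at least show when new decomposed names have arrived.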

For example, even a Windows client may save a file whose name contains a combining diaeresis if it is simply saving an email attachment whose filename already contained those characters.

What I am looking for is a way to keep those characters from reaching the server file system.
Code:
mangled names = illegal

does not help: Windows handles those characters fine, so they are not filtered and still reach the server.

I also came across vfs_catia, which can translate illegal characters in filenames, e.g.
Code:
vfs objects = catia
catia:mappings = 0x22:0xa8

...but I would need to replace two bytes and not just one?

Or is there a way to make macOS handle those filenames? (Tested on 10.12 and 10.14, not working.)

Help would be appreciated ;)
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
9,553
The catia mappings that VFS fruit will apply in "native" mode are as follows:
Code:
0x01:0xf001,0x02:0xf002,0x03:0xf003,0x04:0xf004,
0x05:0xf005,0x06:0xf006,0x07:0xf007,0x08:0xf008,
0x09:0xf009,0x0a:0xf00a,0x0b:0xf00b,0x0c:0xf00c,
0x0d:0xf00d,0x0e:0xf00e,0x0f:0xf00f,0x10:0xf010,
0x11:0xf011,0x12:0xf012,0x13:0xf013,0x14:0xf014,
0x15:0xf015,0x16:0xf016,0x17:0xf017,0x18:0xf018,
0x19:0xf019,0x1a:0xf01a,0x1b:0xf01b,0x1c:0xf01c,
0x1d:0xf01d,0x1e:0xf01e,0x1f:0xf01f,
0x22:0xf020,0x2a:0xf021,0x3a:0xf022,0x3c:0xf023,
0x3e:0xf024,0x3f:0xf025,0x5c:0xf026,0x7c:0xf027,
0x0d:0xf00d


In UTF-8 that character (U+0308) is encoded as the two bytes 0xCC 0x88; in UTF-16 it is the single code unit 0x0308. You can add to the list of catia mappings and see if it gets you what you want (you will need to test writing _new_ files with that character to the server). But perhaps I'm not understanding. Can you give a concrete example of a filename?
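
The byte values can be checked with a quick Python sketch: the full UTF-8 encoding of U+0308 is the two bytes 0xCC 0x88, and the UTF-16 code unit is 0x0308. The second half assumes, as far as I understand catia, that a mapping swaps one character for another; the target value 0xf028 is an arbitrary private-use code point picked for this example, not something fruit or catia defines.

```python
combining_diaeresis = "\u0308"

# UTF-8 needs two bytes for U+0308; 0x88 is only the trailing byte.
print(combining_diaeresis.encode("utf-8").hex())      # cc88
# UTF-16 (big-endian, no BOM) stores it as the single code unit 0x0308.
print(combining_diaeresis.encode("utf-16-be").hex())  # 0308

# Hypothetical one-to-one substitution, in the style of a mapping 0x0308:0xf028.
mapping = str.maketrans({"\u0308": "\uf028"})
name = "o\u0308.txt"
mapped = name.translate(mapping)

# The combining mark is swapped out, but "o" and the mark are not merged
# into a single "ö": the name still has the same number of characters.
print(len(name), len(mapped))  # 6 6
```

So a one-character mapping like this would hide the combining mark from the file system, but it would not produce the precomposed form that convmv --nfc yields.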
 