File with non-printable characters in name cannot be removed

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
Hello everyone,
as the title says, I am looking for a solution to a curious problem with two files that are stored on a ZFS pool on a TrueNAS system.
See the shell output below for the nature of the problem:

Code:
# ls .broken/
256760??m????-452d-a9dc-12c020b9db60.tiff       37763348-03??-413e-92cd-0c245d45dc88.tiff
# ls -i .broken/
ls: 256760��m���-452d-a9dc-12c020b9db60.tiff: No such file or directory
ls: 37763348-03��-413e-92cd-0c245d45dc88.tiff: No such file or directory
# ls -l .broken  2>&1 | base64  
bHM6IDI1Njc2MODObQH+//8tNDUyZC1hOWRjLTEyYzAyMGI5ZGI2MC50aWZmOiBObyBzdWNo
IGZpbGUgb3IgZGlyZWN0b3J5CmxzOiAzNzc2MzM0OC0wM///LTQxM2UtOTJjZC0wYzI0NWQ0
NWRjODgudGlmZjogTm8gc3VjaCBmaWxlIG9yIGRpcmVjdG9yeQp0b3RhbCAwCg==
# rm .broken/*
rm: .broken/256760��m���-452d-a9dc-12c020b9db60.tiff: No such file or directory
rm: .broken/37763348-03��-413e-92cd-0c245d45dc88.tiff: No such file or directory
# cd .broken
# rm 256760$'\340'$'\316'm$'\001'$'\376'$'\377'$'\377'-452d-a9dc-12c020b9db60.tiff
rm: 256760��m���-452d-a9dc-12c020b9db60.tiff: No such file or directory
# rm 37763348-03$'\377'$'\377'-413e-92cd-0c245d45dc88.tiff
rm: 37763348-03��-413e-92cd-0c245d45dc88.tiff: No such file or directory
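
To see exactly which bytes are in the two names, the base64 output above can be decoded offline. This is a minimal sketch, assuming Python 3 is at hand; the variable names are only illustrative.

```python
import base64

# The three base64 lines from `ls -l .broken 2>&1 | base64`, joined together.
encoded = (
    "bHM6IDI1Njc2MODObQH+//8tNDUyZC1hOWRjLTEyYzAyMGI5ZGI2MC50aWZmOiBObyBzdWNo"
    "IGZpbGUgb3IgZGlyZWN0b3J5CmxzOiAzNzc2MzM0OC0wM///LTQxM2UtOTJjZC0wYzI0NWQ0"
    "NWRjODgudGlmZjogTm8gc3VjaCBmaWxlIG9yIGRpcmVjdG9yeQp0b3RhbCAwCg=="
)

raw = base64.b64decode(encoded)

# Print each line as a bytes literal so that non-printable bytes
# show up as \x escapes instead of being swallowed by the terminal.
for line in raw.split(b"\n"):
    if line:
        print(line)
```

The \x escapes printed this way correspond to the octal escapes in the rm commands above (\340 is 0xe0, \316 is 0xce, \376 is 0xfe, \377 is 0xff).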


These files were written to the dataset via a Samba share. The client was a Windows 10 Enterprise machine. The ACL preset was the permissive Windows share ACL.
The error persists across reboots.
The only thing that is still possible with these files is moving the parent folder.

I hope someone can give me a hint about what could be done with these files. The file content is replaceable but the pool is not, because restoring the whole pool from backups would take a long time.

Cheers,

System:

Version:
TrueNAS-13.0-U1.1 (previously 12.3-U8 and 13.0-U1)

Memory
127.8GiB total available (ECC)

CPU:
Intel(R) Xeon(R) CPU E5-2667 v4 @ 3.20GHz

ZFS version:
zfs-2.1.4-1
zfs-kmod-v2022061500-zfs_dac769ff9

Samba version:
4.15.7_8

Pool:
Code:
  pool: datapool0
 state: ONLINE
  scan: scrub repaired 0B in 1 days 05:57:07 with 0 errors on Mon Aug 15 05:57:08 2022
config:

        NAME                                            STATE     READ WRITE CKSUM
        datapool0                                       ONLINE       0     0     0
          raidz2-0                                      ONLINE       0     0     0
            gptid/230a5eb1-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/244b5a8f-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/21863476-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/236a6163-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/24e327ab-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/23df1de3-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/28498424-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/bc3eb25c-dc0a-11ec-a9ba-0cc47a3918d8  ONLINE       0     0     0
          raidz2-1                                      ONLINE       0     0     0
            gptid/1aaf5289-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/1aaeb3ce-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/1a91efb3-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/1ad31e14-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/1ad3cae7-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/17474b8a-ddd4-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
            gptid/9dde217e-dffa-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
            gptid/27b5301c-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
          raidz2-2                                      ONLINE       0     0     0
            gptid/24bcfb36-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/243afad4-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/2654160b-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/26588dc8-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/28f08350-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/01b196f4-b01d-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/4d8b386c-b400-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/2ab54859-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
          raidz2-4                                      ONLINE       0     0     0
            gptid/dc1ccf1c-fd28-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
            gptid/dce6ca13-fd28-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
            gptid/dcbbe69d-fd28-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
            gptid/dd672783-fd28-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
            gptid/dcdd3d62-fd28-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
            gptid/ddac0ecc-fd28-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
            gptid/dd8f26f1-fd28-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
            gptid/dd864bc9-fd28-11ec-bfcc-0cc47a3918d8  ONLINE       0     0     0
        logs
          mirror-3                                      ONLINE       0     0     0
            gptid/2a26f950-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
            gptid/2a2dc266-ade4-11ec-ad4b-0cc47a3918d8  ONLINE       0     0     0
        cache
          gptid/2a04da48-ade4-11ec-ad4b-0cc47a3918d8    ONLINE       0     0     0
        spares
          gptid/3b1df2a4-fd40-11ec-bfcc-0cc47a3918d8    AVAIL   
 
Joined
Jun 2, 2019
Messages
591
Welcome!

Have you tried?
Code:
rm -rf .broken
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
Code:
# rm -rf .broken
rm: .broken: Directory not empty

I assume rm -rf also does a rm * first.
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
I wonder if there's a leading character for each file that's not displayed due to an immediately following backspace? Can mc delete these files?
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
1661251889321.png


1661251945820.png


That is the error I get when I try to delete the files directly. Deleting the folder .broken does not work either, but fails silently.
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
Also note that ls can output the names of the files, as long as the output does not go directly to a terminal. That is why I included the base64 output in the original post.
It does not look like the name begins with a non-printable character; rather, it contains a character that has no representation at all (Unicode low surrogate).
I assume these names could still be handled if they were treated as null-terminated byte arrays, but that is not very useful when working with rm and the like.

I tried listing the files in Python as well, just to rule out any problems with bash, zsh and sh:

Code:
>>> os.listdir('.broken')
['256760\udce0\udccem\x01\udcfe\udcff\udcff-452d-a9dc-12c020b9db60.tiff', '37763348-03\udcff\udcff-413e-92cd-0c245d45dc88.tiff']
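
The \udcXX escapes in that listing are how Python 3 represents file-name bytes that are not valid UTF-8 (the surrogateescape error handler from PEP 383): each undecodable byte 0xXX is shown as the lone surrogate U+DCXX, so they are not necessarily surrogates stored on disk. A minimal sketch, assuming a UTF-8 filesystem encoding, that maps the str back to the underlying bytes:

```python
# File name exactly as os.listdir('.broken') printed it above.
name = '256760\udce0\udccem\x01\udcfe\udcff\udcff-452d-a9dc-12c020b9db60.tiff'

# surrogateescape turns each lone surrogate U+DCxx back into the byte 0xxx.
raw = name.encode('utf-8', 'surrogateescape')
print(raw)

# The raw bytes are not valid UTF-8, which is why Python had to escape them.
try:
    raw.decode('utf-8')
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
print(is_valid_utf8)
```

The recovered bytes are the same ones bash showed as $'\340'$'\316' ... $'\377' in the rm commands.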
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
Did you try the logical next step? os.unlink()?
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
unlink() requires a path argument, so the same problem pops up. Maybe this is a problem with the way file names are read on ZFS?
Code:
>>> os.listdir('.')
['256760\udce0\udccem\x01\udcfe\udcff\udcff-452d-a9dc-12c020b9db60.tiff', '37763348-03\udcff\udcff-413e-92cd-0c245d45dc88.tiff']
>>> os.listdir('.')[0]
'256760\udce0\udccem\x01\udcfe\udcff\udcff-452d-a9dc-12c020b9db60.tiff'
>>> os.unlink(os.listdir('.')[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: '256760\udce0\udccem\x01\udcfe\udcff\udcff-452d-a9dc-12c020b9db60.tiff'
 

Samuel Tai

Never underestimate your own stupidity
Moderator
Joined
Apr 24, 2020
Messages
5,399
Is this still shared out via SMB? Can the SMB client delete these files?
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
Over SMB the files are not visible.
Could this mean that the "files" are actually xattrs used by Samba?
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
Could this possibly be caused by a faulty SAS controller card? We had (possibly unrelated) hardware issues and an unexpected reboot today.

Model of the SAS controller:
LSI SAS9300-8e
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
The C API for simple file operations is not that difficult. I would try to write a small, really old-school program, i.e. completely locale agnostic, just \0 terminated strings, and see if that works.
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
Thanks for the suggestion, but does compiling an executable without localization really make a difference compared to just running rm with LC_ALL=C?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
You need to pass whatever convoluted set of bytes the filename is from the shell to rm - I suspect there is some loss/change here even with LC_ALL=C. So if you use readdir to get a null terminated unsigned char *, then pass this unaltered to unlink ...

Remember: anything but \0 and / is allowed in a file name. Including newline, bell, tab, whatever ...
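
The same idea can be tried without a compiler. This is a minimal sketch, assuming Python 3: when os.listdir() is given a bytes path it returns the names as raw bytes, and os.unlink() hands those bytes to the system call unchanged. The helper name unlink_all_raw is hypothetical, not part of any library.

```python
import os

def unlink_all_raw(directory):
    """Try to unlink every entry in `directory`, passing names as raw bytes.

    Returns {raw_name: None} for removed entries and {raw_name: errno}
    for entries that could not be removed.
    """
    directory = os.fsencode(directory)        # bytes path in -> bytes names out
    results = {}
    for raw_name in os.listdir(directory):
        path = os.path.join(directory, raw_name)
        try:
            os.unlink(path)                   # bytes are passed on without decoding
            results[raw_name] = None
        except OSError as exc:
            results[raw_name] = exc.errno     # for example errno.ENOENT
    return results
```

Calling unlink_all_raw(".broken") from the parent directory would attempt both files. If this also reports ENOENT, string handling in the shell or in Python is ruled out and the lookup is failing further down, in the filesystem itself.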
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
That is probably material for another thread: how to stop processes from using "inconvenient" file names :)
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
I tried @Patrick M. Hausen's idea of writing my own program in C to rename / remove the files, to circumvent any problems with localisation. It did not work and I am not sure why. Any help in figuring out what went wrong would be appreciated.
Here is the code:

Code:
#include <stdio.h>

int main(void) {
        /* Raw bytes of the file name; the trailing 0x00 terminates
           the string so it can be passed to rename(). */
        const unsigned char oldname[] =
            {
            0x32, 0x35, 0x36, 0x37, 0x36, 0x30, 0xe0, 0xce,
            0x6d, 0x01, 0xfe, 0xff, 0xff, 0x2d, 0x34, 0x35,
            0x32, 0x64, 0x2d, 0x61, 0x39, 0x64, 0x63, 0x2d,
            0x31, 0x32, 0x63, 0x30, 0x32, 0x30, 0x62, 0x39,
            0x64, 0x62, 0x36, 0x30, 0x2e, 0x74, 0x69, 0x66,
            0x66, 0x00
            };
        const char newname[] = "256760..m....-452d-a9dc-12c020b9db60.tiff";

        if (rename((const char *)oldname, newname) == 0) {
                printf("File renamed\n");
        } else {
                /* perror() appends the reason reported in errno */
                perror("Error: unable to rename the file");
                return 1;
        }
        return 0;
}


I also tried it both with and without a \0 terminator, and it fails in both cases. Should I try dtrace?
Also interesting, in my opinion, is that the problem seems to come from { 0xfe, 0xff }, which looks like a UTF-16 byte order mark, meaning this might be caused by Samba after all (SMB2+ uses UTF-16 to transfer file names).
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
How did you find out the oldname? My suggestion was to use readdir() to get a binary representation of whatever the system thinks the file is named and then use that for unlink() or unlinkat().

I am not aware there is an unlink* call that works with an inode number or a file pointer, unfortunately.
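
A partial workaround is to select the directory entry by its inode number and then unlink it by whatever name readdir reported, relative to an open directory descriptor. This is only a sketch, assuming Python 3 on a Unix-like system; unlink_by_inode is a hypothetical helper, and since the removal still goes through the name it may fail in exactly the same way.

```python
import os

def unlink_by_inode(directory, inode):
    """Unlink the entry in `directory` whose dirent inode number equals `inode`.

    Returns the removed name as bytes, or None if no entry matched.
    """
    directory = os.fsencode(directory)
    dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        with os.scandir(directory) as entries:
            for entry in entries:
                # On Unix, DirEntry.inode() is taken from the directory entry
                # itself, so no stat() call on the name is needed here.
                if entry.inode() == inode:
                    os.unlink(entry.name, dir_fd=dir_fd)
                    return entry.name
    finally:
        os.close(dir_fd)
    return None
```

The inode numbers to pass in can be collected with the same os.scandir() loop, which does not depend on stat() succeeding for the broken names.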
 

jseifert

Dabbler
Joined
Aug 22, 2022
Messages
16
Here is the code I used to get the name of the file:

Code:
#include <stdio.h>
#include <dirent.h>

int main(void)
{
        DIR *dir;
        struct dirent *dp;

        dir = opendir(".");
        if (dir == NULL) {
                perror("opendir");
                return 1;
        }
        /* Print every entry name exactly as readdir() returns it. */
        while ((dp = readdir(dir)) != NULL) {
                printf("debug: %s\n", dp->d_name);
        }
        closedir(dir);
        return 0;
}


I wrote the whole output to a file with > and then opened the file with Okteta to generate the C array from the name.
The result is identical to just using ls > filenames.txt, so ls is probably also using readdir().
I could try to do away with the printf() in case that also alters something, but it's not like this is a proper string anyway (not using a string class here).
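
The detour through a redirected file and a hex editor can be avoided by formatting the raw names directly. This is a minimal sketch, assuming Python 3; both function names are hypothetical.

```python
import os

def c_array_initializer(raw_name):
    """Format a raw file name as a C initializer list, with a trailing 0x00."""
    return ", ".join(f"0x{byte:02x}" for byte in raw_name + b"\x00")

def dump_names(directory):
    """Return {raw_name: initializer} for every entry, using bytes names throughout."""
    return {name: c_array_initializer(name)
            for name in os.listdir(os.fsencode(directory))}
```

For example, c_array_initializer(b"03\xff\xff") gives "0x30, 0x33, 0xff, 0xff, 0x00", which can be pasted between the braces of an unsigned char array.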
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,776
dp->d_name should be a null terminated string that can be passed to unlink() unaltered.
 