Faster Searching from Mac Spotlight or Finder of Freenas Files

kapitainsky

Dabbler
Joined
Sep 30, 2022
Messages
46
The same results as @phradr

TrueNAS 13.1-RELEASE-p1 n245406-814eb095751
Samba 4.15.9

ES 7.17.3

I have tried a different crawler, but with the same results:

FSCrawler 7.2.10

I can see from TrueNAS that ES is working. I test it with:

curl -H 'Content-Type: application/json' -X GET "http://jailIP:9200/test/_search?pretty"

but /var/log/samba4/log.smbd shows:

Code:
[2022/10/02 17:56:29.298324,  0] ../../libcli/http/http.c:199(http_parse_response_line)
  http_parse_response_line: Error parsing header
[2022/10/02 17:56:29.299070,  0] ../../lib/util/fault.c:172(smb_panic_log)
  ===============================================================
[2022/10/02 17:56:29.299136,  0] ../../lib/util/fault.c:176(smb_panic_log)
  INTERNAL ERROR: Signal 11: Segmentation fault in pid 8094 (4.15.9)
[2022/10/02 17:56:29.299162,  0] ../../lib/util/fault.c:181(smb_panic_log)
  If you are running a recent Samba version, and if you think this problem is not yet fixed in the latest versions, please consider reporting this bug, see https://wiki.samba.org/index.php/Bug_Reporting
[2022/10/02 17:56:29.299279,  0] ../../lib/util/fault.c:182(smb_panic_log)

As a next step I am going to try the same on a FreeBSD instance.
 

phradr

Dabbler
Joined
Sep 27, 2022
Messages
49
I saw the Samba requests in the ES log and executed them via Postman. Works like a charm. But the HTTP parser within Samba cannot handle the response. Maybe that is because of the warning ES still generates in my setup, as I haven't secured the installation (tbh, I doubt that this is the source of all evil).

I'll give this another try with an nginx reverse proxy in between. Maybe I can then manipulate the response header so that Samba is able to parse it.
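
One way to see exactly what Samba's HTTP parser has to cope with is to print the status line and headers that ES sends back, including any warning header. Below is a minimal Python sketch, assuming ES answers plain HTTP on port 9200 and that the index is called `test` as in the curl test earlier in the thread. The function names are made up for illustration, and note that `http.client` does its own parsing, so this shows the parsed fields rather than the raw bytes.

```python
import http.client


def format_status_line(version, status, reason):
    """Rebuild a status line from the fields http.client exposes.

    http.client reports the protocol version as 10 (HTTP/1.0) or 11 (HTTP/1.1).
    """
    return f"HTTP/{version // 10}.{version % 10} {status} {reason}"


def fetch_response_head(host, port=9200, path="/test/_search?pretty", timeout=5.0):
    """Send a GET like the curl test and return (status_line, headers).

    host/port/path are assumptions about the setup; adjust them as needed.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers={"Content-Type": "application/json"})
        resp = conn.getresponse()
        resp.read()  # drain the body; only the head is of interest here
        status_line = format_status_line(resp.version, resp.status, resp.reason)
        return status_line, resp.getheaders()  # list of (name, value) tuples
    finally:
        conn.close()
```

Calling `fetch_response_head("jailIP")`, with the placeholder replaced by the address of the ES jail, returns the status line plus the header list, which can then be compared with and without the reverse proxy in front.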

Another option would be to debug the Samba server and have a look at what makes the parsing fail. But my C skills are somewhat rusty, since I haven't used C for about 12 years…

Maybe someone else in this thread can do that?
Has anyone seen an API description of that Samba functionality (e.g., what kind of response Samba expects)?
 

kapitainsky

I have also tried Elasticsearch 6.8.16 on TrueNAS, with the same Samba problem: http_parse_response_line: Error parsing header. So it does indeed point to Samba.

Then I tried the whole setup on a FreeBSD 13.1 machine: no Samba error. But it uses Samba 4.13, so it can't be directly compared.

But overall I think we are on the right track to make it work.
 

seanm

Guru
Joined
Jun 11, 2018
Messages
570
All this discussion of Spotlight support... is that just to have a Spotlight index with all its rich metadata (and thus allowing advanced searches)?

Should a basic Finder search by filename work with TrueNAS Core 13.0-U2 without any of this Elastic search stuff? (I have users saying such searches fail, and the failure seems to have started around when we updated from 12 to 13.)
 

phradr

Hi seanm,

yes and no.

Spotlight support accelerates searching for files (filenames AND content) to a degree that nothing else matches.
But, and this is why we are trying to enable all this, Spotlight needs to build its own index of the files. That is not practical over Samba (for obvious reasons: you would have to download everything from the NAS just to build an index).

So Spotlight support was added to Samba itself. It uses an external service to serve the index (Elasticsearch, a very powerful search engine). Together with a service that reads the data to be indexed (FSCrawler, for example), optionally paired with OCR, everything would be in place to use it.

Sadly the API between Samba and ES is broken. We can't easily debug the errors, nor can we patch it or build a workaround, as the Samba side of the API is not documented (at least I have found nothing yet).

The basic filename search would work with samba, but on bigger storages that would be a pain in the a..
 

kapitainsky

Elasticsearch is a database supported by Samba, so a client can search over its Samba connection to the share.

What is in Elasticsearch is a different story. You use separate programs to feed it (e.g. FSCrawler), and the index can be as simple as file names only or as complex as OCR of images.

The problem is that the Samba implementation in TrueNAS does not seem to work with Elasticsearch.
 

seanm

Thanks both for your replies. I understand what you are describing, and indeed it would be great to have that working!

phradr said: "The basic filename search would work with samba, but on bigger storages that would be a pain in the a.."

Basic filename search was working for us. Now it is not working from macOS, but is working from Windows. Not sure when/why this happened, but we updated from TrueNAS Core 12.0-U8.1 to 13.0-U2 a week ago, so that's my first suspect.

Does basic filename search from macOS Finder work for others?
 

phradr

I guess it will work if you disable the Spotlight support for the share, as Samba will then stop using the ES API.

Windows users will not notice any change, as Windows doesn't make use of this search support anyway.
 

kapitainsky

seanm said: "Does basic filename search from macOS Finder work for others?"

Yes, though not out of the box. Personally, I am only interested in full-scale Spotlight content search.
 

seanm

phradr said: "I guess it will work if you disable the Spotlight support for the share"

How does one do that? I don't see any UI when I edit the share. Also, enabling Spotlight support is not something I ever did. Does the 12 to 13 upgrade enable this on all existing shares?
 

phradr

If you don't know how Spotlight support in Samba is enabled, then you probably have not enabled it.

[global]
spotlight backend = elasticsearch
elasticsearch:address = localhost
elasticsearch:port = 9200

[share]
spotlight = yes

Here you'll find what that means (LINK). :wink:
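
To check quickly whether any share actually has Spotlight switched on, the effective configuration (for example, text saved from `testparm -s`) can be scanned for these parameters. This is a rough Python sketch, not a full smb.conf parser: the function names, the `[other]` share and its path are hypothetical, and it assumes a `spotlight` value in `[global]` acts as the default for shares and that yes/true/1/on count as enabled.

```python
TRUTHY = {"yes", "true", "1", "on"}  # assumption: values treated as "enabled"

# Sample input mirroring the settings above; [other] is a made-up share.
SAMPLE = """
[global]
    spotlight backend = elasticsearch
    elasticsearch:address = localhost
    elasticsearch:port = 9200

[share]
    spotlight = yes

[other]
    path = /mnt/tank/other
"""


def parse_smb_conf(text):
    """Parse smb.conf-style text into {section: {parameter: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":  # skip blank lines and comments
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].strip().lower()
            sections.setdefault(current, {})
            continue
        if current is None or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Normalise case and inner whitespace of the parameter name.
        sections[current][" ".join(key.lower().split())] = value.strip()
    return sections


def spotlight_shares(sections):
    """Return the names of shares where spotlight resolves to an enabled value."""
    default = sections.get("global", {}).get("spotlight", "no")
    return [
        name
        for name, params in sections.items()
        if name != "global" and params.get("spotlight", default).lower() in TRUTHY
    ]
```

If `spotlight_shares(parse_smb_conf(...))` comes back empty for your configuration, Spotlight is not enabled on any share according to this simplified reading.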
 

kapitainsky

To verify that the whole Elasticsearch/Samba solution works, I tried a quick proof of concept, not on TrueNAS but on a fresh Debian 11 with the default Samba 4.13.13.

I installed ES 7 and ran it with all defaults, no extra config:


and FSCrawler with a basic configuration pointing to my shared folder:


for fscrawler I had to add:

apt install openjdk-17-jdk

and export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64

It works beautifully: I can use Spotlight on my Mac to find documents by content (MS Office, LibreOffice, PDF, etc.).

So the good news is that it is doable in general.

Short term, I am thinking of spinning up a Debian 11 VM with Samba on a non-default port. This way I could have a "searchable" share on my Mac.

As for TrueNAS, I am not sure how we can proceed. I am not a C programmer and am unable to debug the TrueNAS Samba code.

@phradr may I suggest, as you have more detailed traces of the problem, that you start by creating a TrueNAS ticket? Everything points to a TrueNAS Samba issue. They have enabled Spotlight:

Code:
smbd -b | grep SPOT
   HAVE_SPOTLIGHT_BACKEND_ES
   WITH_SPOTLIGHT


but for some reason it struggles to parse ES responses.
 

phradr

Of course you may suggest, but the issue is not really TrueNAS-related.
It's the Samba source. As you used an 'outdated' version in your POC (at least an older one than the version integrated in the TrueNAS Community Edition I use), I have no clue how, or to whom, to report this so that TrueNAS would change or roll back to the older Samba version in its repository.

The better way would be to open a ticket with Samba. Because of the limited time I have to play around with this, I haven't managed to reproduce the error with full debugging enabled. Maybe I can do this at the weekend. Then I will open a ticket here (LINK) and point to it from within this thread. That would let you jump onto it and push it a bit more into the devs' focus.
 

kapitainsky

kk - this weekend I will try (on Debian) to compile the same Samba version as in TrueNAS to see whether this is a Samba version problem, and then decide how to proceed.
 

phradr

That's a great idea! I'm looking forward to the results
 

seanm

I have created a fresh VM with a vanilla install of TrueNAS Core 12.0-U8.1, created a single user, single pool, single dataset, and single share on it. Very vanilla setup. I connected to it with Finder, added a few files, and tested searching by filename. It works. I updated this test VM to Core 13.0-U2, and now search by filename does not work. I rolled the VM back to 12.0-U8.1 and filename search works again.

I think I've simply found a bug. May or may not be related to Spotlight support at all. Filed: https://ixsystems.atlassian.net/browse/NAS-118498
 

kapitainsky

This might be a different problem. Samba search supports the following backends: noindex, elasticsearch and tracker.

I see that you are trying to use noindex, and indeed it should work, but this whole thread is about "faster searching" :) using the elasticsearch backend.
 

kapitainsky

During my experiments with the ES backend I was checking whether other Samba defaults are the same as in the generic version.

File name search (with a Core 13 VM) worked for me when I added:

rpc_daemon:mdssd = fork
rpc_server:mdssvc = external

or

rpc_daemon:mdssd = fork
rpc_server:mdssvc = embedded

to the SMB service's extra parameters.

I have not looked into this much, as it is not what I want :) but it may solve your issue.

The second variant is probably the one you should use, but I have not investigated it further.
 

kapitainsky

Actually, I have now realised that your file name search probably stopped working when moving from Core 12 to 13 because the Spotlight backend parts interfere with the noindex search.

So you can simply disable them:

rpc_daemon:mdssd = disabled
rpc_server:mdssvc = disabled

and it should work like before in your case.
 

kapitainsky

phradr said: "That's a great idea! I'm looking forward to the results"

OK, I could not wait for the weekend and did all the tests today.

Debian with Samba 4.15.9 compiled from source (the same version TrueNAS is using) works perfectly, so it is not a problem with this particular Samba version.

Maybe the problem is with FreeBSD? I realised that their pkg Samba (the latest available is 4.13.17) does not have Spotlight enabled, so I compiled it from ports with Spotlight ON. It all works perfectly: I can search files by their content. It is not a FreeBSD issue.

These results point more and more to a problem with TrueNAS Samba, as they use their own fork.

The error sequence we see in the TrueNAS smb.log when the ES response is received starts with:

Code:
[2022/09/29 15:56:50.745574,  0, pid=93402, effective(1000, 1000), real(0, 0)] ../../libcli/http/http.c:199(http_parse_response_line)
  http_parse_response_line: Error parsing header


And this is where it gets really interesting. The FreeBSD Samba port patches exactly this file, http.c, and exactly in the place where this error is thrown:


In vanilla Samba:

Code:
    n = sscanf(line, "%m[^:]: %m[^\r\n]\r\n", &key, &value);
    if (n != 2) {
        DEBUG(0, ("%s: Error parsing header '%s'\n", __func__, line));
        status = HTTP_DATA_CORRUPTED;
        goto error;
    }


but in FreeBSD:

Code:
#ifdef FREEBSD
    int s0, s1, s2, s3; s0 = s1 = s2 = s3 = 0;
    n = sscanf(line, "%n%*[^/]%n/%c.%c %d %n%*[^\r\n]%n\r\n",
           &s0, &s1, &major, &minor, &code, &s2, &s3);

    if(n == 3) {
        protocol = calloc(sizeof(char), s1-s0+1);
        msg = calloc(sizeof(char), s3-s2+1);

        n = sscanf(line, "%[^/]/%c.%c %d %[^\r\n]\r\n",
            protocol, &major, &minor, &code, msg);
    }
#else
    n = sscanf(line, "%m[^/]/%c.%c %d %m[^\r\n]\r\n",
           &protocol, &major, &minor, &code, &msg);
#endif
    if (n != 2) {
        DEBUG(0, ("%s: Error parsing header '%s'\n", __func__, line));
        status = HTTP_DATA_CORRUPTED;
        goto error;
    }
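
To get a feel for what the status-line pattern `"%m[^/]/%c.%c %d %m[^\r\n]\r\n"` accepts, here is a rough Python approximation using a regular expression. It is only a sketch for experimenting with captured status lines, not Samba's code: `parse_status_line` is a made-up name, and details of real `sscanf` behaviour (such as `%m` allocation or integer overflow) are not modelled. A full match assigns five fields: protocol, major, minor, code and message.

```python
import re

# Approximation of the sscanf pattern "%m[^/]/%c.%c %d %m[^\r\n]\r\n".
# A space in a scanf format skips any amount of whitespace (including none),
# and %d skips leading whitespace itself, hence the \s* pieces.
# (?!\d) keeps the regex from splitting the number, which sscanf never does.
# \S at the start of the message reflects that all whitespace was skipped first.
STATUS_LINE = re.compile(
    r"([^/]+)/(.)\.(.)\s*([+-]?\d+)(?!\d)\s*(\S[^\r\n]*)",
    re.DOTALL,
)


def parse_status_line(line):
    """Return (protocol, major, minor, code, msg), or None if a field is missing."""
    m = STATUS_LINE.match(line)
    if m is None:
        return None
    protocol, major, minor, code, msg = m.groups()
    return protocol, major, minor, int(code), msg
```

Feeding it the status line ES really returns (for example, captured with `curl -i`) shows whether all five fields are present. A line with an empty reason phrase, such as `"HTTP/1.1 200 \r\n"`, yields `None` in this sketch because the last field never matches; whether anything like that is what trips Samba on TrueNAS is not established here.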


I could not find the TrueNAS Core Samba source code to see what is there (I only started using TrueNAS and FreeBSD last week and am still learning), but I think we now have enough info to create a meaningful bug report. For this your traces would be extremely helpful. Unfortunately I am not sure how to get them.

Hopefully some TrueNAS dev can look into it and figure out what is wrong with their Samba 4.15.9.

ES search over Samba really makes a difference and works very well. Of course, how to optimise it will be yet another story. I can already see that it needs a good amount of RAM: I tried with 30k files and it used about 5 GB.
 