Faster Searching from Mac Spotlight or Finder of FreeNAS Files

gegtor

Explorer
Joined
Sep 16, 2017
Messages
99
Long story short:
- There are no plans for native Spotlight support (only the Spotlight backend is enabled in the latest versions of TrueNAS Core or Scale)
- This outlines what you should do: https://wiki.samba.org/index.php/Spotlight_with_Elasticsearch_Backend

1) Prepare a Linux VM for fscrawler + ElasticSearch
2) Follow fscrawler docs and point it to your storage (via SSH for example)
3) Point fscrawler to your ElasticSearch install
4) Let it run and verify that it's working via the Elasticsearch web UI
5) Point your Samba Service under TrueNAS to your ElasticSearch instance
6) Verify with mdfind

Voilà!
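
Steps 4 and 6 could be checked roughly like this. It is only a sketch under assumptions: the address `http://127.0.0.1:9200` and the index name `test` are placeholders (fscrawler normally names the index after the job), and `check_spotlight` has to be run on the Mac with the share mounted. `_count` is the standard Elasticsearch count endpoint and `-onlyin` is a regular `mdfind` option.

```shell
#!/bin/sh
# Placeholder values (assumptions): adjust to your own setup.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"   # assumed Elasticsearch address
ES_INDEX="${ES_INDEX:-test}"                # assumed fscrawler job/index name

# Build the URL of the Elasticsearch _count API for the index.
es_count_url() {
    printf '%s/%s/_count\n' "$ES_URL" "$ES_INDEX"
}

# Step 4: ask Elasticsearch how many documents have been indexed so far.
check_index() {
    curl -s -H 'Content-Type: application/json' "$(es_count_url)"
}

# Step 6: run on the Mac against the mounted SMB share.
# $1 = mount point (e.g. /Volumes/share), $2 = search term
check_spotlight() {
    mdfind -onlyin "$1" "$2"
}
```

Calling `check_index` a few times while the crawler runs should show the document count growing; if it stays at zero, the problem is on the crawler side rather than in Samba.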
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,740
May I recommend using a jail instead of a Linux VM? You will get way better performance of the crawler with a local mount.
 

dennech

Cadet
Joined
May 18, 2022
Messages
3
On 13.0 I tried to set up a jail with a working Elasticsearch, following several guides found on the Internet, but my knowledge is evidently not enough. I managed to create a jail with everything I need, but Elasticsearch does not start and throws a bunch of Java errors.

I would be glad if someone has a working step-by-step recipe for creating a jail or Linux VM with a working ELK stack and could show how to connect data to it so that Spotlight search works. Thanks!
 

gegtor

Explorer
Joined
Sep 16, 2017
Messages
99
A FreeBSD jail would be a pain to set up, and most documentation for fscrawler and Elasticsearch is for Linux.

My VM running on the same box indexed over 770 000 files in 4 hours using SSH.
 

volothamp

Explorer
Joined
Jul 28, 2019
Messages
72
What do you mean? Java has been running on FreeBSD for decades and it works perfectly.

FSCrawler works without problems on TrueNAS/FreeBSD

ElasticSearch is "just" a webserver running locally
 

dennech

Cadet
Joined
May 18, 2022
Messages
3
Hi gegtor,

I was wondering if you could do a step by step instruction on how to set this up? I think a lot of people would appreciate it. Thanks!
 

cap

Contributor
Joined
Mar 17, 2016
Messages
122
Yes, a tutorial or something "ready to use" would be great.

Personally, I think I would be more interested in Spotlight on TrueNAS Scale.
I myself still use TrueNAS Core 12, but will probably switch to Scale sooner or later, since Linux is probably superior for home users.
With Linux you get lower power consumption, because you can reach lower C-states, especially if you use PowerTOP. That can save a few watts, and that is important to me.
And for home use, there are probably more ready-made Docker images (Photoprism, Jellyfin, etc.), even though I think jails are a great concept.

But Spotlight would be really great.
 

johnno1320

Cadet
Joined
Aug 30, 2022
Messages
1
Ok, I know this thread is old, but I have attempted to get indexing working on SMB many times now on Core 12, Scale and now Core 13.0-U1.1, with little to show for my time. Conceptually it is simple: create the jail, install Elasticsearch, install FSCrawler, and set some environment variables for the jail, for the SMB share and finally for the SMB service itself.
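
For the SMB side, the settings in question are, as far as I know, the smb.conf parameters `spotlight backend`, `elasticsearch:address` and `elasticsearch:port` for the service, and `spotlight = yes` for the share. The helper functions below are only an illustration that prints them; the function names are made up, the host and port are placeholders, and entering the output as auxiliary parameters of the SMB service and of the share in TrueNAS is an assumption, not something confirmed in this thread.

```shell
#!/bin/sh
# Print the service-wide (global) Samba parameters for an Elasticsearch
# Spotlight backend.
# $1 = Elasticsearch host (placeholder), $2 = port (defaults to 9200)
spotlight_global_params() {
    cat <<EOF
spotlight backend = elasticsearch
elasticsearch:address = $1
elasticsearch:port = ${2:-9200}
EOF
}

# Print the per-share parameter that switches Spotlight on for a share.
spotlight_share_params() {
    echo "spotlight = yes"
}
```

For example, `spotlight_global_params 192.168.1.50` prints three lines for the service, and `spotlight_share_params` prints the single line for the share.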

So, create jail, no problem. Install Elasticsearch in the jail, no problem and it runs. Install FSCrawler: this was confusing, but worked eventually. Quite a bit of additional configuration was required to get it to run without complaints. FSCrawler appears to be a simple script; it isn't a service, so it has to be run manually, but it runs. I even installed Kibana to validate that things were working, and it runs too. It is a nice tool that provides a GUI for ES. Opening /your-network:5601/status#/ shows a simple dashboard with your heap size and other info, plus all plugins (of which FSCrawler is not one), confirming that ES is at least running, however..... no indexing.

I have researched everyone's posts on this subject I could find. Patrick M. Hausen offers a lot of suggestions, and researching them has got me farther than I would have on my own, so thank you Patrick! But my question is has anyone been able to succeed in the goal of indexing for a Mac on SMB, or is it just conceptual to be left to the people at the retail big name NAS company?

dennech, did you get any further in your attempts? You were stuck on getting ES to run in your jail? Were you able to get past this hangup with Java?

volothamp, I have seen your posts on most of the threads related to this topic. Have you had or seen any success you may be able to direct me towards? How are you handling Mac client indexing currently?

I am just looking for any recommendations or directions. I am thinking of scrapping Core again and attempting Scale one last time, as FSCrawler has better documentation for Docker with available Compose samples. It still lacks the debugging information needed to troubleshoot when no index is generated, but I think I am close... I just can't figure out what the next step is from where I am.
 

dennech

Cadet
Joined
May 18, 2022
Messages
3
Unfortunately, I am experiencing the same problems, but I could not even get Elasticsearch to run normally in a jail. To be honest, I gave up trying. It needs a good FreeBSD expert to do it.

If someone could make a paid feature add-on - I think there would be a lot of people willing to buy it.
 

volothamp

Explorer
Joined
Jul 28, 2019
Messages
72
Hi John, I managed to get searching running locally on my NAS, but since SMB support for Spotlight wasn't enabled, I created a simple macOS GUI to inspect the Elasticsearch server. Everything works, but I haven't published this app on GitHub yet as there are some secrets inside.

Now that the flag is enabled I should upgrade my TrueNAS installation, but I'm out of free time :) I can support you in doing it; if you need anything, let me know how.
 

phradr

Dabbler
Joined
Sep 27, 2022
Messages
49
Hey,

I decided to jump in here.

My goal is to combine Spotlight and OCR for at least one folder I save scanned documents in.

This is how far I got until now and what I did:
  1. created a new Jail (I don't want this to be run on another Linux VM)
    1. added my scan folder to mount-list
  2. via CLI I installed ES from Repo within this jail with success
    1. minimum config done - to know the port ES listens on
    2. put it on autostart
  3. installed fscrawler with curl
    1. downloaded it with curl
    2. unzipped it
    3. ran the script so it created a job for me automatically
    4. edited this script to scan my "scans" folder
    5. edited the default target language
  4. installed tesseract (OCR scanner) from repository
    1. my preferred language (german) was already preinstalled, so nothing more to do here
  5. again ran fscrawler and it started scanning my target folder "scans"
I still need to verify this, but at least I got that far within about 30 minutes and a bit of googling.

Let me know if you get stuck somewhere; maybe I can help.
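
Step 3 above could look roughly like the following. This is a sketch, not the exact settings used here: the job name `scans`, the path `/mnt/scans` and the Elasticsearch URL are placeholders, the function name is made up, and the YAML keys are the ones I know from the fscrawler 2.x job settings (`fs.url`, `fs.ocr.language`, `elasticsearch.nodes`). `deu` is the Tesseract language code for German.

```shell
#!/bin/sh
# Write a minimal fscrawler job settings file.
# $1 = job name, $2 = folder to crawl, $3 = Tesseract OCR language code
# $4 = config directory (optional; ~/.fscrawler is assumed as the default)
write_fscrawler_job() {
    job_dir="${4:-$HOME/.fscrawler}/$1"
    mkdir -p "$job_dir"
    cat > "$job_dir/_settings.yaml" <<EOF
name: "$1"
fs:
  url: "$2"
  ocr:
    language: "$3"
elasticsearch:
  nodes:
  - url: "${ES_URL:-http://127.0.0.1:9200}"
EOF
}
```

After `write_fscrawler_job scans /mnt/scans deu`, starting the crawler with the job name (for example `bin/fscrawler scans` from the unzipped directory) should pick the file up, assuming the default config directory is used.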

Best regards,
phradr
 

phradr

Dabbler
Joined
Sep 27, 2022
Messages
49
So far I have only set up Kibana for this.

I was struggling with the SMB server settings needed to make use of the Elasticsearch engine.
I stopped working on that when it did not connect on the first try.
The smbd log says it has authentication issues. So I guess only a few settings are missing, and after that it is ready to use.
 

phradr

Dabbler
Joined
Sep 27, 2022
Messages
49
I need to correct this: the error is within the auth_audit.log (from TrueNAS).

I will have a look at it tonight.


correction was wrong, too...
 

volothamp

Explorer
Joined
Jul 28, 2019
Messages
72
Same here, I've used Kibana and a client I wrote myself to run queries, because I haven't upgraded yet. I don't know if anybody has ever managed to query the Elasticsearch server using Finder/Spotlight.

Please tell us if you do :)
 

phradr

Dabbler
Joined
Sep 27, 2022
Messages
49
I have dug through a lot of logs now and I am absolutely sure that Samba is the reason it is not working.

If I start a Spotlight search, the query is built perfectly fine. It is sent to ES and is answered as well. In the next step, Samba unmarshals the received blob. And here I see differences between my Samba version's log output and the current source code on GitHub, as the line references don't match. Here is the stack trace:

Code:
[2022/09/29 15:56:50.691385, 10, pid=93402, effective(1000, 1000), real(0, 0), class=rpc_srv] ../../source3/rpc_server/mdssvc/mdssvc.c:1054(slrpc_fetch_query_results)
  fetch slq[0x5c0c5167,0x6b0000a0], start: 2022/09/29 15:56:50.293702, last_used: 2022/09/29 15:56:50.691383, expires: 2022/09/29 15:57:20.691383, query: '*=="SEARCHSTRINGFROMSPOTLIGHT*"cdw||kMDItemTextContent=="SEARCHSTRINGFROMSPOTLIGHT*"cdw'
[2022/09/29 15:56:50.691395, 10, pid=93402, effective(1000, 1000), real(0, 0), class=rpc_srv] ../../source3/rpc_server/mdssvc/mdssvc.c:1808(mds_dispatch)
  mds_dispatch: DALLOC_CTX(#1): {
      sl_array_t(#3): {
          uint64_t: 0x0000
          CNIDs: unkn1: 0xadd, unkn2: 0x6b0000a0
              DALLOC_CTX(#0): {
              }
          sl_filemeta_t(#0): {
          }
      }
  }
[2022/09/29 15:56:50.691430,  4, pid=93402, effective(1000, 1000), real(0, 0)] ../../source3/smbd/sec_ctx.c:444(pop_sec_ctx)
  pop_sec_ctx (1000, 1000) - sec_ctx_stack_ndx = 0
[2022/09/29 15:56:50.691437,  1, pid=93402, effective(1000, 1000), real(0, 0), class=rpc_parse] ../../librpc/ndr/ndr.c:484(ndr_print_function_debug)
       mdssvc_cmd: struct mdssvc_cmd
          out: struct mdssvc_cmd
              fragment                 : *
                  fragment                 : 0x00000000 (0)
              response_blob            : *
                  response_blob: struct mdssvc_blob
                      length                   : 0x00000068 (104)
                      size                     : 0x00010000 (65536)
                      spotlight_blob           : *
                          spotlight_blob: ARRAY(104)
                              [0]                      : 0x34 (52)
                              [1]                      : 0x33 (51)
                              [2]                      : 0x32 (50)

... Array ...

                              [103]                    : 0x00 (0)
              unkn9                    : *
                  unkn9                    : 0x00000000 (0)
[2022/09/29 15:56:50.691901, 10, pid=93402, effective(1000, 1000), real(0, 0), class=smb2] ../../source3/smbd/smb2_server.c:2227(smbd_smb2_request_pending_timer)
  smbd_smb2_request_pending_queue: opcode[SMB2_OP_IOCTL] mid 32 going async
[2022/09/29 15:56:50.691909, 10, pid=93402, effective(1000, 1000), real(0, 0), class=smb2_credits] ../../source3/smbd/smb2_server.c:980(smb2_set_operation_credit)
  smb2_set_operation_credit: smb2_set_operation_credit: requested 256, charge 1, granted 256, current possible/max 542/8192, total granted/max/low/range 7906/8192/33/7906
[2022/09/29 15:56:50.691916, 10, pid=93402, effective(1000, 1000), real(0, 0), class=smb2] ../../source3/smbd/smb2_server.c:2338(smbd_smb2_request_pending_timer)
      state->vector[0/5].iov_len = 4
      state->vector[1/5].iov_len = 0
      state->vector[2/5].iov_len = 64
      state->vector[3/5].iov_len = 8
      state->vector[4/5].iov_len = 1
[2022/09/29 15:56:50.691955, 10, pid=93402, effective(1000, 1000), real(0, 0), class=rpc_srv] ../../source3/rpc_server/srv_pipe_hnd.c:419(np_read_recv)
  Received 160 bytes. There is no more data outstanding
[2022/09/29 15:56:50.691963, 10, pid=93402, effective(1000, 1000), real(0, 0), class=smb2] ../../source3/smbd/smb2_ioctl_named_pipe.c:172(smbd_smb2_ioctl_pipe_read_done)
  smbd_smb2_ioctl_pipe_read_done: np_read_recv nread = 160 is_data_outstanding = 0, status = NT_STATUS_OK
[2022/09/29 15:56:50.691971, 10, pid=93402, effective(1000, 1000), real(0, 0), class=smb2] ../../source3/smbd/smb2_ioctl.c:317(smbd_smb2_request_ioctl_done)
  smbd_smb2_request_ioctl_done: smbd_smb2_ioctl_recv returned 160 status NT_STATUS_OK
[2022/09/29 15:56:50.691980, 10, pid=93402, effective(1000, 1000), real(0, 0), class=smb2] ../../source3/smbd/smb2_server.c:3847(smbd_smb2_request_done_ex)
  smbd_smb2_request_done_ex: mid [32] idx[1] status[NT_STATUS_OK] body[48] dyn[yes:160] at ../../source3/smbd/smb2_ioctl.c:410
[2022/09/29 15:56:50.691988, 10, pid=93402, effective(1000, 1000), real(0, 0), class=smb2_credits] ../../source3/smbd/smb2_server.c:980(smb2_set_operation_credit)
  smb2_set_operation_credit: smb2_set_operation_credit: requested 256, charge 1, granted 0, current possible/max 286/8192, total granted/max/low/range 7906/8192/33/7906

... from the array down to here, the information seems unimportant as far as I understand ...

[2022/09/29 15:56:50.745574,  0, pid=93402, effective(1000, 1000), real(0, 0)] ../../libcli/http/http.c:199(http_parse_response_line)
  http_parse_response_line: Error parsing header

<----- ouch, that's bad

[2022/09/29 15:56:50.745727,  0, pid=93402, effective(1000, 1000), real(0, 0)] ../../lib/util/fault.c:172(smb_panic_log)
  ===============================================================
[2022/09/29 15:56:50.745739,  0, pid=93402, effective(1000, 1000), real(0, 0)] ../../lib/util/fault.c:176(smb_panic_log)
  INTERNAL ERROR: Signal 11: Segmentation fault in pid 93402 (4.15.9)
[2022/09/29 15:56:50.745754,  0, pid=93402, effective(1000, 1000), real(0, 0)] ../../lib/util/fault.c:181(smb_panic_log)
  If you are running a recent Samba version, and if you think this problem is not yet fixed in the latest versions, please consider reporting this bug, see https://wiki.samba.org/index.php/Bug_Reporting
[2022/09/29 15:56:50.745771,  0, pid=93402, effective(1000, 1000), real(0, 0)] ../../lib/util/fault.c:182(smb_panic_log)
  ===============================================================
[2022/09/29 15:56:50.745793,  0, pid=93402, effective(1000, 1000), real(0, 0)] ../../lib/util/fault.c:184(smb_panic_log)
  PANIC (pid 93402): Signal 11: Segmentation fault in 4.15.9
[2022/09/29 15:56:50.746522,  0, pid=93402, effective(1000, 1000), real(0, 0)] ../../lib/util/fault.c:288(log_stack_trace)
  BACKTRACE: 27 stack frames:
   #0 0x801d35087 <log_stack_trace+0x37> at /usr/local/lib/samba4/libsamba-util.so.0
   #1 0x801d35161 <smb_panic+0x11> at /usr/local/lib/samba4/libsamba-util.so.0
   #2 0x801d34ed9 <fault_setup+0xa9> at /usr/local/lib/samba4/libsamba-util.so.0
   #3 0x805dd76d0 <pthread_sigmask+0x540> at /lib/libthr.so.3
   #4 0x805dd6c8f <pthread_setschedparam+0x82f> at /lib/libthr.so.3
   #5 0x7ffffffff8a3 <???> at ???
   #6 0x804c412b1 <__realloc+0xc21> at /lib/libc.so.7
   #7 0x806765c77 <http_read_response_send+0xae7> at /usr/local/lib/samba4/private/libhttp-samba4.so
   #8 0x806b9801a <tstream_readv_pdu_send+0x14a> at /usr/local/lib/samba4/private/libsamba-sockets-samba4.so
   #9 0x806b98757 <tstream_writev_queue_recv+0x1c7> at /usr/local/lib/samba4/private/libsamba-sockets-samba4.so
   #10 0x806b976cf <tstream_readv_send+0x22f> at /usr/local/lib/samba4/private/libsamba-sockets-samba4.so
   #11 0x80408f989 <tevent_common_invoke_immediate_handler+0xc9> at /usr/local/lib/samba4/private/libtevent.so.0
   #12 0x80408fa3c <tevent_common_loop_immediate+0x1c> at /usr/local/lib/samba4/private/libtevent.so.0
   #13 0x8040953ab <_tevent_add_aio_fsync+0x2cb> at /usr/local/lib/samba4/private/libtevent.so.0
   #14 0x804094120 <tevent_signal_get_tag+0x230> at /usr/local/lib/samba4/private/libtevent.so.0
   #15 0x80408e4e1 <_tevent_loop_once+0xe1> at /usr/local/lib/samba4/private/libtevent.so.0
   #16 0x80408e742 <tevent_common_loop_wait+0x32> at /usr/local/lib/samba4/private/libtevent.so.0
   #17 0x8040941a0 <tevent_signal_get_tag+0x2b0> at /usr/local/lib/samba4/private/libtevent.so.0
   #18 0x8019b9f13 <smbd_process+0x7b3> at /usr/local/lib/samba4/private/libsmbd-base-samba4.so
   #19 0x102cec3 <main+0x44b3> at /usr/local/sbin/smbd
   #20 0x80408f4cd <tevent_common_invoke_fd_handler+0x9d> at /usr/local/lib/samba4/private/libtevent.so.0
   #21 0x80409575d <_tevent_add_aio_fsync+0x67d> at /usr/local/lib/samba4/private/libtevent.so.0
   #22 0x804094120 <tevent_signal_get_tag+0x230> at /usr/local/lib/samba4/private/libtevent.so.0
   #23 0x80408e4e1 <_tevent_loop_once+0xe1> at /usr/local/lib/samba4/private/libtevent.so.0
   #24 0x80408e742 <tevent_common_loop_wait+0x32> at /usr/local/lib/samba4/private/libtevent.so.0
   #25 0x8040941a0 <tevent_signal_get_tag+0x2b0> at /usr/local/lib/samba4/private/libtevent.so.0
   #26 0x102b42f <main+0x2a1f> at /usr/local/sbin/smbd


So either our/my Samba version is out of date or I picked the wrong version of ES. I don't want to investigate further today. Maybe at the weekend. Maybe I should report a bug, Idk...

Greetings :)
 

volothamp

Explorer
Joined
Jul 28, 2019
Messages
72
Please post your TrueNAS version and your samba version.
 

phradr

Dabbler
Joined
Sep 27, 2022
Messages
49
TrueNAS-13.0-U2
Updated to 13.1-RELEASE-p1 n245406-814eb095751

with

Samba 4.15.9

FSCrawler is 2.9
ES 7.17.3
 

kapitainsky

Dabbler
Joined
Sep 30, 2022
Messages
46
The same results as @phradr

TrueNAS 13.1-RELEASE-p1 n245406-814eb095751
Samba 4.15.9

ES 7.17.3

I have tried a different crawler, but got the same results:

FSCrawler 7.2.10

I can see from TrueNAS that ES is working. I test it with:

curl -H 'Content-Type: application/json' -X GET "http://jailIP:9200/test/_search?pretty"

but /var/log/samba4/log.smbd shows:

Code:
[2022/10/02 17:56:29.298324,  0] ../../libcli/http/http.c:199(http_parse_response_line)
  http_parse_response_line: Error parsing header
[2022/10/02 17:56:29.299070,  0] ../../lib/util/fault.c:172(smb_panic_log)
  ===============================================================
[2022/10/02 17:56:29.299136,  0] ../../lib/util/fault.c:176(smb_panic_log)
  INTERNAL ERROR: Signal 11: Segmentation fault in pid 8094 (4.15.9)
[2022/10/02 17:56:29.299162,  0] ../../lib/util/fault.c:181(smb_panic_log)
  If you are running a recent Samba version, and if you think this problem is not yet fixed in the latest versions, please consider reporting this bug, see https://wiki.samba.org/index.php/Bug_Reporting
[2022/10/02 17:56:29.299279,  0] ../../lib/util/fault.c:182(smb_panic_log)
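
Since both logs stop at `http_parse_response_line: Error parsing header`, one thing that could be looked at is the first line Elasticsearch actually sends back. The sketch below only shows how to fetch that line and check that it looks like a plain HTTP/1.x status line; the function names are made up, the URL is the same placeholder as in the curl command above, and this does not by itself explain the segfault.

```shell
#!/bin/sh
# Succeed if the given line looks like an HTTP/1.x status line,
# for example "HTTP/1.1 200 OK".
is_http1_status_line() {
    printf '%s\n' "$1" | grep -Eq '^HTTP/1\.[01] [0-9]{3}( |$)'
}

# Fetch only the first response line from the server.
# $1 = full URL, e.g. http://jailIP:9200/test/_search
first_response_line() {
    curl -s -i "$1" | head -n 1 | tr -d '\r'
}
```

Something like `is_http1_status_line "$(first_response_line http://jailIP:9200/test/_search)" && echo ok` would then tell whether the response starts the way a simple HTTP client expects.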
 