Faster Searching from Mac Spotlight or Finder of FreeNAS Files

Henning Kessler

Contributor
Joined
Feb 10, 2015
Messages
143
On the mount points, I think I'm up and running. What's the best way to verify everything is working?
Well, if everything worked you could use the mdfind command from the CLI. Here is a manpage. But as I mentioned above: even though I configured Samba the way the Samba wiki describes, I can't see any request from the host to Elasticsearch, and the mdfind command isn't working either.
Code:
USER@SERVERNAME[~]% sudo mdfind SERVERNAME SHARENAME '*=="Samba"'                                                           11:01:40
Enter administrator@DOMAIN's password:
main: Cannot connect to server: NT_STATUS_LOGON_FAILURE
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
To clarify, where are you running the mdfind command? Is that from the client (MacOS), TrueNAS, or Jail cli?
 

Patrick M. Hausen

Hall of Famer
Joined
Nov 25, 2013
Messages
7,737
From the client, i.e. macOS. That's the point: to have fast search via an indexed database over SMB ...
 

Henning Kessler

Okay, I guess I was wrong: I tried it on the server, so maybe that was why it failed, LOL. But running it from a macOS client would make no sense either, as there is a native mdfind command which works differently... I also tried using Spotlight from a Mac that has mounted the indexed share, but I could not see any traffic to the Elasticsearch server while the search was running. I think something is missing or wrong in the chain...
 

Patrick M. Hausen

As far as I understand the architecture, the Mac (client) should communicate with Samba, and Samba with Elasticsearch. The Mac does not know about Elastic; it is definitely relying on the server. A Windows or macOS server might use a different server-side technology altogether.
 

TravisT

This is exactly where I was confused as well. When running mdfind from the Mac client, I get different prompts. What @Henning Kessler posted was the output I received when running it from the TrueNAS box. @Patrick M. Hausen The architecture as you describe it makes sense. The Mac client knows nothing about Elasticsearch running on the server; logically, it should be transparent to the client.

@Henning Kessler I think we are in the same place at this point. I have my shares mounted on my Mac client, but I'm not sure how to verify whether things are working right now. The only thing I tried was running top, where elasticsearch is listed as running java. That's about where my abilities stop.
 

Henning Kessler

I was running tcpdump on the bridge interface of the VNET jails, listening on port 9200. I could see the packets from the fscrawler jail to the elasticsearch jail, but never anything else. As I understand it, I should have seen traffic from the host whenever a Spotlight search was happening. That is why I think the problem must be somewhere in the Samba configuration...
 

Henning Kessler

I found an interesting bit in the smb.conf manpage that was not mentioned in the wiki article:
Code:
      spotlight (S)

           This parameter controls whether Samba allows Spotlight queries on a
           share. For controlling indexing of filesystems you also have to use
           Tracker's own configuration system.

           Spotlight has several prerequisites:

                  •   Samba must be configured and built with Spotlight
                      support.

                  •   The mdssvc RPC service must be enabled, see below.

                  •   Tracker integration must be setup and the share must be
                      indexed by Tracker.

           For a detailed set of instructions please see
           https://wiki.samba.org/index.php/Spotlight.

           The Spotlight RPC service can either be enabled as embedded RPC
           service:

               [Global]
               rpc_server:mdsvc = embedded

           Or it can be run in a separate RPC service daemon:

               [Global]
               rpc_server:mdssd = fork
               rpc_server:mdsvc = external

           Default: spotlight = no


But even after setting rpc_server:mdssd = embedded in [Global], it is still not working ...
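
One way to double-check which Spotlight-related options Samba actually loaded is to filter the effective configuration. The sketch below is only an illustration: spotlight_settings is a hypothetical helper name, and it simply keeps lines that start with spotlight, elasticsearch: or rpc_server:md, which should cover the parameters quoted from the manpage plus the elasticsearch: options used for the Elasticsearch backend.

```shell
# Hypothetical helper: reads smb.conf-style text on stdin and prints only
# the lines that set Spotlight-related parameters.
spotlight_settings() {
  grep -Ei '^[[:space:]]*(spotlight|elasticsearch:|rpc_server:md)'
}
```

Assuming testparm is available on the server, `testparm -s 2>/dev/null | spotlight_settings` would print the effective values. An empty result would suggest the options never made it into the running configuration.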
 

TravisT

I came across an error on my side that was from upper case characters in my jobname. I didn't initially get any indication there was a problem, but in digging through the logs I saw it kick an error. I've renamed my job, but still no indication that anything is occurring other than the fscrawler logs saying it connected to an ElasticSearch node.

Any more progress on your end?
 

Henning Kessler

You can check your Elasticsearch indexes. FSCrawler generates two indexes: one for the files and the other for the folders. Here is how you can list them:
Code:
curl "http://IPADDRESS:9200/_aliases?pretty=true"

In my case this results in:
Code:
{
  "test" : {
    "aliases" : { }
  },
  "test_folder" : {
    "aliases" : { }
  }
}

To query the content of the index test_folder:

Code:
curl -H 'Content-Type: application/json' -X GET "http://IPADDRESS:9200/test_folder/_search?pretty"


But I must say that my file index seems to be really incomplete; FSCrawler does not seem to index all files even if you set it not to exclude anything.

Aside from that, I still have not seen any request from Samba to the indexes...
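
To put a number on how incomplete the file index is, the document count reported by Elasticsearch can be compared with the number of files on disk. This is a minimal sketch, assuming the file index is called test as in the output above; es_count_url is a hypothetical helper, and IPADDRESS and the share path are placeholders.

```shell
# Hypothetical helper: build the URL of the Elasticsearch _count API
# for a given host and index (third argument is the port, default 9200).
es_count_url() {
  printf 'http://%s:%s/%s/_count?pretty\n' "$1" "${3:-9200}" "$2"
}
```

With that, `curl -s "$(es_count_url IPADDRESS test)"` should return a small JSON document with a count field, which can be held against `find /path/to/share -type f | wc -l`. The two numbers will not necessarily match exactly, since FSCrawler can skip files depending on its settings, but a large gap would confirm the suspicion.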
 

TravisT

Interesting... I'm seeing very similar results. It looks like I have a small handful of file indexes present, however what's interesting on mine is that they all appear to be in one path.

Another thing that I noticed is that from what I can tell, my TrueNAS box cannot connect to port 9200 of my jail, which I think is a firewall issue. Sockstat on the jail only shows local connections, which I think are from fscrawler to ElasticStack, however if I understand correctly, Samba will need to communicate to the ElasticStack instance on port 9200 as well. Off to see if I can figure out how to open that port to see if anything changes.

Edit: Sockstat actually looks correct, however I cannot telnet to my jail from my TrueNAS box on port 9200 or 9300, which appear to be open to all connections from the Sockstat output.

Code:
root@crawler:~ # sockstat -l
USER     COMMAND    PID   FD PROTO  LOCAL ADDRESS         FOREIGN ADDRESS
elasticsearch java  86153 45 stream (not connected)
elasticsearch java  86153 253 stream(not connected)
elasticsearch java  86153 326 tcp6  ::1:9300              *:*
elasticsearch java  86153 327 tcp4  127.0.0.1:9300        *:*
elasticsearch java  86153 329 tcp6  ::1:9200              *:*
elasticsearch java  86153 330 tcp4  127.0.0.1:9200        *:*
root     syslogd    86112 5  dgram  /var/run/log
root     syslogd    86112 6  dgram  /var/run/logpriv
root@crawler:~ #


From my TrueNAS box, I see "connection refused":
Code:
root@TrueNAS[~]# telnet 172.16.110.16 9200
Trying 172.16.110.16...
telnet: connect to address 172.16.110.16: Connection refused
telnet: Unable to connect to remote host
root@TrueNAS[~]# telnet 172.16.110.16 9300
telnet: connect to address 172.16.110.16: Connection refused
telnet: Unable to connect to remote host
 

Patrick M. Hausen

There is no firewall active on FreeNAS/TrueNAS or in any jail, unless you yourself activated one by undocumented and unsupported means. So that is probably not your problem.

If you run netstat -na | grep LISTEN inside the jail, you will probably find that port 9200 is only listening on 127.0.0.1. This assumes you are using VNET for your jail, which you should if you want to access Elasticsearch from outside.

If I guessed correctly, you need to set the Elasticsearch listen address somehow, possibly in the configuration file or via a parameter that goes into /etc/rc.conf inside your jail. I can have a look later today.

HTH,
Patrick
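
The check described above can also be applied to the sockstat -l output posted earlier. The following is a rough sketch, not a TrueNAS tool: classify_listeners is a hypothetical helper that reads sockstat -l output on stdin and labels every TCP listener as loopback-only or reachable, based solely on the local address column.

```shell
# Hypothetical helper: reads `sockstat -l` output on stdin and prints
# protocol, local address and a scope label for each TCP listener.
classify_listeners() {
  awk '$5 ~ /^tcp/ {
         # Addresses starting with 127. or ::1: are bound to loopback.
         scope = ($6 ~ /^(127\.|::1:)/) ? "loopback-only" : "reachable"
         print $5, $6, scope
       }'
}
```

Run inside the jail as `sockstat -l | classify_listeners`. For the output posted above, all four Elasticsearch listeners on ports 9200 and 9300 would come out as loopback-only, which is consistent with the refused telnet connections.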
 

TravisT

network.host and http.port were commented out in my elasticsearch.yml file. Uncommented them, ensured the loopback interface/port 9200 defined and restarted elasticsearch service.

I can now curl "http://localhost:9200" from the jail, but still cannot access via TrueNAS shell or anywhere else on my network.

rc.conf was pretty empty, but didn't see anything that looked out of place.

Edit: also confirmed via netstat that the loopback interface of the jail is listening on ports 9200 and 9300.
 

Constantin

Vampire Pig
Joined
May 19, 2017
Messages
1,827
I wonder as well to what extent this problem can be solved by throwing hardware at it, i.e. adding an sVDEV with metadata only. I notice a dramatic improvement re: responsiveness when my metadata-only L2ARC cache gets "hot". Problem is, the metadata sVDEV pool approach only really works when you set up a new pool and dump new data into it. Extant pools will not automatically shift their metadata to the sVDEV. This would be a nice-to-have as part of a scrub when a FreeNAS pool is upgraded to TrueNAS.

There are several Mac-based indexing programs out there that are less problematic than the built-in OSX spotlight re: storage of indexes. This is a particular issue in case you have encrypted sparsebundles / images - the indices may be saved "outside" the sparsebundle / image, allowing an attacker an insight into the contents of said sparsebundle. Perhaps Apple has closed this loophole by now.
 

Patrick M. Hausen

network.host and http.port were commented out in my elasticsearch.yml file. Uncommented them, ensured the loopback interface/port 9200 defined and restarted elasticsearch service.
You should change that "loopback":9200 to 0.0.0.0:9200 and make sure to have a VNET jail without NAT. Only then will you be able to connect to <jail-ip-address>:9200 from the outside. You can never connect to any service binding to 127.0.0.1 from anywhere but the system running the service itself.

According to the documentation that should read:
Code:
network.host: 0.0.0.0
http.port: 9200
 

TravisT


I came across the same documentation when I was digging. From memory, I thought when you bound a service to the loopback interface, it could be accessed from any IP, but maybe different systems respond to that differently. Or maybe my memory doesn't serve me correctly...

Anyway, I did try that previously and tested again to make sure I didn't mess something up the first time.

When the network.host address is set to the loopback interface (127.0.0.1), I can connect from within the jail, but not from outside.
When the network.host address is set to 0.0.0.0, I can't do either. A few seconds after starting the service (set to 0.0.0.0), I get the following error:

Code:
ERROR: [1] bootstrap checks failed
[1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elasticsearch.log


Logs contain these warnings (other entries were info):
Code:
[2021-02-18T13:39:34,067][WARN ][o.e.b.BootstrapChecks    ] [crawler] the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
[2021-02-18T13:42:37,028][WARN ][o.e.g.DanglingIndicesState] [crawler] gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually


There don't appear to be any dangling indices, but I can only check after reverting the network.host back to 127.0.0.1.
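
The bootstrap failure above is what Elasticsearch does once network.host points at a non-loopback address: it then enforces its bootstrap checks, including the one asking for a discovery setting. For a single-node setup like this one, adding discovery.type: single-node to elasticsearch.yml is the commonly documented way to satisfy that check, though it has not been tried in this thread. The sketch below is a hypothetical helper (check_es_discovery is not part of Elasticsearch) that inspects a config file and reports whether such a setting is present; it only understands simple top-level key: value lines.

```shell
# Hypothetical helper: check_es_discovery FILE
# Reports whether FILE binds Elasticsearch to a non-loopback address
# without any discovery setting of the kind named in the bootstrap error.
check_es_discovery() {
  conf=$1
  # First uncommented network.host value, if any.
  host=$(sed -n 's/^network\.host:[[:space:]]*//p' "$conf" | head -n 1)
  case $host in
    ''|127.*|localhost|_local_)
      echo "loopback only: discovery check not enforced"
      return 0 ;;
  esac
  if grep -Eq '^(discovery\.(type|seed_hosts|seed_providers)|cluster\.initial_master_nodes):' "$conf"; then
    echo "discovery setting found"
  else
    echo "no discovery setting: bootstrap check will fail"
    return 1
  fi
}
```

Inside the jail this could be run as `check_es_discovery /usr/local/etc/elasticsearch/elasticsearch.yml`; that path is an assumption based on the usual FreeBSD package layout and may differ on your system.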
 

TravisT

Also, I am set to VNET without NAT on my jail. Although a simple test, I can ping my VNET address from within the jail, from TrueNAS shell and from elsewhere on my network. I can also do the reverse from the jail.
 

Patrick M. Hausen

I came across the same documentation when I was digging. From memory, I thought when you bound a service to the loopback interface, it could be accessed from any IP
Definitely not. Anything bound to a loopback interface is only accessible from that very same host.

Please post the output of ifconfig -a from inside the jail.
 

TravisT

Code:
root@crawler:~ # ifconfig -a
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=0<> metric 0 mtu 33160
        groups: pflog
epair0b: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=8<VLAN_MTU>
        ether 02:25:90:29:a4:78
        hwaddr 02:2b:78:f6:93:0b
        inet 172.16.110.16 netmask 0xffffff00 broadcast 172.16.110.255
        groups: epair
        media: Ethernet 10Gbase-T (10Gbase-T <full-duplex>)
        status: active
        nd6 options=1<PERFORMNUD>
 

Patrick M. Hausen

So if you can ping 172.16.110.16 from another system in your LAN and netstat -na | grep LISTEN shows port 9200 with tcp4 at *, you really should be able to connect to Elastic on that port.
 