
set up full-text search for Nextcloud on Docker

Setting up Nextcloud full-text search isn’t very clear, especially with Docker, so in this post I’ll explain how to set up full-text search with Elasticsearch and Tesseract using Nextcloud’s official Docker images.

add Tesseract to your Nextcloud container

Normally we would need to create a custom image for this, but fortunately modzilla99 and Schw3pps have found a way to add additional packages without needing a custom image, by adding a custom command to the container.

latest/apache image

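command: sh -c "apt-get update && apt-get install -y --no-install-recommends tesseract-ocr tesseract-ocr-eng tesseract-ocr-$(YOUR_THREE_LETTER_LANGUAGE_CODE) && /entrypoint.sh apache2-foreground"

Replace $(YOUR_THREE_LETTER_LANGUAGE_CODE), including the $( and ), with the three-letter code of your language, for example deu for German.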

TIP: if you want to add Cron, you don’t need a separate container, just add supervisor

command: sh -c "apt-get update && apt-get install -y --no-install-recommends supervisor tesseract-ocr tesseract-ocr-eng tesseract-ocr-$(YOUR_THREE_LETTER_LANGUAGE_CODE) && mkdir -p /var/log/supervisord /var/run/supervisord && supervisord -c /supervisord.conf"

Make sure to mount your supervisord.conf file into the container at /supervisord.conf.

fpm-alpine image

command: sh -c "apk add --no-cache tesseract-ocr tesseract-ocr-data-eng tesseract-ocr-data-$(YOUR_THREE_LETTER_LANGUAGE_CODE); /entrypoint.sh php-fpm"

and if you have a separate Cron container, add this to it:

command: sh -c "apk add --no-cache tesseract-ocr tesseract-ocr-data-eng tesseract-ocr-data-$(YOUR_THREE_LETTER_LANGUAGE_CODE); /cron.sh"
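Once the container is up, it's worth checking that Tesseract actually sees the language packs. Here is a minimal sketch of such a check, meant to be run inside the Nextcloud container. The function name has_tesseract_lang is my own and not part of any image; the only real command it relies on is tesseract --list-langs.

```shell
#!/bin/sh
# has_tesseract_lang CODE: succeed if Tesseract lists CODE as an installed language.
# (Hypothetical helper name, used here only for illustration.)
has_tesseract_lang() {
    # --list-langs prints a header line, then one language code per line.
    # Some releases write the list to stderr, so merge both streams.
    tesseract --list-langs 2>&1 | grep -qxF "$1"
}
```

For example, has_tesseract_lang eng && echo "eng is available" should print the message if the English data package was installed.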

Elasticsearch setup

docker-compose.yml

Add this to your Nextcloud compose file

To avoid startup problems, make the Nextcloud container depend on Elasticsearch (list elasticsearch under depends_on in the Nextcloud service) so that it is started before Nextcloud.

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.10
    networks:
      default:
    # ports:
    #   - 127.0.0.1:9200:9200 # only needed if you are not connecting through a Docker network
    command: sh -c "bin/elasticsearch-plugin install --batch ingest-attachment; /bin/tini -s /usr/local/bin/docker-entrypoint.sh eswrapper"
    restart: always
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx2048m"
    user: 1000:1000
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - elasticsearch:/usr/share/elasticsearch/data

volumes:
    elasticsearch:

Using the same trick as before, we can install the ingest-attachment plugin without the need for a custom image. There will be an error when the container is restarted as it tries to install the plugin again, but it should continue to work fine.
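If the error on restart bothers you, one option is to install the plugin only when it is missing. The following is a sketch under assumptions, not something taken from the linked issue: ensure_plugin and PLUGIN_CLI are names I made up, and it relies on elasticsearch-plugin list printing one installed plugin name per line.

```shell
#!/bin/sh
# Hypothetical wrapper: install an Elasticsearch plugin only if it is missing.
# PLUGIN_CLI defaults to the relative path used in the compose command.
PLUGIN_CLI="${PLUGIN_CLI:-bin/elasticsearch-plugin}"

ensure_plugin() {
    plugin="$1"
    # "list" prints the names of installed plugins, one per line.
    if "$PLUGIN_CLI" list | grep -qxF "$plugin"; then
        echo "$plugin already installed, skipping"
    else
        "$PLUGIN_CLI" install --batch "$plugin"
    fi
}
```

Calling ensure_plugin ingest-attachment before the entrypoint would then replace the plain install command. The same check can also be written inline in the compose command with grep and ||.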

Nextcloud setup

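In the Nextcloud full-text search settings, select Elasticsearch as the search platform, then enter the address of the Elasticsearch server. With the compose file above, that is http://elasticsearch:9200 when both containers share a Docker network. If your language requires a custom tokenizer, feel free to change it.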

The other options below aren’t very important and can be configured to your liking.

Start indexing!

Test setup

./occ fulltextsearch:test

index all files

./occ fulltextsearch:index
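Since Nextcloud runs in a container, the occ commands have to be run inside it as the www-data user. Here is a small sketch of a wrapper for the host shell. The service name app is an assumption (use whatever your Nextcloud service is called in your compose file), and the occ function name is just a convenience I picked.

```shell
#!/bin/sh
# Hypothetical helper: run occ inside the Nextcloud container as www-data.
# NEXTCLOUD_SERVICE is an assumed compose service name; adjust it to yours.
NEXTCLOUD_SERVICE="${NEXTCLOUD_SERVICE:-app}"

occ() {
    # The image's working directory is the Nextcloud root, so "php occ" resolves there.
    docker compose exec -u www-data "$NEXTCLOUD_SERVICE" php occ "$@"
}
```

With this defined, occ fulltextsearch:test and occ fulltextsearch:index can be run from the directory that holds your compose file.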

automatic indexing

The official wiki has a section on using cron for this: https://github.com/nextcloud/fulltextsearch/wiki/Basic-Installation#live-index-service

But since we are using docker, probably with supervisord added, robeatoz has found a way to add fulltextsearch:live to the supervisord.conf file.

fulltextsearch.sh

create a new file called fulltextsearch.sh

#!/bin/sh
# Stop all running indexes
php /var/www/html/occ fulltextsearch:stop
# Start live index
php /var/www/html/occ fulltextsearch:live

# More information: https://github.com/nextcloud/fulltextsearch/wiki/Commands

supervisord.conf

Add this to the end of the file

[program:fulltextsearch_index_live]
stdout_logfile=/dev/stdout
stdout_logfile_maxbytes=0
stderr_logfile=/dev/stderr
stderr_logfile_maxbytes=0
user=www-data
command=/bin/sh /fulltextsearch.sh

docker-compose.yml

mount fulltextsearch.sh into the container

volumes:
  - ....
  - ./fulltextsearch.sh:/fulltextsearch.sh:ro

And you are done!

sources

https://github.com/nextcloud/docker/issues/1724

FarisZR