Settings

Papermerge loads its settings from a configuration file. It first tries to read the following files:

/etc/papermerge.conf.py
papermerge.conf.py - from current project directory

If neither of the above files exists, it checks the PAPERMERGE_CONFIG_FILE environment variable. If PAPERMERGE_CONFIG_FILE points to an existing file, Papermerge reads its configuration from there.

If all of the above attempts fail, Papermerge uses default configuration values and issues a warning. To get rid of the warning message, create an empty configuration file papermerge.conf.py in the project root directory (right next to papermerge.conf.py.example) or at /etc/papermerge.conf.py.

The configuration file uses Python syntax.
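The lookup order described above can be sketched roughly as follows. This is an illustrative sketch only, not Papermerge's actual loader: the function name find_config_file and its parameters are hypothetical.

```python
import os

# Candidate locations, in the order described above.
SYSTEM_CONFIG = "/etc/papermerge.conf.py"
LOCAL_CONFIG_NAME = "papermerge.conf.py"


def find_config_file(project_dir, environ=None, exists=os.path.isfile):
    """Return the path of the configuration file to load, or None.

    Hypothetical helper: it only mirrors the documented lookup order.
    """
    environ = os.environ if environ is None else environ

    # 1. The two well-known locations are tried first.
    for candidate in (SYSTEM_CONFIG, os.path.join(project_dir, LOCAL_CONFIG_NAME)):
        if exists(candidate):
            return candidate

    # 2. Otherwise fall back to PAPERMERGE_CONFIG_FILE, if it points to a file.
    from_env = environ.get("PAPERMERGE_CONFIG_FILE")
    if from_env and exists(from_env):
        return from_env

    # 3. Nothing found: the caller would use defaults and issue a warning.
    return None
```

The exists parameter is there only so the lookup can be exercised without touching the real file system.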

Django Settings

Papermerge is based on the Django web framework. If you know how Django projects are configured, Papermerge's configuration internals will look familiar. One particularly important thing to be aware of is the Django-specific DJANGO_SETTINGS_MODULE environment variable (https://docs.djangoproject.com/en/3.1/topics/settings/#designating-the-settings). Learn more about Django's settings from the Django documentation.

PAPERMERGE_ Prefix

It makes a slight difference where you place the Papermerge settings listed below. In short, when placed in the papermerge.conf.py file they do not need the PAPERMERGE_ prefix, while the very same setting placed in the Django settings file needs the PAPERMERGE_ prefix.

Papermerge settings can be either in:

papermerge.conf.py file
Django settings file (the one referenced by the DJANGO_SETTINGS_MODULE environment variable)

In the papermerge.conf.py file, settings carry no PAPERMERGE_ prefix because all (well, 90%) of them are Papermerge specific. A Django settings file, however, holds all sorts of settings: for Celery (prefixed with CELERY_), for allauth (prefixed with ACCOUNT_), and so on. Accordingly, Papermerge-specific settings are prefixed as well. Thus any setting listed below, when added directly to the Django settings file, needs the PAPERMERGE_ prefix.

The papermerge.conf.py configuration file is there for convenience. Most of the time it is the only file you will need.
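To illustrate the naming rule, the sketch below converts setting names as they would appear in papermerge.conf.py into the names expected in a Django settings file. The helper add_prefix is hypothetical and purely illustrative. It is not part of Papermerge, and it assumes every setting passed in is Papermerge specific.

```python
PREFIX = "PAPERMERGE_"


def add_prefix(conf):
    """Return a copy of `conf` whose keys all carry the PAPERMERGE_ prefix."""
    return {
        (name if name.startswith(PREFIX) else PREFIX + name): value
        for name, value in conf.items()
    }


# As written in papermerge.conf.py (no prefix).
conf_py = {
    "OCR_DEFAULT_LANGUAGE": "deu",
    "IMPORTER_DIR": "/opt/papermerge/import/",
}

# The same settings as they would be named in a Django settings file.
django_names = add_prefix(conf_py)
for name in sorted(django_names):
    print(name, "=", repr(django_names[name]))
```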

Main App, Worker or Both?

Some configuration variables apply to the worker only (the part that OCRs documents, imports documents from a local directory, or fetches them from an IMAP/email account), some apply to the main app only, and some apply to both. This distinction becomes apparent when you deploy the main app and the worker on separate hosts. It also matters in containerized deployments via Docker, because the main app and the worker usually run in different containers and thus have different copies of the papermerge.conf.py file.

Each setting below states the context it applies to. "context: worker" means the variable applies only to the worker, i.e. it needs to be changed in papermerge.conf.py on the worker instance/host/container.

When a setting's description states "context: main app, worker", the setting needs to be changed on both the main app and the worker in order to function properly.

Some of the most used settings which you might be interested in:

:ref:media_dir - location where all uploaded/imported documents are stored
:ref:ocr_languages - user can select one of those languages to perform :ref:OCR <ocr>
:ref:ocr_default_language - default language for :ref:OCR <ocr>

Paths and Folders

DBDIR

/path/to/papermerge/sqlite/db/
context: main app

Defines the location where db.sqlite3 will be saved. By default, the project's local directory is used.

Example:

DBDIR = "/opt/papermerge/db/"

MEDIA_DIR

/path/to/media/
context: main app, worker

Defines directory where all uploaded documents will be stored.

By default uses a folder named media in project's local directory.
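Example (the path below is an illustrative assumption, mirroring the other path settings):

```python
# papermerge.conf.py, set on both the main app and the worker
MEDIA_DIR = "/opt/papermerge/media/"
```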

STATIC_DIR

/path/to/collected/static/assets/
context: main app

Location where all static assets of the Papermerge project (JavaScript files, CSS files) will be copied by the ./manage.py collectstatic command.

By default uses a folder named static in project's local directory.

Example:

STATIC_DIR = "/opt/papermerge/static/"

Document Importer

The importer is a command line utility, invoked with ./manage.py importer, that imports all documents from a local directory.

IMPORTER_DIR

/path/where/documents/will/be/imported/from/
context: worker

Location on the local file system from which Papermerge will try to import documents.

Example:

IMPORTER_DIR = "/opt/papermerge/import/"

OCR

OCR_LANGUAGES

context: main app, worker

Additional languages for OCR. A dictionary where each key is an ISO 639-2/T code and each value is a human-readable name for the language.

Example:

OCR_LANGUAGES = {
    'heb': 'hebrew',
    'jpn': 'japanese'
}

Note that both Hebrew and Japanese language data for Tesseract must be installed. You can check Tesseract's available languages with the following command:

$ tesseract --list-langs
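The same check can be scripted. The sketch below is a minimal example under stated assumptions: the helper names are hypothetical, the output is assumed to consist of a header line ending in a colon followed by one language code per line, and stderr is merged into stdout because some Tesseract versions are assumed to print the list there.

```python
import subprocess


def parse_list_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Header lines such as "List of available languages (3):" end with a
    # colon; everything else is treated as a language code.
    return {line for line in lines if not line.endswith(":")}


def missing_languages(ocr_languages, installed):
    """Return configured language codes that have no installed data."""
    return sorted(set(ocr_languages) - set(installed))


def installed_tesseract_languages(binary="tesseract"):
    """Run Tesseract and return the set of installed language codes."""
    result = subprocess.run(
        [binary, "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge streams, see the assumption above
        text=True,
        check=True,
    )
    return parse_list_langs(result.stdout)
```

Calling missing_languages(OCR_LANGUAGES, installed_tesseract_languages()) on the worker would then list the codes that still need language data installed.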

The default value for OCR_LANGUAGES is:

OCR_LANGUAGES = {
    "deu": "Deutsch",  # German language
    "eng": "English",
}

OCR_DEFAULT_LANGUAGE

context: main app, worker

By default Papermerge uses the language specified by this option to perform OCR. Set it to the language used by the majority of your documents.

Example:

OCR_DEFAULT_LANGUAGE = "spa"

Default value is "deu" (German language).

I18n and Localization

LANGUAGE_CODE

context: main app

This option specifies language of user interface. There are two options:

en - for user interface in English language
de - for user interface in German language

English is the default fallback, i.e. if you do not specify anything or specify an unsupported language, English will be used. Instead of en you can use en-US, en-GB etc. Instead of de you can use de-DE, de-AT etc. See here for the full list of available language codes.

Default value: en

LANGUAGE_FROM_AGENT

If set to True, Papermerge uses the same language code as your web browser (agent). Browsers send an 'Accept-Language' header with their locale. For more, read here

If True - overrides the LANGUAGE_CODE option. This means that with LANGUAGE_FROM_AGENT=True, whatever locale your web browser runs in will also be used by the Papermerge instance.
If False - the language code specified in the LANGUAGE_CODE option is used and the browser's 'Accept-Language' header is ignored.

Default value: False
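How LANGUAGE_CODE and LANGUAGE_FROM_AGENT interact can be sketched as below. This is a simplified illustration, not the actual implementation: the function names are hypothetical, q-values in the header are ignored, and falling back to English when the browser offers no supported language is an assumption.

```python
SUPPORTED = ("en", "de")  # interface languages listed above
FALLBACK = "en"           # English is the documented fallback


def base_language(code):
    """Reduce a code such as "de-AT" to its base language "de"."""
    return code.strip().split("-")[0].lower()


def resolve_language(language_code, language_from_agent, accept_language=""):
    """Pick the interface language from the settings and the request header."""
    if language_from_agent:
        # Walk the header in the order sent; q-values are ignored for brevity.
        for item in accept_language.split(","):
            candidate = base_language(item.split(";")[0])
            if candidate in SUPPORTED:
                return candidate
        return FALLBACK  # assumption: no supported language in the header

    candidate = base_language(language_code)
    return candidate if candidate in SUPPORTED else FALLBACK
```

For instance, resolve_language("en", True, "de-DE,de;q=0.9,en;q=0.8") selects German even though LANGUAGE_CODE is en.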

Database

By default, Papermerge uses an SQLite3 database (a file located in :ref:db_dir). Alternatively you can use a PostgreSQL or MySQL/MariaDB database. The following options configure PostgreSQL and MySQL/MariaDB database connections.

DBTYPE

context: main app

DB type (if different from SQLite3). For a PostgreSQL database use one of the following values:

pg
postgre
postgres
postgresql

For a MySQL/MariaDB database (they share the same database backend) use one of the following values:

my
mysql
maria
mariadb

Example:

DBTYPE = "mysql"

DBUSER

context: main app

DB user used for database connection.

Example:

DBUSER = "john"

DBNAME

context: main app

Database name. Default value is papermerge.

DBHOST

context: main app

Database host. Default value is localhost.

DBPORT

context: main app

Database port. The port must be specified as an integer, without string quotes.

Example:

DBPORT = 5432

Default value is 5432 for PostgreSQL and 3306 for MySQL/MariaDB.

DBPASS

context: main app

Password for connecting to the database. Default value is an empty string.
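To show how these options fit together, the sketch below translates them into a Django DATABASES dictionary. It is a hedged illustration rather than Papermerge's own code: build_databases is a hypothetical helper, the empty default for DBUSER is an assumption, and so is falling back to SQLite3 when DBTYPE is missing or unrecognized. The backend names are Django's built-in engines.

```python
import os

# DBTYPE aliases listed above.
POSTGRES_ALIASES = {"pg", "postgre", "postgres", "postgresql"}
MYSQL_ALIASES = {"my", "mysql", "maria", "mariadb"}


def build_databases(conf):
    """Translate DB* settings into a Django DATABASES dictionary."""
    dbtype = conf.get("DBTYPE", "").lower()
    if dbtype in POSTGRES_ALIASES:
        engine, default_port = "django.db.backends.postgresql", 5432
    elif dbtype in MYSQL_ALIASES:
        engine, default_port = "django.db.backends.mysql", 3306
    else:
        # Assumption: no or unknown DBTYPE means SQLite3 stored in DBDIR.
        return {
            "default": {
                "ENGINE": "django.db.backends.sqlite3",
                "NAME": os.path.join(conf.get("DBDIR", ""), "db.sqlite3"),
            }
        }
    return {
        "default": {
            "ENGINE": engine,
            "NAME": conf.get("DBNAME", "papermerge"),
            "USER": conf.get("DBUSER", ""),  # assumed default
            "PASSWORD": conf.get("DBPASS", ""),
            "HOST": conf.get("DBHOST", "localhost"),
            "PORT": conf.get("DBPORT", default_port),
        }
    }
```

With {"DBTYPE": "pg", "DBUSER": "john"} as input, the result uses the PostgreSQL backend, database name papermerge, host localhost and port 5432.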

EMail

You can import documents directly from an email/IMAP account. All email importer settings must be defined in papermerge.conf.py on the worker side. Read details about ingesting documents via an IMAP account in the document consumption chapter <importer_imap>.

IMPORT_MAIL_HOST

context: worker

IMAP Server host.

IMPORT_MAIL_USER

context: worker

Email account/IMAP user. IMAP user needs read and write access to IMAP "INBOX" folder.

IMPORT_MAIL_PASS

context: worker

Email account/IMAP password.

IMPORT_MAIL_INBOX

context: worker

IMAP folder to read email from. Default value for this setting is "INBOX".

IMPORT_MAIL_BY_USER

context: worker

Whether to allow users to receive in their inbox folder emails sent from their own email address. This capability of assigning attached documents to correct user's inbox is called email routing and is described at length in email_routing.

IMPORT_MAIL_BY_SECRET

context: worker

Whether to allow users to receive in their inbox folder emails containing their own secret. This capability of assigning attached documents to correct user's inbox is called email routing and is described at length in email_routing.

IMPORT_MAIL_DELETE

context: worker

Whether to delete emails after processing.
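Put together, an email importer configuration might look like the fragment below. All values are placeholders, and the boolean type of the last three settings is an assumption based on their "whether to" descriptions.

```python
# papermerge.conf.py on the worker (all values are placeholders)
IMPORT_MAIL_HOST = "imap.example.com"
IMPORT_MAIL_USER = "papermerge@example.com"
IMPORT_MAIL_PASS = "change-me"
IMPORT_MAIL_INBOX = "INBOX"  # the documented default

# Assumed to be booleans, since each is described as "whether to ...".
IMPORT_MAIL_BY_USER = False
IMPORT_MAIL_BY_SECRET = True
IMPORT_MAIL_DELETE = False
```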

Binary Dependencies

Papermerge uses a number of open source third-party tools for various purposes. The most obvious example is tesseract, used to :ref:OCR <ocr> documents (extract text from binary image files). Another, less obvious example is the pdfinfo utility provided by the poppler-utils package: pdfinfo is used to count the number of pages in a PDF document. The settings listed below allow you to override the path to a specific dependency.

BINARY_OCR

context: worker

Full path to the tesseract binary/executable file. Tesseract is used for :ref:OCR <ocr> operations, i.e. extracting text from binary image files (jpeg, png, tiff). Default value is:

BINARY_OCR = "/usr/bin/tesseract"

BINARY_FILE

context: main app, worker

The file utility, used to find out the MIME type of a given file. Default value is:

BINARY_FILE = "/usr/bin/file"

BINARY_CONVERT

context: main app, worker

Convert utility is provided by ImageMagick package. It is used for resizing images. Default value is:

BINARY_CONVERT = "/usr/bin/convert"

BINARY_PDFTOPPM

context: main app, worker

Provided by Poppler Utils. Used to extract images from PDF file. Default value is:

BINARY_PDFTOPPM = "/usr/bin/pdftoppm"

BINARY_PDFINFO

context: main app, worker

Provided by Poppler Utils. Used to get the page count of a PDF file. Default value is:

BINARY_PDFINFO = "/usr/bin/pdfinfo"

BINARY_STAPLER

context: main app, worker

Provided by stapler. This external tool is used to reorder, cut/paste, and delete pages within a PDF document.

Default value is:

BINARY_STAPLER = "/usr/bin/stapler"

Depending on your system and the way you installed stapler, you may want to adjust the BINARY_STAPLER path.
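Before overriding any of these paths, it can help to check which of the configured binaries actually exist on the host. The sketch below is a minimal standalone check: missing_binaries is a hypothetical helper, not a Papermerge command, and the dictionary simply repeats the default values listed above.

```python
import os

# Default paths as listed above; adjust to match your system.
BINARIES = {
    "BINARY_OCR": "/usr/bin/tesseract",
    "BINARY_FILE": "/usr/bin/file",
    "BINARY_CONVERT": "/usr/bin/convert",
    "BINARY_PDFTOPPM": "/usr/bin/pdftoppm",
    "BINARY_PDFINFO": "/usr/bin/pdfinfo",
    "BINARY_STAPLER": "/usr/bin/stapler",
}


def missing_binaries(binaries):
    """Return the setting names whose path is not an executable file."""
    return sorted(
        name
        for name, path in binaries.items()
        if not (os.path.isfile(path) and os.access(path, os.X_OK))
    )


if __name__ == "__main__":
    for name in missing_binaries(BINARIES):
        print(f"{name} = {BINARIES[name]!r} was not found or is not executable")
```

Any setting reported here is a candidate for an override in papermerge.conf.py on the host or container where the check was run.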