This section groups all OCR-specific configuration options.


By default, Papermerge DMS uses the language specified with this option to perform OCR. Set this value to the language used by the majority of your documents. For a detailed list of three-letter codes, see the 639-2/T column of the ISO 639-2 code list.

Example as environment variable:
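
A minimal sketch of the environment-variable form. It assumes the option is named default_language in the ocr section and that settings map to variables following a PAPERMERGE__<SECTION>__<OPTION> pattern. The variable name is an assumption and should be checked against your Papermerge version.

```shell
# Assumed variable name (PAPERMERGE__<SECTION>__<OPTION> pattern); verify it for your version.
# Use German ("deu") as the default OCR language.
export PAPERMERGE__OCR__DEFAULT_LANGUAGE=deu
```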


Example in the TOML configuration file:


The default value is “deu” (German).



This option can be defined only in the TOML configuration file.

Defines all languages available for OCR. This option is written as an inline table in which each key is an ISO 639-2 code and each value is the human-readable name of the language.


languages = { heb = "hebrew", jpn = "japanese"}

The languages value must be written on one line. This is a requirement of the TOML inline table format.

Note that the Tesseract language data for both Hebrew and Japanese must be installed. You can check which languages Tesseract has available with the following command:

List available languages
$ tesseract --list-langs
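
To check for specific language packs rather than reading the whole list, the command above can be wrapped in a small shell function. This is a minimal sketch: the function name check_langs is hypothetical, and it assumes tesseract --list-langs prints one language code per line after its header line.

```shell
# check_langs (hypothetical helper): report whether each given
# Tesseract language code appears in the output of --list-langs.
check_langs() {
  # Capture stdout and stderr, since the list may go to either stream.
  installed="$(tesseract --list-langs 2>&1 || true)"
  for lang in "$@"; do
    # -x matches whole lines only, so "eng" does not match "enm".
    if printf '%s\n' "$installed" | grep -qx "$lang"; then
      echo "$lang: installed"
    else
      echo "$lang: missing"
    fi
  done
}

# Check the two languages used in the example above.
check_langs heb jpn
```

Any code reported as missing needs its Tesseract language data installed before it is listed in the languages option.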

Default value

languages = { deu = "Deutsch", eng = "English" }

See Adding OCR Languages to the Docker Image for a detailed example of using this option.