Automates are very handy if you want to automatically apply certain actions to incoming documents. For example, if a document contains specific keywords (such as the name of a grocery store), the tag “groceries” is added to it automatically. An even better use case: if a document contains specific keywords, apply the “groceries” tag to it AND move it from the Inbox to a folder. Incoming documents are those found in your Inbox folder.
If you want to see automates in action, watch the following screencast. The rest of this chapter explains the automate feature in detail.
With the Automation feature you can automate repetitive tasks such as:
moving documents into their destination folder
assigning specific tags to the document
Each automate instance consists of:
name or a title - give it whatever name you like
keywords - terms or words to look for in the document to decide whether the automate applies to it
matching algorithm - the method used to decide whether the document matches the automate
case sensitivity attribute - are the keywords in the match case sensitive?
(optional) destination folder - where should the matched document be moved?
(optional) tags - which tags should be assigned to the matched document?
Note that the last two attributes of an automate, destination folder and tags, are optional. You may specify one of them, both, or neither.
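As a rough mental model, the attributes listed above can be pictured as a small record. The sketch below is only an illustration: the class name, field names and algorithm labels are assumptions made for this example, not Papermerge's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Automate:
    """Hypothetical picture of one automate; not Papermerge's real model."""
    name: str                            # any title you like
    match: str                           # keywords, separated by spaces
    matching_algorithm: str = "any"      # assumed labels: any, all, literal, regex
    is_case_sensitive: bool = False
    dst_folder: Optional[str] = None     # optional destination folder
    tags: List[str] = field(default_factory=list)  # optional tags


# An automate that both tags and moves matched documents.
groceries = Automate(
    name="Groceries receipts",
    match="REWE",
    is_case_sensitive=True,
    dst_folder="Expenses/Groceries",
    tags=["groceries"],
)

# An automate that only tags; the destination folder is left out.
tag_only = Automate(name="Tag only", match="REWE", tags=["groceries"])
```

Leaving `dst_folder` as `None` and `tags` empty corresponds to specifying neither optional attribute.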
Document Matching Algorithms
To decide whether an automate applies to the current document, Papermerge looks for certain keywords in it. For example, if the document contains the upper-case word REWE, it is routed to the folder Expenses/Groceries; if it contains the word Deutschlandradio (a German word that translates to English as German radio), it is routed to ARD ZDF Briefe.
It is crucial to understand that matching is done per page, so the statement “match a document” is not entirely accurate. Automation is triggered every time OCR completes for a page. The OCRed page is sent to the automation module, and Papermerge tries to match each automate against it. If there is a match, the document is considered to have matched the automate’s criteria, although strictly speaking it was a page of that document that matched.
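The per-page behaviour can be sketched as follows. This is a simplified illustration, not Papermerge's code: the function name `on_page_ocr_complete`, the dictionary of automates and the sample page texts are all made up for the example, and a plain substring test stands in for the real matching algorithms.

```python
from typing import Dict, List


def on_page_ocr_complete(page_text: str, automates: Dict[str, str]) -> List[str]:
    """Return the names of automates whose keyword occurs on this one page."""
    return [name for name, keyword in automates.items() if keyword in page_text]


# Hypothetical automates: name -> single keyword.
automates = {"groceries": "REWE", "radio": "Deutschlandradio"}

# Hypothetical OCR text of a two-page document.
pages = ["Sehr geehrte Damen und Herren", "REWE Markt GmbH"]

# Each page is checked on its own, as soon as its OCR text is available.
per_page = [on_page_ocr_complete(text, automates) for text in pages]
print(per_page)  # [[], ['groceries']]

# The document counts as matched because one of its pages matched.
document_matched = any(per_page)
print(document_matched)  # True
```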
There are four different ways to perform a match:
With the Any matching algorithm, the document matches if any of the listed keywords is found.
With the All matching algorithm, the document matches if all listed keywords are found in the document. Keyword order does not matter.
With the Literal matching algorithm, the text you enter must appear in the document exactly as you entered it.
With Regular Expression, you can use a regular expression as the matching criterion. Regular expressions are a general-purpose method of text pattern matching, familiar to most programmers.
Matching keywords should be separated by one or more spaces.
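To make the difference between the four algorithms concrete, here is a minimal sketch. It assumes plain, case-sensitive substring search for keywords; whether Papermerge matches whole words only is not stated in this chapter, and the function name and algorithm labels are hypothetical.

```python
import re


def page_matches(text: str, match: str, algorithm: str) -> bool:
    """Case-sensitive sketch of the four matching modes (assumed semantics)."""
    if algorithm == "any":
        # At least one space-separated keyword occurs in the text.
        return any(keyword in text for keyword in match.split())
    if algorithm == "all":
        # Every keyword occurs somewhere in the text, in any order.
        return all(keyword in text for keyword in match.split())
    if algorithm == "literal":
        # The whole match string occurs exactly as entered.
        return match in text
    if algorithm == "regex":
        # The match string is treated as a regular expression.
        return re.search(match, text) is not None
    raise ValueError(f"unknown matching algorithm: {algorithm!r}")


text = "REWE Markt GmbH Summe EUR 12,34"  # made-up page text

print(page_matches(text, "REWE ALDI", "any"))         # True
print(page_matches(text, "REWE ALDI", "all"))         # False
print(page_matches(text, "Summe REWE", "all"))        # True
print(page_matches(text, "Markt REWE", "literal"))    # False
print(page_matches(text, r"EUR \d+,\d{2}", "regex"))  # True
```

Note how “Summe REWE” satisfies All but “Markt REWE” fails Literal: All ignores keyword order, while Literal requires the exact character sequence.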
Case Sensitivity Attribute
If the case sensitivity attribute is checked, the matching algorithms look for word occurrences with exactly the same letter case as the matching keywords.
For example, if “schnell” is given as a keyword and Is case sensitive is checked, then occurrences of “SCHNELL”, “Schnell” or “schneLL” in the document are ignored because the letter case differs, i.e. the document does not match the automate criteria.
On the other hand, if the very same keyword “schnell” is used but Is case sensitive is unchecked, then any of the terms “SCHNELL”, “Schnell” or “schneLL” will match, i.e. the document matches the automate criteria.
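The effect of the attribute can be reproduced with a few lines of Python. This is a sketch under the assumption that case-insensitive matching simply ignores letter case on both sides; the helper name `keyword_found` is hypothetical.

```python
def keyword_found(keyword: str, text: str, is_case_sensitive: bool) -> bool:
    """Substring test that optionally ignores letter case."""
    if not is_case_sensitive:
        # Fold both sides to a common case before comparing.
        keyword, text = keyword.casefold(), text.casefold()
    return keyword in text


for variant in ["SCHNELL", "Schnell", "schneLL"]:
    checked = keyword_found("schnell", variant, is_case_sensitive=True)
    unchecked = keyword_found("schnell", variant, is_case_sensitive=False)
    print(variant, checked, unchecked)  # each variant: False, then True
```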
Inbox + Automates
Automates run only for documents in the Inbox. Documents imported from a local watch directory or from an email account end up in your Inbox. Papermerge applies automates only to documents in the Inbox, regardless of where those documents were imported from. A side effect is that automates also run on documents in the Inbox that you uploaded manually, which is a very useful trick for testing your automates.
There is a good reason why automates apply only to documents in the Inbox: it is acceptable for documents to disappear from the Inbox, that is, to move suddenly from the Inbox to another folder because of an automation match. If automates were applied to any folder, imagine how confusing it would be if documents unexpectedly disappeared from your current folder because of an automation match!
Automates and UI Logs
You can check which automate matched a specific document by looking at UI Logs:
In a UI Log entry you can see the document’s name, the page number and the id of the document on which Automates were applied (remember, automates are applied per page!). You can also see the text that was extracted from that document:
UI Logs are a very convenient way to see the text extracted from a document. Depending on the quality of the scan, the extracted text may or may not exactly match the textual content of the document. For instance, in the figure below the OCR engine extracted the text “SCHNEIL” although the actual text on the receipt was “SCHNELL”. Use UI Logs to spot such errors and adjust the Match term.
To check which Automate matched this document/page, scroll to the very bottom of the message:
Troubleshooting Mismatched Automates
Writing Automates involves a bit of guesswork. Even if you know for sure that certain words occur in the document, it may take a couple of trial-and-error cycles until the Automate is correct.
To support this trial-and-error cycle, Papermerge lets you trigger automates manually. Re-running automates is straightforward:
Select the Automate you wish to run
Choose Run selected automates in the action drop-down on the right
Let’s consider an example that illustrates how to troubleshoot Automates. The goal is to create an Automate that automatically moves Schnell receipts to the Groceries folder.
For this purpose, the following Automate was created:
The Match field contains a single lowercase keyword, “schnell”.
The Is case sensitive field is checked.
To begin with, the two receipts shown in Figure 7 are uploaded to the Inbox.
Automates are triggered for all incoming documents, even if they are uploaded manually to the Inbox folder. However, if you cut and paste documents from another folder into the Inbox, automates won’t run.
To make sure that the automates ran, check the latest entries in UI Logs. In UI Logs you will also see the actual extracted text that the automates were compared with. With the Automate and receipts from Figure 7, the uploaded receipts remain in the Inbox. The reason is the checked Is case sensitive attribute: in the extracted text, “schnell” appears in all uppercase, while our keyword is lowercase.
Let’s try again. Use the following steps:
Uncheck the Is case sensitive attribute
Save the changes to the schnell receipts automate
Select the schnell receipts automate
Run the Automate again using the action drop-down menu
You will notice that one of the two receipts has indeed moved from the Inbox folder to the “Groceries” folder. All three tags have also been applied to it, as depicted in Figure 8.
Why didn’t the other receipt match? Let’s take a closer look at the UI Logs. Open the last UI Log entry, which starts with Running automates for document brother_004026.pdf text:
As you can see in Figure 9 above, the OCR engine got confused and extracted slightly wrong text. To account for this error, add the keyword “schneil” to the Match field of the Automate:
The Matching Algorithm is Any, which means that the Automate matches if any of the listed keywords matches. After saving and re-running the automate, the second receipt is successfully moved to the Groceries folder and has all three tags applied, as you can see in the picture below:
Watch the following screencast to see this troubleshooting use case in action:
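The three troubleshooting rounds described in this section can also be replayed in a few lines of Python. This is only a sketch: the function is a hypothetical stand-in for the Any algorithm using plain substring search, and the receipt texts are shortened stand-ins for the real OCR output, which is much longer.

```python
def any_keyword_matches(match: str, text: str, is_case_sensitive: bool) -> bool:
    """'Any' algorithm: True if at least one space-separated keyword occurs."""
    if not is_case_sensitive:
        match, text = match.casefold(), text.casefold()
    return any(keyword in text for keyword in match.split())


# Stand-ins for the extracted text of the two receipts.
receipts = {"receipt 1": "SCHNELL", "receipt 2": "SCHNEIL"}

rounds = [
    ("schnell", True),           # first attempt: Is case sensitive checked
    ("schnell", False),          # second attempt: Is case sensitive unchecked
    ("schnell schneil", False),  # third attempt: OCR misreading added
]

for match, is_case_sensitive in rounds:
    moved = [name for name, text in receipts.items()
             if any_keyword_matches(match, text, is_case_sensitive)]
    print(repr(match), is_case_sensitive, moved)
```

The first round moves nothing, the second moves only the correctly recognized receipt, and the third moves both.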