Version bump: 2.5.0

pep8
Rework user hack for "login-free" sessions #394
19 changed files with 329 additions and 124 deletions
--- a/.travis.yml
+++ b/.travis.yml
@@ -2,7 +2,7 @@ language: python
 before_install:
 - sudo apt-get update -qq
- sudo apt-get install -qq libpoppler-cpp-dev unpaper tesseract-ocr tesseract-ocr-eng
+- sudo apt-get install -qq libpoppler-cpp-dev unpaper tesseract-ocr tesseract-ocr-eng tesseract-ocr-cat
 sudo: false
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,4 +1,4 @@
-FROM alpine:3.7
+FROM alpine:3.8
 LABEL maintainer="The Paperless Project https://github.com/danielquinn/paperless" \
      contributors="Guy Addadi <addadi@gmail.com>, Pit Kleyersburg <pitkley@googlemail.com>, \
@@ -12,11 +12,10 @@ COPY scripts/docker-entrypoint.sh /sbin/docker-entrypoint.sh
 ENV PAPERLESS_EXPORT_DIR=/export \
    PAPERLESS_CONSUMPTION_DIR=/consume
-# Install dependencies
-RUN apk --no-cache --update add \
-        python3 gnupg libmagic bash shadow curl \
-        sudo poppler tesseract-ocr imagemagick ghostscript unpaper && \
-    apk --no-cache add --virtual .build-dependencies \
+
+RUN apk update --no-cache && apk add python3 gnupg libmagic bash shadow curl \
+        sudo poppler tesseract-ocr imagemagick ghostscript unpaper optipng && \
+    apk add --virtual .build-dependencies \
        python3-dev poppler-dev gcc g++ musl-dev zlib-dev jpeg-dev && \
 # Install python dependencies
    python3 -m ensurepip && \
--- a/docs/changelog.rst
+++ b/docs/changelog.rst
@@ -1,6 +1,49 @@
 Changelog
 #########
 2.5.0
 =====
 * **New dependency**: Paperless now optimises thumbnail generation with
  `optipng`_, so you'll need to install that somewhere in your PATH or declare
  its location in ``PAPERLESS_OPTIPNG_BINARY``.  The Docker image has already
  been updated on the Docker Hub, so you just need to pull the latest one from
  there if you're a Docker user.
 * "Login free" instances of Paperless were breaking whenever you tried to edit
  objects in the admin: adding/deleting tags or correspondents, or even fixing
  spelling.  This was due to the "user hack" we were applying to sessions that
  weren't using a login, as that hack user didn't have a valid id.  The fix was
  to attribute the first user id in the system to this hack user.  `#394`_
 * A problem in how we handle slug values on Tags and Correspondents required a
  few changes to how we handle this field `#393`_:
  1. Slugs are no longer editable.  They're derived from the name of the tag or
     correspondent at save time, so if you wanna change the slug, you have to
     change the name, and even then you're restricted to the rules of the
     ``slugify()`` function.  The slug value is still visible in the admin
     though.
  2. I've added a migration to go over all existing tags & correspondents and
     rewrite the ``.slug`` values to ones conforming to the ``slugify()``
     rules.
  3. The consumption process now uses the same rules as ``.save()`` in
     determining a slug and using that to check for an existing
     tag/correspondent.
 * An annoying bug in the date capture code was causing some bogus dates to be
  attached to documents, which in turn busted the UI.  Thanks to `Andrew Peng`_
  for reporting this. `#414`_.
 * A bug in the Dockerfile meant that Tesseract language files weren't being
  installed correctly.  `euri10`_ was quick to provide a fix: `#406`_, `#413`_.
 * Document consumption is now wrapped in a transaction as per an old ticket
  `#262`_.
 * The ``get_date()`` functionality of the parsers has been consolidated onto
  the ``DocumentParser`` class since much of that code was redundant anyway.
 2.4.0
 =====
@@ -525,6 +568,8 @@ bulk of the work on this big change.
 .. _ahyear: https://github.com/ahyear
 .. _jonaswinkler: https://github.com/jonaswinkler
 .. _thepill: https://github.com/thepill
 .. _Andrew Peng: https://github.com/pengc99
 .. _euri10: https://github.com/euri10
 .. _#20: https://github.com/danielquinn/paperless/issues/20
 .. _#44: https://github.com/danielquinn/paperless/issues/44
@@ -590,6 +635,7 @@ bulk of the work on this big change.
 .. _#322: https://github.com/danielquinn/paperless/pull/322
 .. _#328: https://github.com/danielquinn/paperless/pull/328
 .. _#253: https://github.com/danielquinn/paperless/issues/253
 .. _#262: https://github.com/danielquinn/paperless/issues/262
 .. _#323: https://github.com/danielquinn/paperless/issues/323
 .. _#344: https://github.com/danielquinn/paperless/pull/344
 .. _#351: https://github.com/danielquinn/paperless/pull/351
@@ -606,13 +652,19 @@ bulk of the work on this big change.
 .. _#391: https://github.com/danielquinn/paperless/pull/391
 .. _#390: https://github.com/danielquinn/paperless/pull/390
 .. _#392: https://github.com/danielquinn/paperless/issues/392
 .. _#393: https://github.com/danielquinn/paperless/issues/393
 .. _#395: https://github.com/danielquinn/paperless/pull/395
 .. _#394: https://github.com/danielquinn/paperless/issues/394
 .. _#396: https://github.com/danielquinn/paperless/pull/396
 .. _#399: https://github.com/danielquinn/paperless/pull/399
 .. _#400: https://github.com/danielquinn/paperless/pull/400
 .. _#401: https://github.com/danielquinn/paperless/pull/401
 .. _#405: https://github.com/danielquinn/paperless/pull/405
 .. _#406: https://github.com/danielquinn/paperless/issues/406
 .. _#412: https://github.com/danielquinn/paperless/issues/412
 .. _#413: https://github.com/danielquinn/paperless/pull/413
 .. _#414: https://github.com/danielquinn/paperless/issues/414
 .. _pipenv: https://docs.pipenv.org/
 .. _a new home on Docker Hub: https://hub.docker.com/r/danielquinn/paperless/
 .. _optipng: http://optipng.sourceforge.net/
--- a/paperless.conf.example
+++ b/paperless.conf.example
@@ -213,3 +213,23 @@ PAPERLESS_DEBUG="false"
 # The number of years for which a correspondent will be included in the recent
 # correspondents filter.
 #PAPERLESS_RECENT_CORRESPONDENT_YEARS=1
 ###############################################################################
 ####                     Third-Party Binaries                              ####
 ###############################################################################
 # There are a few external software packages that Paperless expects to find on
 # your system when it starts up.  Unless you've done something creative with
 # their installation, you probably won't need to edit any of these.  However,
 # if you've installed these programs somewhere where simply typing the name of
 # the program doesn't automatically execute it (ie. the program isn't in your
 # $PATH), then you'll need to specify the literal path for that program here.
 # Convert (part of the ImageMagick suite)
 #PAPERLESS_CONVERT_BINARY=/usr/bin/convert
 # Unpaper
 #PAPERLESS_UNPAPER_BINARY=/usr/bin/unpaper
 # Optipng (for optimising thumbnail sizes)
 #PAPERLESS_OPTIPNG_BINARY=/usr/bin/optipng
--- a/src/documents/admin.py
+++ b/src/documents/admin.py
@@ -125,6 +125,8 @@ class CorrespondentAdmin(CommonAdmin):
    list_filter = ("matching_algorithm",)
    list_editable = ("match", "matching_algorithm")
    readonly_fields = ("slug",)
    def get_queryset(self, request):
        qs = super(CorrespondentAdmin, self).get_queryset(request)
        qs = qs.annotate(
@@ -149,6 +151,8 @@ class TagAdmin(CommonAdmin):
    list_filter = ("colour", "matching_algorithm")
    list_editable = ("colour", "match", "matching_algorithm")
    readonly_fields = ("slug",)
    def get_queryset(self, request):
        qs = super(TagAdmin, self).get_queryset(request)
        qs = qs.annotate(document_count=models.Count("documents"))
@@ -167,7 +171,7 @@ class DocumentAdmin(CommonAdmin):
        }
    search_fields = ("correspondent__name", "title", "content", "tags__name")
-    readonly_fields = ("added",)
+    readonly_fields = ("added", "file_type", "storage_type",)
    list_display = ("title", "created", "added", "thumbnail", "correspondent",
                    "tags_")
    list_filter = (
--- a/src/documents/consumer.py
+++ b/src/documents/consumer.py
@@ -1,3 +1,4 @@
+from django.db import transaction
 import datetime
 import hashlib
 import logging
@@ -111,8 +112,11 @@ class Consumer:
                if not self.try_consume_file(file):
                    self._ignore.append((file, mtime))
+    @transaction.atomic
     def try_consume_file(self, file):
-        "Return True if file was consumed"
+        """
+        Return True if file was consumed
+        """
        if not re.match(FileInfo.REGEXES["title"], file):
            return False
@@ -145,7 +149,7 @@ class Consumer:
        parsed_document = parser_class(doc)
        try:
-            thumbnail = parsed_document.get_thumbnail()
+            thumbnail = parsed_document.get_optimised_thumbnail()
            date = parsed_document.get_date()
            document = self._store(
                parsed_document.get_text(),
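
Note on the `@transaction.atomic` change in the hunk above: wrapping `try_consume_file` in a transaction means a failure partway through consumption should leave no half-written rows behind. The sketch below illustrates that all-or-nothing behaviour with the standard library's `sqlite3` module instead of Django's ORM. The `documents` table, `consume()`, `count_documents()` and the `fail_after_insert` flag are hypothetical names made up for this illustration and are not part of Paperless.

```python
import sqlite3


def consume(conn, title, fail_after_insert=False):
    """Insert a row inside a transaction; return True if it was committed."""
    try:
        # Used as a context manager, a sqlite3 connection commits when the
        # block succeeds and rolls back when the block raises.
        with conn:
            conn.execute("INSERT INTO documents (title) VALUES (?)", (title,))
            if fail_after_insert:
                # Hypothetical stand-in for a parser error mid-consumption.
                raise RuntimeError("simulated parser failure")
    except RuntimeError:
        return False
    return True


def count_documents(conn):
    """Return the number of rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (title TEXT)")
conn.commit()

consume(conn, "broken scan", fail_after_insert=True)
rows_after_failure = count_documents(conn)  # the insert was rolled back

consume(conn, "good scan")
rows_after_success = count_documents(conn)
```

In Django the decorator plays the role of the `with conn:` block here: if the decorated method raises, the database work done inside it is rolled back.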
--- a/src/documents/filters.py
+++ b/src/documents/filters.py
@@ -1,4 +1,4 @@
-from django_filters.rest_framework import CharFilter, FilterSet, BooleanFilter, ModelChoiceFilter
+from django_filters.rest_framework import BooleanFilter, FilterSet
 from .models import Correspondent, Document, Tag
--- a/src/documents/migrations/0022_auto_20181007_1420.py
+++ b/src/documents/migrations/0022_auto_20181007_1420.py
@@ -0,0 +1,52 @@
 # Generated by Django 2.0.8 on 2018-10-07 14:20
 from django.db import migrations, models
 from django.utils.text import slugify
 def re_slug_all_the_things(apps, schema_editor):
    """
    Rewrite all slug values to make sure they're actually slugs before we brand
    them as uneditable.
    """
    Tag = apps.get_model("documents", "Tag")
    Correspondent = apps.get_model("documents", "Correspondent")
    for klass in (Tag, Correspondent):
        for instance in klass.objects.all():
            klass.objects.filter(
                pk=instance.pk
            ).update(
                slug=slugify(instance.slug)
            )
 class Migration(migrations.Migration):
    dependencies = [
        ('documents', '0021_document_storage_type'),
    ]
    operations = [
        migrations.AlterModelOptions(
            name='tag',
            options={'ordering': ('name',)},
        ),
        migrations.AlterField(
            model_name='correspondent',
            name='slug',
            field=models.SlugField(blank=True, editable=False),
        ),
        migrations.AlterField(
            model_name='document',
            name='file_type',
            field=models.CharField(choices=[('pdf', 'PDF'), ('png', 'PNG'), ('jpg', 'JPG'), ('gif', 'GIF'), ('tiff', 'TIFF'), ('txt', 'TXT'), ('csv', 'CSV'), ('md', 'MD')], editable=False, max_length=4),
        ),
        migrations.AlterField(
            model_name='tag',
            name='slug',
            field=models.SlugField(blank=True, editable=False),
        ),
        migrations.RunPython(re_slug_all_the_things, migrations.RunPython.noop)
    ]
--- a/src/documents/models.py
+++ b/src/documents/models.py
@@ -11,6 +11,7 @@ from django.conf import settings
 from django.db import models
 from django.template.defaultfilters import slugify
 from django.utils import timezone
 from django.utils.text import slugify
 from fuzzywuzzy import fuzz
 from .managers import LogManager
@@ -37,7 +38,7 @@ class MatchingModel(models.Model):
    )
    name = models.CharField(max_length=128, unique=True)
-    slug = models.SlugField(blank=True)
+    slug = models.SlugField(blank=True, editable=False)
    match = models.CharField(max_length=256, blank=True)
    matching_algorithm = models.PositiveIntegerField(
@@ -147,8 +148,6 @@ class MatchingModel(models.Model):
    def save(self, *args, **kwargs):
        self.match = self.match.lower()
-        if not self.slug:
-            self.slug = slugify(self.name)
+        self.slug = slugify(self.name)
        models.Model.save(self, *args, **kwargs)
@@ -452,7 +451,7 @@ class FileInfo:
        r = []
        for t in tags.split(","):
            r.append(Tag.objects.get_or_create(
-                slug=t.lower(),
+                slug=slugify(t),
                defaults={"name": t}
            )[0])
        return tuple(r)
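
Note on the `slug=slugify(t)` change in the hunk above: looking a tag up by `t.lower()` only matches the stored slug when the name contains nothing that `slugify()` would rewrite. The sketch below shows the mismatch using a standard-library approximation of Django's ASCII `slugify()`. `approx_slugify` is a hypothetical helper written for this note, and its exact behaviour may differ from the real function across Django versions.

```python
import re
import unicodedata


def approx_slugify(value):
    """Approximate Django's ASCII slugify() with the standard library."""
    # Decompose accented characters, then drop anything that is not ASCII.
    value = unicodedata.normalize("NFKD", str(value))
    value = value.encode("ascii", "ignore").decode("ascii")
    # Remove everything except word characters, whitespace and hyphens.
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    # Collapse runs of whitespace and hyphens into single hyphens.
    return re.sub(r"[-\s]+", "-", value)


tag_from_filename = "Bank Statement"

# What save() would store for a tag with this name.
saved_slug = approx_slugify("Bank Statement")

# Old lookup key: lower-casing keeps the space, so it misses the saved slug.
old_lookup = tag_from_filename.lower()

# New lookup key: the same rule as save(), so it finds the saved slug.
new_lookup = approx_slugify(tag_from_filename)
```

With the old key, `get_or_create` would look for a slug that no saved tag can have; with the new key, lookup and save agree.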
--- a/src/documents/parsers.py
+++ b/src/documents/parsers.py
@@ -1,9 +1,13 @@
 import logging
+import os
+import re
 import shutil
+import subprocess
 import tempfile
 import dateparser
 from django.conf import settings
 from django.utils import timezone
 # This regular expression will try to find dates in the document at
 # hand and will match the following formats:
@@ -32,6 +36,8 @@ class DocumentParser:
    """
    SCRATCH = settings.SCRATCH_DIR
    DATE_ORDER = settings.DATE_ORDER
    OPTIPNG = settings.OPTIPNG_BINARY
    def __init__(self, path):
        self.document_path = path
@@ -45,6 +51,19 @@ class DocumentParser:
        """
        raise NotImplementedError()
    def optimise_thumbnail(self, in_path):
        out_path = os.path.join(self.tempdir, "optipng.png")
        args = (self.OPTIPNG, "-o5", in_path, "-out", out_path)
        if not subprocess.Popen(args).wait() == 0:
            raise ParseError("Optipng failed at {}".format(args))
        return out_path
    def get_optimised_thumbnail(self):
        return self.optimise_thumbnail(self.get_thumbnail())
    def get_text(self):
        """
        Returns the text from the document and only the text.
@@ -55,7 +74,52 @@ class DocumentParser:
        """
        Returns the date of the document.
        """
-        raise NotImplementedError()
+
        date = None
        date_string = None
        try:
            text = self.get_text()
        except ParseError:
            return None
        next_year = timezone.now().year + 5  # Arbitrary 5 year future limit
        # Iterate through all regex matches and try to parse the date
        for m in re.finditer(DATE_REGEX, text):
            date_string = m.group(0)
            try:
                date = dateparser.parse(
                    date_string,
                    settings={
                        "DATE_ORDER": self.DATE_ORDER,
                        "PREFER_DAY_OF_MONTH": "first",
                        "RETURN_AS_TIMEZONE_AWARE": True
                    }
                )
            except TypeError:
                # Skip all matches that do not parse to a proper date
                continue
            if date is not None and next_year > date.year > 1900:
                break
            else:
                date = None
        if date is not None:
            self.log(
                "info",
                "Detected document date {} based on string {}".format(
                    date.isoformat(),
                    date_string
                )
            )
        else:
            self.log("info", "Unable to detect date for document")
        return date
    def log(self, level, message):
        getattr(self.logger, level)(message, extra={
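
Note on the consolidated `get_date()` above: the new check `next_year > date.year > 1900` accepts only years strictly after 1900 and strictly before the current year plus five (the variable is called `next_year`, but it holds that five-year limit). The sketch below reproduces the window with plain `datetime.date` values. `is_plausible`, `first_plausible` and the two constants are hypothetical names for this illustration, and the real method works on `dateparser` results rather than pre-built dates.

```python
import datetime

FUTURE_LIMIT_YEARS = 5  # mirrors the "arbitrary 5 year future limit" above
EARLIEST_YEAR = 1900


def is_plausible(date, today):
    """Return True if the date's year lies strictly between both bounds."""
    upper_bound = today.year + FUTURE_LIMIT_YEARS
    return upper_bound > date.year > EARLIEST_YEAR


def first_plausible(candidates, today):
    """Return the first plausible candidate, or None if there is none."""
    for candidate in candidates:
        if candidate is not None and is_plausible(candidate, today):
            return candidate
    return None


today = datetime.date(2018, 10, 7)
candidates = [
    datetime.date(590, 7, 1),     # too far in the past
    datetime.date(2350, 7, 1),    # too far in the future
    datetime.date(2017, 12, 31),  # inside the window
]
chosen = first_plausible(candidates, today)
```

The first two candidates correspond to the "crazy date" strings exercised in `test_date.py`; a document containing only such dates ends up with no detected date.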
--- a/src/paperless/checks.py
+++ b/src/paperless/checks.py
@@ -76,7 +76,12 @@ def binaries_check(app_configs, **kwargs):
    error = "Paperless can't find {}. Without it, consumption is impossible."
    hint = "Either it's not in your ${PATH} or it's not installed."
-    binaries = (settings.CONVERT_BINARY, settings.UNPAPER_BINARY, "tesseract")
+    binaries = (
+        settings.CONVERT_BINARY,
+        settings.OPTIPNG_BINARY,
+        settings.UNPAPER_BINARY,
+        "tesseract"
+    )
    check_messages = []
    for binary in binaries:
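
Note on the `binaries` tuple above: the body of the loop in `binaries_check` is not shown in this hunk, so the following is only a sketch of one way such a check could work, using `shutil.which` from the standard library. `missing_binaries` and the placeholder binary name are hypothetical.

```python
import shutil
import sys


def missing_binaries(binaries):
    """Return the entries that cannot be resolved to an executable."""
    # shutil.which() accepts a bare command name (looked up on PATH) or an
    # explicit path, and returns None when nothing executable is found.
    return [binary for binary in binaries if shutil.which(binary) is None]


configured = (
    sys.executable,  # stands in for a binary that is certainly installed
    "paperless-hypothetical-missing-binary",
)
missing = missing_binaries(configured)
```

Because `OPTIPNG_BINARY` defaults to the bare name `optipng`, a check along these lines would flag installations where the new dependency is neither on the PATH nor configured through `PAPERLESS_OPTIPNG_BINARY`.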
--- a/src/paperless/models.py
+++ b/src/paperless/models.py
@@ -1,15 +1,20 @@
 from django.contrib.auth.models import User as DjangoUser
 class User:
    """
    This is a dummy django User used with our middleware to disable
    login authentication if that is configured in paperless.conf
    """
    is_superuser = True
    is_active = True
    is_staff = True
    is_authenticated = True
-    # Must be -1 to avoid colliding with real user ID's (which start at 1)
-    id = -1
+    @property
+    def id(self):
+        return DjangoUser.objects.order_by("pk").first().pk
    @property
    def pk(self):
@@ -18,8 +23,8 @@ class User:
 """
 NOTE: These are here as a hack instead of being in the User definition
-  above due to the way pycodestyle handles lamdbdas.
-  See https://github.com/PyCQA/pycodestyle/issues/379 for more.
+NOTE: above due to the way pycodestyle handles lambdas.
+NOTE: See https://github.com/PyCQA/pycodestyle/issues/379 for more.
 """
 User.has_module_perms = lambda *_: True
--- a/src/paperless/settings.py
+++ b/src/paperless/settings.py
@@ -247,6 +247,9 @@ CONVERT_TMPDIR = os.getenv("PAPERLESS_CONVERT_TMPDIR")
 CONVERT_MEMORY_LIMIT = os.getenv("PAPERLESS_CONVERT_MEMORY_LIMIT")
 CONVERT_DENSITY = os.getenv("PAPERLESS_CONVERT_DENSITY")
 # OptiPNG
 OPTIPNG_BINARY = os.getenv("PAPERLESS_OPTIPNG_BINARY", "optipng")
 # Unpaper
 UNPAPER_BINARY = os.getenv("PAPERLESS_UNPAPER_BINARY", "unpaper")
--- a/src/paperless/version.py
+++ b/src/paperless/version.py
@@ -1 +1 @@
-__version__ = (2, 3, 0)
+__version__ = (2, 5, 0)
--- a/src/paperless_tesseract/parsers.py
+++ b/src/paperless_tesseract/parsers.py
@@ -4,7 +4,6 @@ import re
 import subprocess
 from multiprocessing.pool import Pool
-import dateparser
 import langdetect
 import pyocr
 from django.conf import settings
@@ -14,7 +13,7 @@ from pyocr.libtesseract.tesseract_raw import \
 from pyocr.tesseract import TesseractError
 import pdftotext
-from documents.parsers import DocumentParser, ParseError, DATE_REGEX
+from documents.parsers import DocumentParser, ParseError
 from .languages import ISO639
@@ -33,7 +32,6 @@ class RasterisedDocumentParser(DocumentParser):
    DENSITY = settings.CONVERT_DENSITY if settings.CONVERT_DENSITY else 300
    THREADS = int(settings.OCR_THREADS) if settings.OCR_THREADS else None
    UNPAPER = settings.UNPAPER_BINARY
-    DATE_ORDER = settings.DATE_ORDER
    DEFAULT_OCR_LANGUAGE = settings.OCR_LANGUAGE
    OCR_ALWAYS = settings.OCR_ALWAYS
@@ -46,15 +44,18 @@ class RasterisedDocumentParser(DocumentParser):
        The thumbnail of a PDF is just a 500px wide image of the first page.
        """
+        out_path = os.path.join(self.tempdir, "convert.png")
        # Run convert to get a decent thumbnail
        run_convert(
            self.CONVERT,
            "-scale", "500x5000",
            "-alpha", "remove",
            "{}[0]".format(self.document_path),
-            os.path.join(self.tempdir, "convert.png")
+            out_path
        )
-        return os.path.join(self.tempdir, "convert.png")
+        return out_path
    def _is_ocred(self):
@@ -202,40 +203,6 @@ class RasterisedDocumentParser(DocumentParser):
        text += self._ocr(imgs[middle + 1:], self.DEFAULT_OCR_LANGUAGE)
        return text
-    def get_date(self):
-        date = None
-        datestring = None
-        try:
-            text = self.get_text()
-        except ParseError as e:
-            return None
-        # Iterate through all regex matches and try to parse the date
-        for m in re.finditer(DATE_REGEX, text):
-            datestring = m.group(0)
-            try:
-                date = dateparser.parse(
-                           datestring,
-                           settings={'DATE_ORDER': self.DATE_ORDER,
-                                     'PREFER_DAY_OF_MONTH': 'first',
-                                     'RETURN_AS_TIMEZONE_AWARE': True})
-            except TypeError:
-                # Skip all matches that do not parse to a proper date
-                continue
-            if date is not None:
-                break
-        if date is not None:
-            self.log("info", "Detected document date " + date.isoformat() +
-                             " based on string " + datestring)
-        else:
-            self.log("info", "Unable to detect date for document")
-        return date
 def run_convert(*args):
--- a/src/paperless_tesseract/tests/test_date.py
+++ b/src/paperless_tesseract/tests/test_date.py
@@ -384,3 +384,42 @@ class TestDate(TestCase):
            document.get_date(),
            datetime.datetime(2017, 12, 31, 0, 0, tzinfo=tz.tzutc())
        )
    @mock.patch(
        "paperless_tesseract.parsers.RasterisedDocumentParser.get_text",
        return_value="01-07-0590 00:00:00"
    )
    @mock.patch(
        "paperless_tesseract.parsers.RasterisedDocumentParser.SCRATCH",
        SCRATCH
    )
    def test_crazy_date_past(self, *args):
        document = RasterisedDocumentParser("/dev/null")
        document.get_text()
        self.assertIsNone(document.get_date())
    @mock.patch(
        "paperless_tesseract.parsers.RasterisedDocumentParser.get_text",
        return_value="01-07-2350 00:00:00"
    )
    @mock.patch(
        "paperless_tesseract.parsers.RasterisedDocumentParser.SCRATCH",
        SCRATCH
    )
    def test_crazy_date_future(self, *args):
        document = RasterisedDocumentParser("/dev/null")
        document.get_text()
        self.assertIsNone(document.get_date())
    @mock.patch(
        "paperless_tesseract.parsers.RasterisedDocumentParser.get_text",
        return_value="01-07-0590 00:00:00"
    )
    @mock.patch(
        "paperless_tesseract.parsers.RasterisedDocumentParser.SCRATCH",
        SCRATCH
    )
    def test_crazy_date_past(self, *args):
        document = RasterisedDocumentParser("/dev/null")
        document.get_text()
        self.assertIsNone(document.get_date())
--- a/src/paperless_text/parsers.py
+++ b/src/paperless_text/parsers.py
@@ -1,11 +1,9 @@
 import os
 import re
 import subprocess
-import dateparser
 from django.conf import settings
-from documents.parsers import DocumentParser, ParseError, DATE_REGEX
+from documents.parsers import DocumentParser, ParseError
 class TextDocumentParser(DocumentParser):
@@ -16,7 +14,6 @@ class TextDocumentParser(DocumentParser):
    CONVERT = settings.CONVERT_BINARY
    THREADS = int(settings.OCR_THREADS) if settings.OCR_THREADS else None
    UNPAPER = settings.UNPAPER_BINARY
-    DATE_ORDER = settings.DATE_ORDER
    DEFAULT_OCR_LANGUAGE = settings.OCR_LANGUAGE
    OCR_ALWAYS = settings.OCR_ALWAYS
@@ -26,7 +23,7 @@ class TextDocumentParser(DocumentParser):
    def get_thumbnail(self):
        """
-        The thumbnail of a txt is just a 500px wide image of the text
+        The thumbnail of a text file is just a 500px wide image of the text
        rendered onto a letter-sized page.
        """
        # The below is heavily cribbed from https://askubuntu.com/a/590951
@@ -35,7 +32,7 @@ class TextDocumentParser(DocumentParser):
        text_color = "black"  # text color
        psize = [500, 647]  # icon size
        n_lines = 50  # number of lines to show
-        output_file = os.path.join(self.tempdir, "convert-txt.png")
+        out_path = os.path.join(self.tempdir, "convert.png")
        temp_bg = os.path.join(self.tempdir, "bg.png")
        temp_txlayer = os.path.join(self.tempdir, "tx.png")
@@ -46,9 +43,13 @@ class TextDocumentParser(DocumentParser):
            work_size = ",".join([str(n - 1) for n in psize])
            r = str(round(psize[0] / 10))
            rounded = ",".join([r, r])
-            run_command(self.CONVERT, "-size ", picsize, ' xc:none -draw ',
-                        '"fill ', bg_color, ' roundrectangle 0,0,',
-                        work_size, ",", rounded, '" ', temp_bg)
+            run_command(
+                self.CONVERT,
+                "-size ", picsize,
+                ' xc:none -draw ',
+                '"fill ', bg_color, ' roundrectangle 0,0,', work_size, ",", rounded, '" ',  # NOQA: E501
+                temp_bg
+            )
        def read_text():
            with open(self.document_path, 'r') as src:
@@ -57,7 +58,8 @@ class TextDocumentParser(DocumentParser):
                return text.replace('"', "'")
        def create_txlayer():
-            run_command(self.CONVERT,
+            run_command(
+                self.CONVERT,
                "-background none",
                "-fill",
                text_color,
@@ -65,14 +67,20 @@ class TextDocumentParser(DocumentParser):
                "-border 4 -bordercolor none",
                "-size ", txsize,
                ' caption:"', read_text(), '" ',
-                        temp_txlayer)
+                temp_txlayer
+            )
        create_txlayer()
        create_bg()
-        run_command(self.CONVERT, temp_bg, temp_txlayer,
-                    "-background None -layers merge ", output_file)
+        run_command(
+            self.CONVERT,
+            temp_bg,
+            temp_txlayer,
+            "-background None -layers merge ",
+            out_path
+        )
-        return output_file
+        return out_path
    def get_text(self):
@@ -84,40 +92,6 @@ class TextDocumentParser(DocumentParser):
        return self._text
-    def get_date(self):
-        date = None
-        datestring = None
-        try:
-            text = self.get_text()
-        except ParseError as e:
-            return None
-        # Iterate through all regex matches and try to parse the date
-        for m in re.finditer(DATE_REGEX, text):
-            datestring = m.group(0)
-            try:
-                date = dateparser.parse(
-                           datestring,
-                           settings={'DATE_ORDER': self.DATE_ORDER,
-                                     'PREFER_DAY_OF_MONTH': 'first',
-                                     'RETURN_AS_TIMEZONE_AWARE': True})
-            except TypeError:
-                # Skip all matches that do not parse to a proper date
-                continue
-            if date is not None:
-                break
-        if date is not None:
-            self.log("info", "Detected document date " + date.isoformat() +
-                             " based on string " + datestring)
-        else:
-            self.log("info", "Unable to detect date for document")
-        return date
 def run_command(*args):
    environment = os.environ.copy()
--- a/src/reminders/migrations/0002_auto_20181007_1420.py
+++ b/src/reminders/migrations/0002_auto_20181007_1420.py
@@ -0,0 +1,19 @@
 # Generated by Django 2.0.8 on 2018-10-07 14:20
 from django.db import migrations, models
 import django.db.models.deletion
 class Migration(migrations.Migration):
    dependencies = [
        ('reminders', '0001_initial'),
    ]
    operations = [
        migrations.AlterField(
            model_name='reminder',
            name='document',
            field=models.ForeignKey(on_delete=django.db.models.deletion.PROTECT, to='documents.Document'),
        ),
    ]
--- a/src/reminders/models.py
+++ b/src/reminders/models.py
@@ -4,7 +4,6 @@ from django.db import models
 class Reminder(models.Model):
    document = models.ForeignKey(
-        "documents.Document", on_delete=models.PROTECT
-    )
+        "documents.Document", on_delete=models.PROTECT)
    date = models.DateTimeField()
    note = models.TextField(blank=True)
Author	SHA1	Message	Date
Daniel Quinn	2ef2bf873e	Version bump: 2.5.0	2018-10-07 16:30:36 +01:00
Daniel Quinn	0bb7d27269	pep8	2018-10-07 16:30:02 +01:00
Daniel Quinn	ce5e8b2658	Rework user hack for "login-free" sessions #394	2018-10-07 16:27:41 +01:00
Daniel Quinn	3f572afb8b	Add a little more read-only info for documents	2018-10-07 16:26:05 +01:00
Daniel Quinn	5c3cb1e4ab	Rework how slugs are generated/referenced #393	2018-10-07 16:25:51 +01:00
Daniel Quinn	c7f4bfe4f3	Add migration that should have come in some time ago	2018-10-07 16:23:03 +01:00
Daniel Quinn	65d6599964	Fix formatting	2018-10-07 16:22:52 +01:00
Daniel Quinn	5d32e89c44	Wrap each document consumption in a transaction	2018-10-07 14:56:56 +01:00
Daniel Quinn	750ab5bf85	Use optipng to optimise document thumbnails	2018-10-07 14:56:38 +01:00
Daniel Quinn	2a3f766b93	Consolidate get_date onto the DocumentParser parent class	2018-10-07 14:56:02 +01:00
Daniel Quinn	14bb52b6a4	Wrap document consumption in a transaction #262	2018-10-07 13:12:22 +01:00
Daniel Quinn	b5176d207e	Hopefully fix Travis	2018-10-01 20:40:43 +01:00
Daniel Quinn	e4044d0df9	Update version number & changelog	2018-10-01 20:40:32 +01:00
Daniel Quinn	bacdd51fd7	Merge pull request #413 from euri10/master Fix issue where tesseract langages weren't installed properly	2018-10-01 19:40:04 +00:00
Daniel Quinn	8010d72f18	Tweak the date guesser to not allow dates prior to 1900 (#414 )	2018-10-01 20:03:47 +01:00
euri10	9dd76f1b87	Fix issue where tesseract langages weren't installed properly	2018-09-24 13:30:10 +02:00