Speeding up Django unit tests with SQLite, reuse-db and RAMDisk

Recently, Harry wrote a post about how to speed up Django unit-testing by using a persistent in-memory SQLite database.

While Harry uses the default Django testrunner and a Linux machine, I use pytest with pytest-django on a Mac. This changes a few things, and this post will show the differences. So it's very specific to:

  • Django
  • pytest
  • macOS

Create and mount the RAMDisk

/dev/shm is a Linux thing. But Harry already links to a working Stack Overflow solution for doing the same on macOS.

$ hdiutil attach -nomount ram://$((2 * 1024 * SIZE_IN_MB))
/dev/disk2

$ diskutil eraseVolume HFS+ RAMDisk /dev/disk2

This creates an in-memory Volume that can store the persistent SQLite database.
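
The ram:// argument is a count of 512-byte sectors, which is where the factor 2 * 1024 comes from (2048 sectors per megabyte). As a small sketch of that arithmetic (the function name is my own, purely for illustration):

```python
SECTOR_SIZE = 512  # bytes per sector, the unit that the ram:// URL expects


def ram_sectors(size_in_mb):
    """Return the number of 512-byte sectors for a RAMDisk of the given size."""
    return size_in_mb * 1024 * 1024 // SECTOR_SIZE


# A 512 MB RAMDisk would be attached with ram://1048576
print(ram_sectors(512))
```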

Use SQLite

Maybe I'm doing this all wrong, but because I use pytest, I configure the test database in a separate configuration file like this.

# config/settings/testing.py
from .common import *

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': ':memory:',
        'TEST': {}
    }
}

Use it in-memory

What --keepdb is for the default testrunner, --reuse-db is for pytest-django. I usually set this as the default for all test runs in pytest.ini.

# pytest.ini
[pytest]
DJANGO_SETTINGS_MODULE=config.settings.testing
addopts = --reuse-db

To not re-use the database, pytest-django's argument is --create-db. So Harry's

if 'keepdb' in sys.argv:

becomes

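if '--create-db' not in sys.argv:

in my case. Note that sys.argv holds the complete flag, leading dashes included, so the check has to look for '--create-db'. To use the RAMDisk: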

# config/settings/testing.py

[...]

import sys
if '--create-db' not in sys.argv:
    # Keep the test database on the RAMDisk, so that --reuse-db can skip
    # re-creating the db. Even faster!
    DATABASES['default']['TEST']['NAME'] = '/Volumes/RAMDisk/myfunnyproject.test.db.sqlite3'

When there's no RAMDisk

If there is no RAMDisk, for example after a reboot, then the test run fails as soon as it hits the database. Obviously.

>       conn = Database.connect(**conn_params)
E       django.db.utils.OperationalError: unable to open database file

I just add another condition to the if statement to check that the RAMDisk exists. This needs an import os at the top of the file.

if os.path.isdir('/Volumes/RAMDisk') and '--create-db' not in sys.argv:
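
Put together, the decision could be wrapped in a small helper. This is only a sketch: the function name and its isdir parameter are my own inventions (the latter makes the logic easy to try without a mounted RAMDisk), and the database file name is the one used above.

```python
import os

RAMDISK = '/Volumes/RAMDisk'  # mount point of the volume created with diskutil


def ramdisk_db_name(argv, ramdisk=RAMDISK, isdir=os.path.isdir):
    """Return the test database path on the RAMDisk, or None to keep the default.

    None is returned when --create-db was passed or the RAMDisk is not mounted.
    """
    if '--create-db' in argv or not isdir(ramdisk):
        return None
    return '{}/myfunnyproject.test.db.sqlite3'.format(ramdisk)


# In config/settings/testing.py this could be used like so:
#
#     name = ramdisk_db_name(sys.argv)
#     if name:
#         DATABASES['default']['TEST']['NAME'] = name
```

When the helper returns None, the settings stay untouched and the test run falls back to the plain in-memory database.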

Summary

You can check a full test configuration here:

https://github.com/FlowFX/unkenmathe.de/blob/master/src/config/settings/testing.py

Test-driven Python learning

Bugs are nice because I can learn from them. Recently I found a bug that taught me about floats in Python and how not to use them. I found it thanks to an automated test.

The test

In a legal contract I had to print a monetary amount in a format similar to

MXN $11,600.00 (eleven thousand, six hundred pesos 00/100 cents)

My tests check the LaTeX source of the resulting PDF, looking for the correctly formatted string 11,600.00, which represents an amount of 10,000 plus a VAT of 16%, and also for the string eleven thousand, six hundred.

It essentially goes like this:

from decimal import Decimal

from app.models import LeaseAgreement

def test_contract_states_rent_plus_vat():
    """Test correct statement of rent amount with and without IVA (VAT (USt (MWSt)))."""

    contract = LeaseAgreement.build(
        rent_amount=Decimal('10000.00')
    )

    tex = contract.render_tex()

    assert '10,000.00' in tex
    assert '11,600.00' in tex
    assert 'eleven thousand, six hundred' in tex

The first two asserts did not fail, the third one did. My code actually printed

MXN $11,600.00 (eleven thousand, five hundred and ninety-nine pesos 00/100 cents)

The bug

Turns out I had misused Python's Decimal class to calculate the amount plus VAT. Like this:

>>> x = Decimal(10000) * Decimal(1.16)

Printing the number in the desired format works fine.

>>> print('{:,.2f}'.format(x))
11,600.00

But truncating to a whole number, in order to spell out the amount without decimal places, does not.

>>> num2words(int(x))
'eleven thousand, five hundred and ninety-nine'

That is because floats are not exact, and Decimal(1.16) faithfully converts the inexact float instead of the number 1.16.

>>> Decimal(1.16)
Decimal('1.1599999999999999200639422269887290894985198974609375')
>>> int(Decimal(1.16)*10000)
11599

The fix

Working with monetary amounts that round to 2 digits requires care. Thanks to Rami and others, I now know that I should have used e.g.

Decimal('1.16')

and

Decimal('1.16').quantize(Decimal('.01'), rounding=ROUND_HALF_UP)

when appropriate.
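
As a minimal sketch of how the two pieces fit together (the helper and constant names are mine, not from the project's code):

```python
from decimal import Decimal, ROUND_HALF_UP

VAT_FACTOR = Decimal('1.16')  # built from a string, so it is exactly 1.16
CENT = Decimal('0.01')


def with_vat(net_amount):
    """Return net_amount plus 16% VAT, rounded to whole cents."""
    return (net_amount * VAT_FACTOR).quantize(CENT, rounding=ROUND_HALF_UP)


total = with_vat(Decimal('10000.00'))
print('{:,.2f}'.format(total))  # 11,600.00
print(int(total))               # 11600
```

With the factor created from a string, the formatted amount and the truncated integer agree again.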

Test Django with Selenium, pytest and user authentication

When testing a Django app with Selenium, how do you authenticate the user and test pages that require a login?

Of course: Stack Overflow has the answer.

The following is the actual code that I use to make this work with pytest. It requires pytest-django.

pytest fixtures

In pytest everything is contained in neat test fixtures.

Browser

The first fixture provides a browser/webdriver instance with an anonymous user. The default way of using Selenium.

# system_tests/conftest.py

import pytest
from selenium import webdriver

@pytest.fixture(scope='module')
def browser():
    """Provide a selenium webdriver instance."""
    # SetUp
    options = webdriver.ChromeOptions()
    options.add_argument('headless')

    # 'options' replaces the deprecated 'chrome_options' keyword
    browser_ = webdriver.Chrome(options=options)

    yield browser_

    # TearDown
    browser_.quit()

User

To be able to authenticate, I need a user in the database. Using Factory Boy:

# system_tests/conftest.py

from django.contrib.auth.hashers import make_password
from accounts.factories import UserFactory

TESTEMAIL = 'test-user@example.com'
TESTPASSWORD = 'a-super-secret-password'

@pytest.fixture()
def user(db):
    """Add a test user to the database."""
    user_ = UserFactory.create(
        name='I am a test user',
        email=TESTEMAIL,
        password=make_password(TESTPASSWORD),
    )

    return user_

Authenticated browser

To get the authenticated browser, the first two fixtures are required, plus the Django TestClient and LiveServer fixtures which, for pytest, are provided by pytest-django. Using the code from SO:

# system_tests/conftest.py

@pytest.fixture()
def authenticated_browser(browser, client, live_server, user):
    """Return a browser instance with logged-in user session."""
    client.login(email=TESTEMAIL, password=TESTPASSWORD)
    cookie = client.cookies['sessionid']

    browser.get(live_server.url)
    browser.add_cookie({'name': 'sessionid', 'value': cookie.value, 'secure': False, 'path': '/'})
    browser.refresh()

    return browser

The tests

Now the Selenium test can use the authenticated_browser fixture.

# system_tests/test_django.py

def test_django_with_authenticated_user(live_server, authenticated_browser):
    """A Selenium test."""
    browser = authenticated_browser

    # Open the home page
    browser.get(live_server.url)
    

To test logging in and out of the app, I use the unauthenticated browser instance plus the user fixture.

# system_tests/test_django.py

from selenium.webdriver.common.by import By

def test_login_of_anonymous_user(live_server, browser, user):
    # Open the home page
    browser.get(live_server.url)

    # Click the 'login' button
    browser.find_element(By.ID, 'id_link_to_login').click()

Summary

These are the pytest fixtures that I use to test a Django app with an authenticated user.

If there is a better way to do this, please tell me! I want to know.

pytest parameter matrices

A few months ago I explained how I efficiently test Django forms with pytest parameterization. Last week, I learned a new trick from Raphael Pierzina's post about ids for fixtures and parametrize, which is:

You can add multiple parametrization markers to a test function which then create a test parameter matrix.

The list of test cases can thus be written much more clearly. Compare the example code from my previous post:

from django import forms

import pytest


class ExampleForm(forms.Form):
    name = forms.CharField(required=True)
    age = forms.IntegerField(min_value=18)


@pytest.mark.parametrize(
    'name, age, validity',
    [('Hugo', 18, True),
     ('Egon', 17, False),
     ('Balder', None, False),
     ('', 18, False),
     (None, 18, False),
     ])
def test_example_form(name, age, validity):
    form = ExampleForm(data={
        'name': name,
        'age': age,
    })

    assert form.is_valid() is validity

to the same test using multiple parameterization markers:

@pytest.mark.parametrize(
    'name, valid_name',
    [
        ('Hugo', True),
        ('', False),
        (None, False),
    ]
)
@pytest.mark.parametrize(
    'age, valid_age',
    [
        ('18', True),
        ('17', False),
        (None, False),
    ]
)
def test_example_form(name, age, valid_name, valid_age):
    form = ExampleForm(data={
        'name': name,
        'age': age,
    })

    assert form.is_valid() is (valid_name and valid_age)

This tests all 9 possible combinations of the three test cases each for name and age. Maybe that's overkill, but on the other hand I can be sure that I touch all the relevant combinations.

Take note of the last line that checks that the form only validates when both input parameters are valid.

The main advantage I see is legibility. For each form field, I only have to understand the list of test parameters for exactly that field, and not any additional combinations with other fields.
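
To see which cases the two stacked markers generate, the matrix can be spelled out by hand. The following is just an illustration using itertools.product, not how pytest builds it internally, and the variable names are my own:

```python
from itertools import product

names = [('Hugo', True), ('', False), (None, False)]
ages = [('18', True), ('17', False), (None, False)]

# Every name case is combined with every age case, like the stacked markers do.
matrix = [
    (name, age, valid_name and valid_age)
    for (name, valid_name), (age, valid_age) in product(names, ages)
]

for name, age, expected in matrix:
    print(repr(name), repr(age), expected)
```

Of the nine combinations, only ('Hugo', '18') is expected to validate.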

I'm loving it!

Continuous deployment of a Django app from Travis CI to PythonAnywhere

This post describes the configuration of a continuous deployment pipeline that deploys a Django project from GitHub via Travis CI to PythonAnywhere.

All code samples come from a pet project of mine: Unkenmathe (GitHub repository).

Please note that this is not an introduction to Travis CI, PythonAnywhere or Git.

Here are the steps that I take.

1. Deploy Django project

PythonAnywhere's guide for Deploying an existing Django project on PythonAnywhere explains everything needed to set up the web app manually.

For reference, the Unkenmathe code is checked out to

/var/www/sites/unkenmathe.de

and the virtual environment lives at

~/.virtualenvs/unkenmathe.de/

2. Prepare Git push deployment

PythonAnywhere has a comprehensive guide to set up Git push deployments.

My bare repository is located at

~/bare-repos/unkenmathe.git

The post-receive hook looks like this:

#!/bin/bash
# ~/bare-repos/unkenmathe.git/hooks/post-receive

BASE_DIR=/var/www/sites/unkenmathe.de
PYTHON=$HOME/.virtualenvs/unkenmathe.de/bin/python
PIP=$HOME/.virtualenvs/unkenmathe.de/bin/pip
MANAGE=$BASE_DIR/manage.py

echo "=== configure Django ==="
export DJANGO_SETTINGS_MODULE=config.settings.production

echo "=== create base directory ==="
mkdir -p $BASE_DIR

echo "=== checkout new code ==="
GIT_WORK_TREE=$BASE_DIR git checkout -f

echo "=== install dependencies in virtual environment ==="
$PIP install -q -r $BASE_DIR/requirements/production.txt

echo "=== collect static files ==="
$PYTHON $MANAGE collectstatic --no-input

echo "=== update database ==="
$PYTHON $MANAGE migrate --no-input

3. Custom deployment with Travis CI

I set up the repository in Travis CI for automatic builds on pull requests and branch pushes. In order to deploy to PythonAnywhere, I use Travis's Custom deployment.

All Travis related files live in the .travis subdirectory of the Django project. This is of course completely arbitrary.

~ $ cd ~/code/unkenmathe/
unkenmathe $ mkdir .travis
unkenmathe $ cd .travis

Create SSH keys

git push uses SSH, so I need a pair of SSH keys.

.travis $ ssh-keygen -t rsa -b 4096 -C 'hallo@example.com' -f deploy_key

Copy the public key to the PythonAnywhere account (see PythonAnywhere: SSH access).

.travis $ ssh-copy-id -i deploy_key flowfx@ssh.pythonanywhere.com

Encrypt SSH key and add it to the repository

Travis offers a tool to encrypt files, which makes it possible to add the SSH private key to the Git repository. See Encrypting files for a complete how-to.

First, I encrypt the deploy key,

.travis $ travis login
.travis $ travis encrypt-file deploy_key --add

then add it to the Git repository.

.travis $ git add deploy_key.enc

Last, I make sure the decrypted key is never pushed to the public GitHub repository:

unkenmathe $ echo 'deploy_key' >> .gitignore

Configure Travis CI

A simplified .travis.yml configuration file (here the one used for Unkenmathe) looks like this. The before_install part is added automatically by the travis encrypt-file deploy_key --add command. The ssh_known_hosts line is also required for push deployment with Git/SSH.

Hopefully, the rest is documented sufficiently by the comments.

# .travis.yml
language: python
cache: pip
python:
- 3.6
addons:
  # add PythonAnywhere server to known hosts
  ssh_known_hosts: ssh.pythonanywhere.com
before_install:
  # decrypt ssh private key
  - openssl aes-256-cbc -K $encrypted_xxxxxxxxxxxx_key -iv $encrypted_xxxxxxxxxxxx_iv -in .travis/deploy_key.enc -out deploy_key -d
install: pip install -r requirements/testing.txt
script:
  # run test suite
  - pytest --cov
after_success:
  # start ssh agent and add private key
  - eval "$(ssh-agent -s)"
  - chmod 600 deploy_key
  - ssh-add deploy_key
  # configure remote repository
  - git remote add pythonanywhere flowfx@ssh.pythonanywhere.com:/home/flowfx/bare-repos/unkenmathe.git
  # push master branch to production 
  - git push -f pythonanywhere master
  # reload PythonAnywhere web app via the API
  - python .travis/reload-webapp.py
after_deploy:
  # update coveralls.io
  - coveralls
notifications:
  # spare me from email notifications
  email: false

Reload web app

The after_success step includes a call to .travis/reload-webapp.py, which is a Python script that reloads the web app via the PythonAnywhere API. This is more or less copied directly from the documentation.

# .travis/reload-webapp.py
"""Script to reload the web app via the PythonAnywhere API.

"""
import os
import requests

my_domain = os.environ['PYTHONANYWHERE_DOMAIN']
username = os.environ['PYTHONANYWHERE_USERNAME']
token = os.environ['PYTHONANYWHERE_API_TOKEN']

response = requests.post(
    'https://www.pythonanywhere.com/api/v0/user/{username}/webapps/{domain}/reload/'.format(
        username=username, domain=my_domain
    ),
    headers={'Authorization': 'Token {token}'.format(token=token)}
)
if response.status_code == 200:
    print('All OK')
else:
    print('Got unexpected status code {}: {!r}'.format(response.status_code, response.content))

Set environment variables

To make all this actually work, you need to set some environment variables in the Travis project settings. Namely PYTHONANYWHERE_DOMAIN, PYTHONANYWHERE_USERNAME and PYTHONANYWHERE_API_TOKEN.

Also, don't forget to set DJANGO_SECRET_KEY!

Summary

These are the resources you need:

PythonAnywhere

Travis CI

Future

I need to look into Travis's Script deployment which looks like a much cleaner way to run the deployment commands.

Comment!

If you find the one error that I missed, please tell me about it!

Updates

  • 5/9/2017: used Unkenmathe as example project, formatting.

Use Django in-memory file storage with pytest

In my current project, I create PDF files from Jinja2/LaTeX templates. In each test run, several PDFs are created and saved to disk. How do you test this without filling up the hard drive?

I use an in-memory data storage. For Django there is a package that makes it really easy: dj-inmemorystorage.

A non-persistent in-memory data storage backend for Django.

Using pytest fixtures:

# tests/conftest.py
import pytest
import inmemorystorage

from django.conf import settings

@pytest.fixture
def in_memory():
    settings.DEFAULT_FILE_STORAGE = 'inmemorystorage.InMemoryStorage'

That's it. When using this in_memory fixture in a test function, the files will never be written on disk.

Update 5/9/2017

It's actually much easier than this. I now configure the in-memory file storage directly in the Django configuration file that pytest uses.

# config/settings/testing.py
"""Django configuration for testing and CI environments."""
from .common import *

# Use in-memory file storage
DEFAULT_FILE_STORAGE = 'inmemorystorage.InMemoryStorage'

New podcast episodes, July 2017

After many months, I published two new podcast episodes this week.

Mexico

Back in January I recorded the second episode of Tacos und Limetten. It turned into two and a half exciting hours about the history and society of Mexico. There are travel tips, too!

C3S

And the C3S podcast is back as well. In the new episode, m.eik tells me about the last general assembly and the turmoil of German legislation on collecting societies.

Install an extra LaTeX font package on PythonAnywhere

This week I installed a LaTeX font package in my PythonAnywhere account. The TL;DR: just install it manually.


I've been using LaTeX on and off for more than 10 years now. But I never dove into the depths of the system. Typesetting mathematical expressions was fun enough for me.

So I rely on Google a good bit. But that's fine, as there are always new things to discover. Plus: the PythonAnywhere support is superb! I had a very helpful email exchange with Giles. Thanks!

Using the TeXLive distribution on my Mac, and on the Ubuntu server that runs PythonAnywhere, the way to install packages nowadays is to use the tlmgr command, which I can only assume to be an abbreviation for TeX Live manager.

$ tlmgr install roboto

Well, that didn't work, because I am neither root nor a sudo user. Turns out, there is a tlmgr User Mode which, after initializing a local user tree

$ tlmgr init-usertree

allows me to install additional LaTeX packages into my home directory under ~/texmf. So:

$ tlmgr --usermode install roboto

should have done it. Unfortunately I got the following error message.

$ tlmgr --usermode install roboto
/usr/bin/tlmgr: could not find a usable xzdec.
/usr/bin/tlmgr: Please install xzdec and try again.

Thanks to Giles, it was easy to install the missing xz package.

$ wget https://tukaani.org/xz/xz-5.2.3.tar.gz
$ tar xf xz-5.2.3.tar.gz 
$ cd xz-5.2.3/
$ ./configure --prefix=$HOME/.local
$ make
$ make install

That actually worked. But then tlmgr spit out this:

Unknown directive ...containerchecksum
06c8c1fff8b025f6f55f8629af6e41a6dd695e13bbdfe8b78b678e9cb0cfa509826355f4ece20d8a99b49bcee3c5931b8d766f0fc3dae0d6a645303d487600b0..., please fix it! at /usr/share/texlive/tlpkg/TeXLive/TLPOBJ.pm line 210, <$retfh> line 5761.

This is a clear case for Google, and Google didn't disappoint. The installed version of TeX Live is from 2013 and doesn't work with the current package repositories.

$ tlmgr --version
(running on Debian, switching to user mode!)
tlmgr revision 32912 (2014-02-08 00:49:53 +0100)
tlmgr using installation: /usr/share/texlive
TeX Live (http://tug.org/texlive) version 2013

After setting the repository to an old archived one,

$ tlmgr option repository ftp://tug.org/historic/systems/texlive/2013/tlnet-final

the TeX Live installation was happy.

But, it turns out, the 2013 repository doesn't even include the roboto font package. So…

TL;DR

… in the end, I installed the font by hand, following the instructions here: http://www.ctan.org/tex-archive/fonts/roboto/.

$ wget http://mirror.ctan.org/install/fonts/roboto.tds.zip
$ cd ~/texmf
$ unzip ~/roboto.tds.zip
$ texhash
$ updmap --enable Map=roboto.map

And now I have the roboto font available for pdflatex on PythonAnywhere!