# Testing
The goal of this article is to give a general overview of testing and walk through the main tools needed to write quality code and contribute effectively to projects.
## What are tests, and why?

In order to guarantee that your code works as expected for a given environment(s), you have probably already been testing your code intuitively while coding, by running it manually and comparing the results with what you expect. This is indeed the whole idea behind (effective) code testing, except that testing has to:
- Be reproducible and automatic: provide sets of inputs and expected outputs that are matched against the code's execution results, triggered by some orchestrator.
- Cover all the code.
- Consider all the edge cases and exceptions that can occur.
- Be granular enough to respect the single-responsibility principle.
- Check the interaction between components, which is as important as the correctness of the components themselves.
## Unit vs integration tests

- Unit tests are meant to check a single component/functionality of the code. Strictly speaking, a test of code that makes a DB or HTTP call should not be called a unit test, but out of habit people often do.
- Integration tests check the correct interaction between the different components of the application.
## Test runners

There are many Python tools that can run tests and orchestrate a series of test runs: Unittest, Nose, Pytest... The latter is currently the one we use most in our projects, so we assume Pytest is the tool used in the rest of this document.
## Structure

To create a test you need to decide which functionality(ies) you want to test, create the test inputs, execute the code in question, and finally compare the outputs or side effects with the expected results.
By internal convention, the test tree lives in a single top package called `tests`, which contains subpackages for each of the project's applications (e.g. `test_profile` to test `profile`), which in turn contain modules for each app module (e.g. `test_models.py`, `test_api/test_login.py`).
All elements of the test tree should be prefixed with `test_`.
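As a minimal sketch of this layout (the app, module, and function names below are hypothetical), a test for a `slugify` helper in the `profile` app would live in `tests/test_profile/test_utils.py` and could look like:

```python
# tests/test_profile/test_utils.py -- hypothetical test module following
# the test_ prefix convention described above.


def slugify(title):
    # Inline stand-in for the real helper under test; a real test
    # module would import it from the application instead.
    return title.strip().lower().replace(" ", "-")


def test_slugify_replaces_spaces():
    # Prepare the input, run the code, compare the output with expectations.
    assert slugify("  Hello World ") == "hello-world"
```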
## Marks

To mark tests (for categorization or other purposes), Pytest provides the decorator `@pytest.mark.AMARKER` (where `AMARKER` can be anything). To run only one category of tests, you can use `pytest -m AMARKER`. Here are some predefined markers and their roles:
### @pytest.mark.skip and @pytest.mark.skipif

To skip a test (unconditionally, or based on a condition).
### @pytest.mark.xfail

Marks a test that is expected to fail; the whole test suite won't fail because of that test.
### @pytest.mark.parametrize

To run multiple variants of a test with different arguments or inputs.
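A small sketch combining these markers (the tested expressions are placeholders chosen for the example):

```python
import pytest


@pytest.mark.skip(reason="feature not implemented yet")
def test_upcoming_feature():
    assert False  # never runs under pytest thanks to the skip mark


@pytest.mark.xfail(reason="known float-representation quirk")
def test_known_rounding_quirk():
    assert round(2.675, 2) == 2.68  # actually 2.67: an expected failure


@pytest.mark.parametrize("raw,expected", [("1", 1), ("10", 10), ("-3", -3)])
def test_int_parsing(raw, expected):
    # Pytest generates one test per (raw, expected) tuple.
    assert int(raw) == expected
```

A custom mark such as `@pytest.mark.slow` on any of these tests could then be selected with `pytest -m slow`.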
## Conftest.py

`conftest.py` files are Pytest configuration files that are loaded within the test directory structure (a directory and all of its sub-directories). They are used to:
- Load fixtures (see the Fixtures section below). [1](#fn1)
- Load plugins.
- Specify hooks such as setup and teardown methods. [2](#fn2)
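A sketch of what such a `conftest.py` could contain (the fixture and hook bodies are illustrative only):

```python
# conftest.py -- automatically discovered by Pytest for this directory
# and all of its sub-directories.
import pytest


@pytest.fixture
def sample_user():
    # Shared test data, available to every test under this conftest.
    return {"username": "alice", "is_active": True}


def pytest_runtest_setup(item):
    # Standard Pytest hook, called before each test item runs;
    # a natural place for common setup logic.
    print(f"setting up {item.name}")
```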
## Fixtures

Pytest fixtures are a good (if not the best) solution to avoid repeating chunks of test code across the project, by providing data or state setup/teardown for tests.
Fixtures are defined using the `@pytest.fixture` decorator, and can have one of the following scopes:
- function: the default scope, the fixture is destroyed at the end of the test.
- class: the fixture is destroyed during teardown of the last test in the class.
- module: the fixture is destroyed during teardown of the last test in the module.
- package: the fixture is destroyed during teardown of the last test in the package.
- session: the fixture is destroyed at the end of the test session.[3](#fn3)
`conftest.py` is commonly used to define fixtures. However, defining fixtures in the root `conftest.py` might slow down testing if those fixtures are not used by all tests.
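For illustration, a module-scoped fixture with setup and teardown could look like this (the "connection" here is a plain dict standing in for a real resource such as a DB client):

```python
import pytest


def open_connection():
    # Hypothetical stand-in for a real resource factory (DB, API client...).
    return {"open": True}


@pytest.fixture(scope="module")
def db_connection():
    conn = open_connection()  # setup: runs once per test module
    yield conn                # the value injected into requesting tests
    conn["open"] = False      # teardown: runs after the module's last test


def test_connection_is_open(db_connection):
    # Pytest injects the fixture by matching the argument name.
    assert db_connection["open"]
```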
## Parametrization

(See the @pytest.mark.parametrize sub-section above.)
## Parallelization

Tests run sequentially by default. To run them in parallel, a plugin comes to the rescue: `pytest-xdist`. You can set the number of workers with `pytest -n 4` (4 workers as an example), or let Pytest pick the number automatically with `pytest -n auto`, which decides based on the number of CPU cores/threads the machine has.
## Mock/patch

Mocking objects consists of imitating real objects within a testing environment, in order to have more control over the tested code and make tests easier to write.
Cases where you might need mocking/patching:
- Network calls: instead of making a real call, we can easily mock it with the result we expect.
- Checking the number of times a function is called and the args/kwargs used in those calls.
- More generally: enforcing a return value or side effect (behavior) of given code. [4](#fn4)
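A minimal sketch using the standard library's `unittest.mock` (the client interface and URL are made up for the example):

```python
from unittest import mock


def fetch_status(client):
    # Code under test: depends on an HTTP-like client that we don't
    # want to call for real during tests.
    response = client.get("https://example.com/health")
    return response.status_code


def test_fetch_status_without_network():
    fake_client = mock.Mock()
    # Enforce the return value of the (fake) network call.
    fake_client.get.return_value = mock.Mock(status_code=200)
    assert fetch_status(fake_client) == 200
    # Check the call count and the args used in the call.
    fake_client.get.assert_called_once_with("https://example.com/health")
```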
## Test exceptions

In order to write assertions about raised exceptions, you can use `pytest.raises()` as a context manager. [5](#fn5)
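For example (the validation function is hypothetical):

```python
import pytest


def parse_age(raw):
    # Hypothetical code under test that rejects negative values.
    age = int(raw)
    if age < 0:
        raise ValueError("age cannot be negative")
    return age


def test_negative_age_is_rejected():
    # The test fails if no ValueError is raised inside the block;
    # `match` additionally checks the exception message.
    with pytest.raises(ValueError, match="negative"):
        parse_age("-1")
```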
## Performance testing

During tests, performance (degradation) can be checked using the `timeit` module from the standard library, but there are fancier tools too, like `pytest-benchmark`. [6](#fn6)
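A rough sketch of a `timeit`-based check (the time budget is arbitrary and would need tuning per project and CI machine; `pytest-benchmark` gives far more stable measurements):

```python
import timeit


def test_squares_within_time_budget():
    # Time 100 runs of a small snippet; fail only on a dramatic regression.
    elapsed = timeit.timeit("[i * i for i in range(1000)]", number=100)
    assert elapsed < 5.0  # generous budget, in seconds
```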
## Tests with HTTP calls

You can record HTTP requests and replay them in tests by using VCR.py [7](#fn7) through the pytest-vcr plugin [8](#fn8).
## Code quality/security testing

To find security vulnerabilities in your code, you can use a security scanner like Bandit [9](#fn9), which checks for common vulnerabilities. Some of our projects rely on SonarQube [10](#fn10), which performs code quality checks alongside security checks.
## Test coverage

To produce coverage reports reflecting how much of your code is covered by tests, you can use pytest-cov by running `pytest --cov`. [11](#fn11)
## Test-driven coding

Test-driven development (TDD) is a software development process relying on software requirements being converted to test cases before the software is fully developed... [12](#fn12)
We don't usually go that far for whole projects, but for bug fixing, creating a test that demonstrates the issue and then coding a solution so that the test passes might be one of the best (and most reliable) approaches. It also helps when explaining the issue to other developers, by showing them the problem concretely (complementary to a verbal description, and sometimes even better).
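A sketch of that bug-fixing workflow (the function and the bug are invented for illustration): first a regression test is written that fails against the buggy code, then the code is fixed until the test passes.

```python
def apply_discount(price, code):
    # Fixed version: the hypothetical buggy original called code.upper()
    # unconditionally and crashed with AttributeError when code was None.
    if not code:
        return price
    return price * 0.9 if code.upper() == "SAVE10" else price


def test_missing_code_keeps_price():
    # Written first, to demonstrate the reported crash; once the fix
    # lands, it stays in the suite and guards against regressions.
    assert apply_discount(100, None) == 100
    assert apply_discount(100, "") == 100
```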
## Linters

Linting can be considered a kind of testing, since linters analyze code to detect defects such as code errors and code that doesn't respect Python coding conventions like PEP 8 [13](#fn13) and PEP 257 [14](#fn14). There are many linters (both logical and stylistic): Pylint, PyFlakes, Mypy, Black, Isort... In most of our projects, we use a pre-commit configuration [15](#fn15) to run linting; here is an example:
```yaml
default_language_version:
  python: python3.8
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.2.3
    hooks:
      - id: check-merge-conflict
      - id: check-added-large-files
      - id: check-ast
      - id: check-symlinks
      - id: check-yaml
      - id: trailing-whitespace
      - id: check-json
      - id: debug-statements
      - id: pretty-format-json
        args: ["--autofix", "--allow-missing-credentials"]
        exclude: Pipfile.lock
  - repo: https://github.com/asottile/seed-isort-config
    rev: v2.2.0
    hooks:
      - id: seed-isort-config
  - repo: https://github.com/PyCQA/isort
    rev: 5.6.4
    hooks:
      - id: isort
        args: ["--profile", "black"]
  - repo: https://gitlab.com/pycqa/flake8
    rev: "8f9b4931b9a28896fb43edccb23016a7540f5b82"
    hooks:
      - id: flake8
        additional_dependencies: [flake8-print]
        files: '\.py$'
        args:
          - --select=F401,F403,F406,F821,T001,T003
  - repo: https://github.com/humitos/mirrors-autoflake
    rev: v1.3
    hooks:
      - id: autoflake
        files: '\.py$'
        exclude: '^\..*'
        args: ["--in-place", "--remove-all-unused-imports"]
  - repo: https://github.com/psf/black
    rev: 19.10b0
    hooks:
      - id: black
        args: ["--target-version", "py37"]
```
## Frequent scenarios/issues you may encounter

### Mocking the datetime module

You can use Freezegun [16](#fn16) to mock `datetime.datetime.now()`, `datetime.datetime.utcnow()`, `datetime.date.today()`, `time.time()`, `time.localtime()`, `time.gmtime()`, and `time.strftime()`, and return the time that you froze:
```python
import datetime

from freezegun import freeze_time


@freeze_time("2021-01-14")
def test():
    assert datetime.datetime.now() == datetime.datetime(2021, 1, 14)
```
### Writing tests is not fun

True, but it guarantees the sanity of your code; this way you won't be surprised by bugs and complaints. It is well-spent time that should be accounted for when planning and estimating tasks.
### I have a complicated test and I don't know where to start

- Try to decompose the code into smaller, more intelligible, and probably reusable components that you can test separately.
- Try to mock/patch any external network calls.
- You don't have to test third-party tools, as long as they have their own (third-party) tests written and passing (as expected).
- Start by writing the easiest tests.
- Check the existing tests; someone might have faced the same issue and created a working test already.
### I can't find why a test is failing

Think of using the Python debugger: insert `import pdb; pdb.set_trace()` at the most suitable line(s) in your code. By doing so, you can print and check the values of variables and side effects in real time, which helps figure out what's wrong. Repeat with different lines until you fix the issue.
### Very slow tests

- Check that the single-responsibility principle is not broken in your code and tests, so you can narrow down the search for the faulty component.
- Check whether the tested code makes any network call (it can be as subtle as a connection to a Celery broker like RabbitMQ or a cache server like Redis), and either mock the call or run the service container when necessary.
### Too much copy-pasted test code

- If it concerns one test: think of using Pytest parametrization.
- Otherwise: think of using Pytest fixtures.
### Tests pass locally but fail in the build/deployment

- Check whether your tests are deterministic; the most frequent cases are:
  - Hardcoded dates make tests fail when they are run at a different time.
  - Comparing unordered lists: the order of elements may change randomly, which breaks the comparison check; in this case try, for example, using sets instead.
- Some inconsistencies can't be predicted in advance, like Django/Flask DB migration conflicts (when someone merges to the master branch first), so make sure to pull the latest code and rebase/merge from the master branch as frequently as possible.
- Linting failure: make sure you have correctly configured pre-commit; normally it should run the linters on your changes before you commit/push.
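The unordered-list case can be sketched like this (the fetch function is a stand-in for code whose result order is not guaranteed, e.g. a DB query without an ORDER BY):

```python
import random


def fetch_tags():
    # Stand-in for code whose result order may vary between runs.
    tags = ["b", "a", "c"]
    random.shuffle(tags)
    return tags


def test_tags_order_independent():
    # A list comparison would fail intermittently; a set comparison
    # only checks membership, not order.
    assert set(fetch_tags()) == {"a", "b", "c"}
```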
## Refs

- [1] https://pytest.org/en/latest/plugins.html#requiring-loading-plugins-in-a-test-module-or-conftest-file
- [2] https://pytest.org/en/latest/reference.html#hook-reference
- [3] https://docs.pytest.org/en/stable/fixture.html
- [4] https://docs.python.org/3/library/unittest.mock.html
- [5] https://docs.pytest.org/en/stable/assert.html#assertions-about-expected-exceptions
- [6] https://pytest-benchmark.readthedocs.io/en/latest/
- [7] https://vcrpy.readthedocs.io/en/latest/
- [8] https://pytest-vcr.readthedocs.io/en/latest/
- [9] https://github.com/PyCQA/bandit/
- [10] https://docs.sonarqube.org/latest/
- [11] https://pytest-cov.readthedocs.io/en/latest/readme.html
- [12] https://en.wikipedia.org/wiki/Test-driven_development
- [13] http://pep8.org/
- [14] https://www.python.org/dev/peps/pep-0257/
- [15] https://pre-commit.com/
- [16] https://pypi.org/project/freezegun/