Testing your Python package as installed

For small projects and projects you only intend to run on your local computer, it is often convenient that Python searches the local directory for modules to import, but once your project grows to the point where you want to install it, this behavior can lead to misleading test results. The problem is that it makes it very easy to accidentally test your package as it exists in the repository rather than as it will be installed on your users' systems. This issue has been addressed in some detail in Hynek Schlawack's "Testing and Packaging" post, and Hynek's solution (using a src/ directory) is the one that I recommend; in this post, I will briefly provide two other options on how to configure your system if you are, for whatever reason, unwilling to switch to a src/ layout. [1]

Missing submodules

Before we get into the solutions, it is worth explaining what problems this can cause. One of the most common errors that will be missed by testing against your local directory is that the build system will fail to install one of your submodules [2]. This happens because setuptools expects you to list every package you want to install and it will not recursively include nested submodules. [3] So, for example, if you have a project laid out like this:

$ tree .
.
├── mypkg
│   ├── __init__.py
│   └── subpkg
│       └── __init__.py
├── pyproject.toml
└── setup.cfg

And your packages are listed in setup.cfg like this:

[options]
packages =
    mypkg

then, because module installation is not recursive, this will not install the mypkg.subpkg submodule. If you import mypkg.subpkg when running a script from the repository root, you will not get an error, but if you run it from any other directory (as your users will), you'll get an ImportError!

$ pip install .
...
Successfully installed mypkg-0.0.1

$ python -c "from mypkg import subpkg; print('Success!')"
Success!

$ cd ..

$ python -c "from mypkg import subpkg; print('Success!')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: cannot import name 'subpkg' from 'mypkg' [Truncated]

We can fix this either by adding mypkg.subpkg to the setup.cfg or by using find_packages (called find: in a setup.cfg file), a function in setuptools for recursively including subpackages. [4]
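
To make the non-recursive behavior concrete, here is a minimal sketch that uses only the standard library. It is not how setuptools is implemented: discover_packages and missing_packages are hypothetical helper names, and the walk simply assumes that any directory containing an __init__.py is a package, which is roughly what a recursive finder would report for the layout above.

```python
import os
import tempfile
from pathlib import Path


def discover_packages(root, top_level):
    """Return dotted names of `top_level` and every nested package under it.

    Hypothetical helper: a directory counts as a package only if it
    contains an __init__.py, and we do not descend into non-packages.
    """
    found = []
    base = Path(root)
    for dirpath, dirnames, filenames in os.walk(base / top_level):
        if "__init__.py" not in filenames:
            dirnames[:] = []  # not a package: do not descend any further
            continue
        dirnames.sort()  # keep the traversal order deterministic
        found.append(".".join(Path(dirpath).relative_to(base).parts))
    return found


def missing_packages(root, top_level, listed):
    """Packages present on disk but absent from an explicit `packages` list."""
    return [
        name for name in discover_packages(root, top_level) if name not in listed
    ]


def demo():
    # Recreate the example layout in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        subpkg = Path(tmp) / "mypkg" / "subpkg"
        subpkg.mkdir(parents=True)
        (Path(tmp) / "mypkg" / "__init__.py").touch()
        (subpkg / "__init__.py").touch()
        # Mirrors the setup.cfg above, which lists only `mypkg`.
        return missing_packages(tmp, "mypkg", listed=["mypkg"])


if __name__ == "__main__":
    print(demo())
```

For the example layout, the only name reported as missing is mypkg.subpkg, which is exactly the package that the explicit list in setup.cfg leaves out.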

Unfortunately, most Python test runners are executed from the repository root, and it's fairly easy for them to default to importing directly from the repository and not from an installed package.
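
One cheap way to notice this is to check, from inside the test suite, where the package was actually imported from. The sketch below uses only the standard library; imported_from is a hypothetical helper name, and passing it your package and your repository root is an assumption about how you would wire it into your own tests.

```python
import json
from pathlib import Path


def imported_from(module, directory):
    """Return True if `module` was loaded from somewhere inside `directory`."""
    module_file = getattr(module, "__file__", None)
    if module_file is None:
        # Built-in modules and namespace packages have no single file.
        return False
    module_path = Path(module_file).resolve()
    return Path(directory).resolve() in module_path.parents


if __name__ == "__main__":
    # Illustration with a standard library module; in a real test suite you
    # would pass your own package and the repository root instead.
    here = Path.cwd()
    print(f"json imported from {here}? {imported_from(json, here)}")
```

A test along the lines of "assert not imported_from(mypkg, repo_root)" would then fail whenever the repository copy shadows the installed one. Note that this check is only meaningful for regular installs; an editable install imports from the repository by design.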

Using a src/ layout

The ideal solution is to structure your project such that it's not possible to import your module from the repository root by placing the code in a dedicated src/ directory. Between Hynek's aforementioned post and the setuptools documentation, it is fairly well-documented how to do this, so despite the fact that this is the solution I would recommend, I won't go into details here.

Using pytest

pytest can be invoked either as a module (using python -m pytest) or as a command line script (using pytest), and these actually treat the local directory differently. For module invocations in general, the local path will always be on sys.path, whereas invocations of pytest will not add the current directory to your Python path. As such, only python -m pytest is in danger of testing against your local repository.
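
The difference between the two invocation styles can be observed directly. The sketch below does not run pytest itself; it uses a hypothetical stand-in, show_path.py, that only prints sys.path, and runs it once as a script and once with -m from a scratch directory that plays the role of the repository root. The helper name cwd_on_sys_path and the directory names are assumptions made for illustration.

```python
import json
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a test runner: it only reports sys.path.
SHOW_PATH = "import json, sys\nprint(json.dumps(sys.path))\n"


def cwd_on_sys_path(as_module):
    """Run the stand-in from a scratch directory; report if cwd is on sys.path."""
    with tempfile.TemporaryDirectory() as tmp:
        tool_dir = Path(tmp, "tools")  # where the "runner" lives
        work_dir = Path(tmp, "work")   # plays the role of the repository root
        tool_dir.mkdir()
        work_dir.mkdir()
        script = tool_dir / "show_path.py"
        script.write_text(SHOW_PATH)

        env = dict(os.environ)
        # On Python 3.11+, PYTHONSAFEPATH would suppress the cwd entry.
        env.pop("PYTHONSAFEPATH", None)
        if as_module:
            # Make the module importable without relying on the cwd.
            env["PYTHONPATH"] = str(tool_dir)
            cmd = [sys.executable, "-m", "show_path"]
        else:
            env.pop("PYTHONPATH", None)
            cmd = [sys.executable, str(script)]

        result = subprocess.run(
            cmd, cwd=work_dir, env=env, capture_output=True, text=True, check=True
        )
        # An empty entry means "the current directory" of the child process.
        entries = {
            os.path.realpath(entry or work_dir)
            for entry in json.loads(result.stdout)
        }
        return os.path.realpath(work_dir) in entries


if __name__ == "__main__":
    print("python -m show_path :", cwd_on_sys_path(as_module=True))
    print("python show_path.py :", cwd_on_sys_path(as_module=False))
```

Only the -m invocation should report the working directory on sys.path; the script invocation puts the script's own directory there instead, which is the same mechanism that keeps a bare pytest command from picking up the repository root.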

The main reason you would want to use a module invocation rather than directly invoking pytest is that the pytest alias in your local environment may point to an installation other than the one you are expecting — e.g. pytest installed for Python 2 when you are expecting Python 3, or an old version of pytest installed somewhere on your path. You will need to weigh the dangers of environment-related bugs against the dangers of including the local directory on your Python path for yourself. If you are using something like tox where each invocation is in a dedicated virtual environment, it is probably reasonable to prefer a bare pytest invocation in your test command.

Running your tests from another directory with tox

If you really want to make sure that you are not accidentally counting on your tests being run from the repository root, tox makes it simple enough to run your tests from any arbitrary directory by setting the changedir parameter. Starting with the package defined above and adding the following tox.ini:

[testenv]
description = Run the tests under {basepython}
deps = pytest
commands = python -m pytest {posargs}

any tests that attempt to import mypkg.subpkg will erroneously succeed, because the tests are run from the repository root; but with two minor modifications, we can configure tox to run them from a temporary directory:

[testenv]
description = Run the tests under {basepython}
deps = pytest
changedir = {envtmpdir}
commands = python -m pytest {posargs} {toxinidir}

The first thing we did was add changedir = {envtmpdir}, which tells tox to change the directory to a temporary directory associated with the current environment being run. [5] The other thing was to add {toxinidir} to the pytest invocation, to tell pytest where to look for the tests. Using this configuration, invoking tox will now fail as expected.

If you'd like to see these pieces put together, I've created a minimal repository that demonstrates this.

Conclusions

As we've seen, there are a few different ways to fix this problem, and none of them are mutually exclusive with the others. It is perfectly possible and reasonable to use a src/ directory and run your tests with a bare pytest invocation and have tox run your tests from a temporary directory. Each of these methods has its own advantages: running your tests from a temporary directory, for example, prevents any reliance on the repository structure, not just issues with the import path, and the src/ layout makes find_packages much less likely to pull in extraneous modules. [6] I do not know all the costs and benefits associated with each method, but I encourage you to examine the trade-offs associated with the different ways to solve this problem and pick at least one of them.

Acknowledgements

I first learned about the changedir solution from Mark Williams' blog, and I was inspired to write this blog post when I found that his post is now only available via the Internet Archive (and thus does not show up in search engine results). If you are planning to use the changedir solution, you may also want to check out his post, which includes some additional tips about how to configure coverage that I have not reproduced.

Footnotes

[1]	For example, if your build backend doesn't support it.

[2]	You can see missing submodule issues in several open source projects; for example, dateutil and pytype were both hit by this.

[3] Not recursively including nested modules is a fairly reasonable choice, since it gives users the freedom to lay out their package in any arbitrary way: anything that is not explicitly listed is not in the package. The big problem is that it's a somewhat counter-intuitive default, since most users lay out their package in such a way that recursively including all packages in a given folder is perfectly fine; setuptools provides find_packages for this, but using that without a src/ layout brings its own problems [6] (yes, that is a footnote on a footnote).

[4]	I recommend using `find_packages` only with a `src/` layout. I also generally think it's a good idea to keep as much of your package configuration in `setup.cfg` as possible; this guide in the setuptools documentation explains how to set up that configuration (it is not intuitive).

[5] In Mark Williams' post where I first learned about this, now only available on the Internet Archive, he recommends using {toxworkdir}, which is another equally valid choice. I tend to use {envtmpdir} only to emphasize the fact that the choice is arbitrary, but it is worth noting that as a temporary directory, {envtmpdir} will be cleared every time the particular test environment is run, whereas {toxworkdir} or {envdir} will not be automatically cleared.

[6]	Without the `src/` layout, `find_packages()` will pick up any directory at the top level of the repository that has an `__init__.py` file and install it as a top-level package, which is why projects like `flask-admin` and `pulumi` were accidentally installing packages like `examples` and `tests`.