Fast CI is one of the key elements of efficient software development. In most cases, the test phase is the bottleneck in CI build times. In this blog post, I introduce pytest-split, a new pytest plugin which aims to help decrease Python test execution times by splitting the full test suite into n optimally divided "sub suites" which can be executed in parallel, e.g. by using the parallelization features of GitHub Actions.
Pytest-split
Motivation
The world is filled with "legacy" software. Software which is not up-to-date with the current best practices. Enormous code bases with huge test suites. Things weren't slow at the beginning but then years went by. New features were built on top of existing ones. Now it takes tens of minutes (even hours) to see the relieving green bubble on top of the commit message after pushing local changes. Roughly speaking, the bigger the test suite, the slower the CI.
How to get a cheap boost to CI execution times in a Python project in which tests are the bottleneck? If the tests are executed sequentially, parallelization is the simple answer. There are often optimization opportunities in the existing tests as well, but hunting those down is usually not the cheap option. Developer hours are expensive. Even if the execution time of every single test is squeezed to the bare minimum, it usually doesn't scale well. What if the already big suite doubles in size during the next 12 months? If it's a web app, one solution could be to move towards microservices. Yeah well, let's not go there when discussing cheap solutions. Parallelization it is.
Pytest-xdist has been my trusted workhorse for test parallelization for a couple of years. It's really cheap: basically one new dependency and a single command line parameter, and you're suddenly running pytest on all available CPU cores. Then the hard parts. If the suite contains integration tests which depend on e.g. databases, things will get tricky. In-memory databases are available for most cases, but adopting them might require significant refactoring effort. Additionally, there are some obvious benefits in testing against real dependencies (e.g. databases) when possible. Furthermore, huge test suites tend to have weak points when it comes to e.g. isolation and interdependency. It's not uncommon to see tests which require other tests to be executed before them in order to produce the desired result. Fragile test suites tend to produce consistent results only when the tests are run in a certain order. If the test suite has these sorts of weaknesses, getting pytest-xdist reliably up and running might be painful, as xdist does not give any guarantees about the order without some customization.
One of the main motivations for pytest-split was to have a way to split the whole test suite such that the order in which tests are executed could be maintained. In other words, pytest-split tries to tackle the pain points of fragile test suites (in a cheap way).
To get the most out of parallel execution, it's usually beneficial to distribute the work across multiple different executors (e.g. different machines). Pytest-xdist also has support for distributed execution. However, the basic idea is that there's one master node which distributes the work to slaves. This basically means that there has to be a communication channel between the master and the slaves. This is not always viable. For example, most free CI systems don't provide facilities for communication between different executors.
Pytest-split aims to provide a way to parallelize tests also in environments where multiple executors can be utilized but there's no communication channel available between them.
Before implementing pytest-split, I found pytest-test-groups, which was very close to what I was looking for. It provides a way to split the whole suite into groups which can be run separately (e.g. on independent executors). However, there was one aspect I wasn't happy about. The tests can only be split into equally sized chunks. For example, if I have 100 tests and I want to split them into 10 groups, each group will contain 10 tests. This is sufficient if there are no significant differences in the execution times of individual tests. Otherwise you might end up in a situation in which the fastest group executes in e.g. 1 minute while the slowest takes 3 minutes.
Pytest-split aims to split the tests into groups such that the execution time of each group is nearly equal. This often means that the groups are not identical in terms of the number of tests they contain.
The core concepts
In order to split the suite into "sub suites" based on execution time, pytest-split needs to know the execution time of each individual test. This information is provided as a JSON file (test durations file) which is to be stored e.g. as part of the repository in which the tests live. Pytest-split provides a command line parameter for generating this file.
Old tests get modified or removed and new tests get added while development continues; that's the natural life cycle of a healthy project. If there are tests which are not part of the test durations file, pytest-split assumes an average execution time for them. Thus, there's no real need to update the durations file with every changeset. For example, I have been using pytest-split in a project which contains around 5k tests and is under active development. Based on my experience, an update once every two weeks is enough. I assume that the bigger the whole suite, the less often the durations file needs to be updated. OTOH, the more sub suites there are, the more frequent updates are likely required in order to get good results.
The splitting algorithm itself is quite straightforward once the data about the execution times of individual tests is available. In short, the algorithm estimates how long it will take to execute the whole suite and then divides that by the number of sub suites to be executed. Then it loops over the whole suite to find the optimal "split points".
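To make the idea concrete, here is a simplified sketch of such a greedy split in Python. This is not pytest-split's actual implementation; the function name and signature are made up, and the average-duration fallback for tests missing from the durations file follows the description above.

```python
from typing import Dict, List


def split_tests(
    test_ids: List[str],          # collected tests, in their original execution order
    durations: Dict[str, float],  # contents of the test durations file
    splits: int,
) -> List[List[str]]:
    """Greedily split tests into `splits` groups of roughly equal duration.

    Tests missing from the durations file get the average duration of the
    known ones. Simplified sketch, not pytest-split's actual code.
    """
    avg = sum(durations.values()) / len(durations) if durations else 1.0
    estimated = [durations.get(test_id, avg) for test_id in test_ids]
    time_per_group = sum(estimated) / splits

    groups: List[List[str]] = [[] for _ in range(splits)]
    group, elapsed = 0, 0.0
    for test_id, duration in zip(test_ids, estimated):
        # Move on to the next group once the current one has reached its share,
        # keeping the original test order intact within and across groups.
        if elapsed >= time_per_group and group < splits - 1:
            group += 1
            elapsed = 0.0
        groups[group].append(test_id)
        elapsed += duration
    return groups
```

Note that the original collection order is preserved, which is exactly what makes this approach friendlier to fragile, order-dependent suites than arbitrary redistribution.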
If you're interested in the implementation details, the source is available here.
Basic usage
The usage is pretty simple as pytest-split just provides a couple of command line options for pytest.
To store / update the durations file:
pytest --store-durations
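The durations file itself is plain JSON. Conceptually it records the measured duration of each test, keyed by the test's node ID, roughly along these lines (the test names and numbers below are made up, and the exact structure may differ between plugin versions):

```json
{
    "tests/test_api.py::test_create_user": 0.52,
    "tests/test_api.py::test_delete_user": 0.48,
    "tests/test_models.py::test_validation": 0.03
}
```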
Then, once you have the durations file available, you can split the test suite into e.g. 3 sub suites and run them:
pytest --splits 3 --group 1
pytest --splits 3 --group 2
pytest --splits 3 --group 3
If you don't want to use the default path for the durations file (.test_durations in the current working directory), you can provide a custom path with the --durations-path command line option.
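For example, storing the durations to and reading them from a custom location could look like this (the file path is just illustrative):

pytest --store-durations --durations-path my_durations.json

pytest --splits 3 --group 1 --durations-path my_durations.json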
Using pytest-split with GitHub Actions
GitHub Actions has been gaining some serious traction lately. I hadn't used GitHub Actions based CI workflows before, so I thought it'd be interesting to try them out. Up until now I have been using Travis in my hobby projects. A while back I read that GitHub is quite generous with public repos: 20 concurrent jobs even on the free plan. This started to sound like a perfect match with pytest-split.
I created a demo project to try it out. I implemented some dummy Python functionality and dummy tests for it. I basically just wanted to see if I could get GitHub Actions based CI to work well together with pytest-split; the meaningfulness of the code logic didn't matter.
The YAML syntax used in GitHub Actions workflows was intuitive and thus easy to learn. I used a matrix strategy for parallelizing the sub suites to different executors. This worked nicely. I also wanted to include coverage reporting, as it's usually required (or at least should be) in real life projects. Combined coverage reporting was achieved by publishing the coverage data file as an artifact from each parallel executor; after all parallel test executors have finished, the artifacts are downloaded and combined in a separate executor. This made it possible to check that a certain coverage level was reached even though the execution of the tests was split across independent executors.
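To give an idea of the shape of such a workflow, here's a trimmed-down, hand-written sketch. It is not the demo project's exact configuration; the job names, action versions, coverage tooling and threshold are assumptions.

```yaml
name: CI
on: push

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        group: [1, 2, 3]  # one executor per sub suite
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      # Install your project's own dependencies here as well
      - run: pip install pytest pytest-split coverage
      - run: coverage run -m pytest --splits 3 --group ${{ matrix.group }}
      - uses: actions/upload-artifact@v2
        with:
          name: coverage-${{ matrix.group }}
          path: .coverage

  coverage:
    needs: test  # wait until all parallel test executors have finished
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - run: pip install coverage
      # Without a name, each coverage-* artifact is downloaded into its own directory
      - uses: actions/download-artifact@v2
      - run: coverage combine coverage-*/.coverage
      - run: coverage report --fail-under=90  # fail the build if combined coverage is too low
```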
Overall, my first experiences with GitHub Actions are really positive. Furthermore, GitHub Actions seem to be a great match with pytest-split. If you're interested in digging deeper, have a look at the demo project and the workflow configuration.