Randomly sub-setting test suites

Sunday 14 January 2024

I needed to run random subsets of my test suite to narrow down the cause of some mysterious behavior. I didn’t find an existing tool that worked the way I wanted, so I cobbled something together.

I wanted to run 10 random tests (out of 1368), and keep choosing randomly until I saw the bad behavior. Once I had a selection of 10 that showed it, I wanted to be able to whittle it down further.

I tried a few different approaches, and here’s what I came up with, two tools in the coverage.py repo that combine to do what I want:

A pytest plugin (select_plugin.py) that lets me run a command to output the names of the exact tests I want to run.
A command-line tool (pick.py) to select random lines of text from a file. For convenience, blank or commented-out lines are ignored.

More details are in the comment at the top of pick.py, but here’s a quick example:

Get all the test names into tests.txt. These are pytest “node” specifications. The grep relies on pytest printing one node ID per line, which it does in quiet mode (-q):
```
pytest --collect-only | grep :: > tests.txt
```

Now tests.txt has a line per test node. Some are straightforward:

```
tests/test_cmdline.py::CmdLineStdoutTest::test_version
tests/test_html.py::HtmlDeltaTest::test_file_becomes_100
tests/test_report_common.py::ReportMapsPathsTest::test_map_paths_during_html_report
```

but with parameterization they can be complicated:

```
tests/test_files.py::test_invalid_globs[bar/***/foo.py-***]
tests/test_files.py::FilesTest::test_source_exists[a/b/c/foo.py-a/b/c/bar.py-False]
tests/test_config.py::ConfigTest::test_toml_parse_errors[[tool.coverage.run]\nconcurrency="foo"-not a list]
```

Run a random bunch of 10 tests:
```
pytest --select-cmd="python pick.py sample 10 < tests.txt"
```
We’re using --select-cmd to specify the shell command that will output the names of tests. Our command uses pick.py to select 10 random lines from tests.txt.
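
To make the mechanism concrete, here is a minimal sketch of how a plugin like this could work. It is not the actual select_plugin.py: only the --select-cmd option name comes from this post, and the helper wanted_ids and the choice to skip blank and #-prefixed lines inside the plugin are assumptions made for illustration. It uses two standard pytest hooks, pytest_addoption and pytest_collection_modifyitems.

```python
# Hypothetical conftest.py-style plugin sketch; not the real select_plugin.py.
import subprocess


def wanted_ids(output):
    """Parse command output into a set of node IDs, skipping blanks and # comments."""
    ids = set()
    for line in output.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            ids.add(line)
    return ids


def pytest_addoption(parser):
    # Register --select-cmd. The option name is from the post; the rest is assumed.
    parser.addoption(
        "--select-cmd",
        metavar="CMD",
        default=None,
        help="Shell command whose output lists the test node IDs to run",
    )


def pytest_collection_modifyitems(config, items):
    cmd = config.getoption("--select-cmd")
    if not cmd:
        return
    # Run through the shell so redirection like "< tests.txt" works.
    output = subprocess.check_output(cmd, shell=True, text=True)
    wanted = wanted_ids(output)
    # Mutate the list in place: pytest runs whatever remains in `items`.
    items[:] = [item for item in items if item.nodeid in wanted]
```

Matching on the full node ID as an exact string means parameterized names with brackets and escapes need no special parsing, as long as the file holds them exactly as pytest printed them.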

Run many random bunches of 10, announcing the seed each time:

```
for seed in $(seq 1 100); do
    echo seed=$seed
    pytest --select-cmd="python pick.py sample 10 $seed < tests.txt"
done
```
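
If the bad behavior shows up as a failing pytest run, the same loop can stop by itself at the first bad seed. This is a rough Python sketch under that assumption (a nonzero exit code means "bad"); the function names run_batch and first_bad_seed are made up for illustration, and the pick.py command line is the one shown above.

```python
import subprocess
import sys


def run_batch(seed, size=10, tests_file="tests.txt"):
    """Run one random batch through pytest and return pytest's exit code."""
    select_cmd = f"python pick.py sample {size} {seed} < {tests_file}"
    proc = subprocess.run(
        [sys.executable, "-m", "pytest", f"--select-cmd={select_cmd}"]
    )
    return proc.returncode


def first_bad_seed(seeds, run=run_batch):
    """Return the first seed whose run exits nonzero, or None if every run passes."""
    for seed in seeds:
        print(f"seed={seed}")
        if run(seed) != 0:
            return seed
    return None
```

Calling first_bad_seed(range(1, 101)) mirrors the shell loop. The run argument is there so a different check (for example, scanning the output for a warning) can be swapped in when the bad behavior is not a plain test failure.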

Once you find a seed that produces the small batch you want, save that batch:
```
python pick.py sample 10 17 < tests.txt > bad.txt
```
Now you can run that bad batch repeatedly:
```
pytest --select-cmd="cat bad.txt"
```
To reduce the bad batch, comment out lines in bad.txt with a hash character, and the tests will be excluded. Keep editing until you find the small set of tests you want.
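
The two behaviors this workflow leans on, skipping blank or commented-out lines and getting the same sample back for the same seed, can be sketched in a few lines of Python. This is not the real pick.py, just an illustration; usable_lines and sample_lines are hypothetical names.

```python
import random


def usable_lines(text):
    """Return the lines that are neither blank nor commented out with '#'."""
    return [
        line for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


def sample_lines(text, count, seed=None):
    """Pick `count` random usable lines; the same seed gives the same pick."""
    lines = usable_lines(text)
    rng = random.Random(seed)  # a private generator, so the seed is the only input
    return rng.sample(lines, min(count, len(lines)))
```

Because the generator is seeded explicitly, re-running with the seed that was announced reproduces the batch, as long as the input file has not changed in between.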

I like that this works and I understand it. I like that it’s based on the bedrock of text files and shell commands. I like that there’s room for different behavior in the future by adding to how pick.py works. For example, it doesn’t do any bisecting now, but it could be adapted to do that.

As usual, there might be a better way to do this, but this works for me.
