Tuesday 2 March 2010
One of the more challenging aspects of maintaining a tool like coverage.py is that people use it on complex code bases. When something goes wrong, they report it to me, and I have to dig in and figure out why.
The latest head-scratcher was reported by Christophe Zwerschke as issue 51 on the coverage.py Bitbucket tracker. After a few turns to shake out what the issue was, it came down to this: Christophe’s code gets the right answer when run without coverage.py, or when run one way with coverage.py, but computes a different answer when run a second way with coverage.py. This is a real mystery, because it is the first report of code actually behaving differently because coverage.py is measuring it.
This has all the markings of a bug in Python, but it could be my fault. In any case, we have to find out. I won’t be able to get to it for a bit, and why should I have all the fun anyway? So I’m crowd-sourcing it here: maybe a reader will have an insight into what the heck is going on here.
Here is the file, bug51.py (slightly simplified from the ticket):
class Foo:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<Foo %r>" % self.name

class MetaFooList(type):
    def __new__(mcs, cls_name, bases, cls_dict):
        declared = []
        for base in bases:
            declared.extend(getattr(base, 'declared', []))
        for name, value in cls_dict.items():
            if isinstance(value, Foo):
                declared.append(value)
        declared.sort(key=lambda w: w.name)
        cls = type.__new__(mcs, cls_name, bases, cls_dict)
        cls.declared = declared
        return cls

class FooList(list):
    __metaclass__ = MetaFooList

    def __init__(self):
        super(FooList, self).__init__(self.declared)

def test_foolist():
    w = Foo(name="foo")
    class W(FooList):
        foo = w
    w2 = Foo(name="bar")
    class W2(W):
        bar = w2
    foolist = W2()
    print "foolist has %d entries" % len(foolist)

if __name__ == '__main__':
    test_foolist()
Here are the three runs:
$ python bug51.py
foolist has 2 entries
$ coverage run bug51.py
foolist has 2 entries
$ coverage run --timid bug51.py
foolist has 4 entries
An explanation about that last run: the --timid switch forces coverage.py to use a trace function written in Python rather than its fancier one written in C. Ironically, I added the switch as a way to use a gentler tracing mechanism that wouldn’t interfere so much with other packages using a trace function, to prevent bizarre problems. But now, it seems to be the source of a real problem itself.
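For readers who have not written one, here is a minimal sketch of what a trace function written in Python looks like, using the standard sys.settrace hook. It is not coverage.py's actual tracer, only an illustration of the mechanism; the names make_tracer, sample, and executed are invented for this example, and it is written for a current Python rather than the 2.x versions discussed here.

```python
import sys

def make_tracer(executed):
    """Build a trace function that records each (filename, lineno) it sees."""
    def tracer(frame, event, arg):
        if event == "line":
            executed.add((frame.f_code.co_filename, frame.f_lineno))
        # Returning the tracer keeps line-level tracing on for this frame.
        return tracer
    return tracer

def sample(n):
    # Hypothetical function to trace: sums 0..n-1.
    total = 0
    for i in range(n):
        total += i
    return total

executed = set()
sys.settrace(make_tracer(executed))
try:
    result = sample(3)
finally:
    # Always remove the trace function, even if the traced code raises.
    sys.settrace(None)

lines_hit = sorted(lineno for _, lineno in executed)
```

After this runs, lines_hit holds the line numbers executed inside sample. A C tracer plugs into the same interpreter hook, but does its bookkeeping without running Python code on every event.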
Trace functions can do strange things that can affect the running program, but the ones in coverage.py don’t, or at least they aren’t supposed to. This test code has plenty of twisty turns, but still, how can it get different answers from two implementations of the same trace function?
Anyone up for the challenge of figuring out what’s going on here? If not, I’ll get to it eventually, and report back.
Comments
U:\>python --version
Python 2.6.2
U:\>coverage --version
Coverage.py, version 3.3. http://nedbatchelder.com/code/coverage
U:\>coverage run bug51.py
foolist has 2 entries
U:\>coverage run --timid bug51.py
foolist has 2 entries
U:\>
$ python --version
Python 2.6.4
$ coverage --version
Coverage.py, version 3.2. http://nedbatchelder.com/code/coverage
$ coverage run bug51.py
foolist has 2 entries
$ coverage run --timid bug51.py
foolist has 2 entries
This is running on Ubuntu 9.10 in a virtualenv.
$ python bug51.py
foolist has 2 entries
$ coverage run bug51.py
foolist has 4 entries
$ coverage run --timid bug51.py
foolist has 4 entries
So on 2.4, coverage always gets the wrong answer; on 2.5, only --timid gets the wrong answer; and on 2.6, it always gets the right answer.
This points more and more to a Python issue...
Since this kind of code is often used in unit tests, I was thinking about a workaround. The following seems to work; any better ideas? This quirk should also be mentioned somewhere in the coverage documentation or FAQ as a known issue.