Slim comparisons

Thursday 26 January 2012

Hanging out in the #python IRC channel today, I learned something new about Python comparisons. It isn’t so much a new detail of the language as a way to make use of one: a clever technique that I hadn’t seen before.

When defining a class, it’s often useful to define an equality comparison so that instances of your class can be considered equal. For example, in an object with three attributes, the typical way to define __eq__ is like this:

class Thing(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __eq__(self, other):
        print "Comparing %r and %r" % (self, other)
        return (
            self.a == other.a and
            self.b == other.b and
            self.c == other.c
            )

When run, it shows what happens:

>>> x = Thing(1, 2, 3)
>>> y = Thing(1, 2, 3)
>>> print x == y
Comparing <Thing 37088896> and <Thing 37088952>
True

Here the __eq__ method compares the three attributes directly on the self and other objects and returns a boolean: a simple, direct comparison.

But on IRC, a different technique was proposed:

class Thing(object):
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __eq__(self, other):
        print "Comparing %r and %r" % (self, other)
        return (self.a, self.b, self.c) == other

Now when we run it, something unusual happens:

>>> x = Thing(1, 2, 3)
>>> y = Thing(1, 2, 3)
>>> print x == y
Comparing <Thing 37219968> and <Thing 37220024>
Comparing <Thing 37220024> and (1, 2, 3)
True

Our __eq__ is being called twice! The first time, it’s called with two Thing objects, and it tries to compare the tuple (1, 2, 3) to other, which is y, a Thing. Tuples don’t know how to compare themselves to Things, so the tuple’s __eq__ returns NotImplemented. The == operator handles that case and, relying on the commutative nature of ==, tries again with the two operands swapped. That means comparing y to (1, 2, 3), which calls our __eq__ a second time. Now it compares (1, 2, 3) to (1, 2, 3), which succeeds, producing the final True result.
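The double call can be reproduced with a minimal sketch in Python 3 syntax. The `calls` list is an addition for illustration only: it stands in for the print statement so the sequence of calls can be inspected afterwards.

```python
calls = []  # hypothetical log, added only to record each __eq__ invocation


class Thing:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __eq__(self, other):
        # Record what kind of object we were asked to compare against.
        calls.append(type(other).__name__)
        # When `other` is a Thing, tuple.__eq__ returns NotImplemented,
        # so Python retries with the operands swapped: other.__eq__(the_tuple).
        return (self.a, self.b, self.c) == other


x = Thing(1, 2, 3)
y = Thing(1, 2, 3)
result = x == y

print(result)  # True
print(calls)   # ['Thing', 'tuple']
```

The first entry comes from `x.__eq__(y)`, the second from the swapped retry `y.__eq__((1, 2, 3))`.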

This is an interesting technique, but I’m not sure I like it. For one thing, the code doesn’t read clearly. It’s comparing a tuple to an object, which isn’t supported. It only makes sense when you keep in mind the argument-swapping dance.

For another, it makes operations work that maybe shouldn’t:

x == (1, 2, 3)
(1, 2, 3) == x

I don’t know that I want these comparisons to succeed. It exposes internals that should be hidden. Of course, why would a caller who didn’t know the internals try a comparison like this? But things like this have a way of creeping out to bite you.
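To make the difference concrete, here is a small sketch in Python 3 syntax that puts the two styles side by side. The class names `SlimThing` and `StrictThing` are hypothetical labels chosen for this comparison.

```python
class SlimThing:
    """Hypothetical name for the tuple-shorthand version."""

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def __eq__(self, other):
        return (self.a, self.b, self.c) == other


class StrictThing:
    """Hypothetical name for the attribute-by-attribute version."""

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

    def __eq__(self, other):
        return (
            self.a == other.a and
            self.b == other.b and
            self.c == other.c
        )


slim = SlimThing(1, 2, 3)
# Both orderings succeed with the shorthand.
slim_results = (slim == (1, 2, 3), (1, 2, 3) == slim)
print(slim_results)  # (True, True)

strict = StrictThing(1, 2, 3)
raised = False
try:
    strict == (1, 2, 3)  # a tuple has no attribute `a`
except AttributeError:
    raised = True
print(raised)  # True
```

With the attribute-based version, comparing against a tuple fails loudly instead of quietly returning True.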

I’m glad to have a better understanding of the workings of comparisons, but I’m not sure I’ll write them like this.

Comments

[gravatar]
This is interesting to know, that this is how comparisons work in Python. But I prefer the former version. The latter is not very clear and I doubt many people would know what happens behind the scene, to understand it quickly.
[gravatar]
One small clarification: the "NotImplemented" singleton is not an exception. Instead, it gets *returned* from the __eq__ call. (NotImplementedError *is* an exception, but it plays no part in comparisons, or any other binary operations)

That doesn't change your overall point, though: the shorthand version is a bad idea because it doesn't express the intent clearly and allows comparisons that should trigger an exception.

To compare a group of attributes, it *can* be convenient to write it like this, though:

attrs = "a b c".split()
return all(getattr(self, x) == getattr(other, x) for x in attrs)
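
Placed inside a complete class, that fragment might look like the following sketch (Python 3 syntax). The `_compare_attrs` name is hypothetical, and a tuple of names is used in place of the split string.

```python
class Thing:
    # Hypothetical class attribute: the names that take part in equality.
    _compare_attrs = ("a", "b", "c")

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __eq__(self, other):
        # True only if every listed attribute matches on both objects.
        return all(
            getattr(self, name) == getattr(other, name)
            for name in self._compare_attrs
        )


same = Thing(1, 2, 3) == Thing(1, 2, 3)
different = Thing(1, 2, 3) == Thing(1, 2, 4)
print(same, different)  # True False
```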
[gravatar]
I agree, it goes against Python's "prefer explicit over implicit" principle. And if you want your objects to be only equal to themselves, why make them also equal to tuples? It'd be a bug waiting to happen.
[gravatar]
@Nick, thanks for the exception/singleton clarification, I've fixed the text. And the shorthand for comparing a number of attributes is very nice.
[gravatar]
If you're not happy for a Thing to be equal to a three element tuple why are you happy for it to be equal to some arbitrary object which happens to have 'a', 'b' and 'c' attributes (and maybe 'd' and 'e' attributes which are not looked at)? It's difficult to know where to draw the line with this duck-typing stuff.

I think that if other is not a Thing then __eq__ should return False immediately. The interesting (i.e., difficult) case is when other is an instance of a class derived from Thing, but that would be a bit of a digression.
[gravatar]
@Ed: you are right, this is one area where even Python developers are forced to confront is-instance questions. What should Thing.__eq__ insist about "other"? I left out that whole topic to focus on the comparison check itself, but it is also thorny.
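
One possible shape for such a check is sketched below in Python 3 syntax. It is an assumption about how the idea could be written, not a settled answer: it returns NotImplemented for non-Thing operands rather than False as suggested above, and it leaves the subclass question open.

```python
class Thing:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def __eq__(self, other):
        if not isinstance(other, Thing):
            # Decline instead of answering. If the other operand also
            # declines, == falls back to an identity check and yields False.
            return NotImplemented
        return (self.a, self.b, self.c) == (other.a, other.b, other.c)


x = Thing(1, 2, 3)
print(x == Thing(1, 2, 3))  # True
print(x == (1, 2, 3))       # False
print((1, 2, 3) == x)       # False
```

Here both sides are tuples by the time they are compared, so no operand swapping is involved.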
[gravatar]
The "What counts as quacking like a duck?" question is actually the problem Abstract Base Classes are designed to help with - because they support explicit registration, you can use them in isinstance() checks without excessively constraining the types you allow, or inadvertently accepting things you don't want.

Of course, ABCs themselves can be duck-typed according to appropriate protocols (e.g. "isinstance(obj, collections.Hashable)" will accept anything with a __hash__ method, whether it is explicitly registered or not).
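
A rough sketch of both behaviours in Python 3 follows. `ThingLike` and `Lookalike` are hypothetical names, and in current Python versions `Hashable` is imported from `collections.abc`.

```python
from abc import ABC
from collections.abc import Hashable


class ThingLike(ABC):
    """Hypothetical ABC used only to mark types we agree to compare with."""


class Thing:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c


class Lookalike:
    """Has the same attributes but was never registered."""

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c


# Explicit registration: Thing does not inherit from ThingLike.
ThingLike.register(Thing)

registered = isinstance(Thing(1, 2, 3), ThingLike)
unregistered = isinstance(Lookalike(1, 2, 3), ThingLike)
print(registered, unregistered)  # True False

# Hashable is checked structurally: lists set __hash__ to None.
print(isinstance("text", Hashable))  # True
print(isinstance([], Hashable))      # False
```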
[gravatar]
@Ed, @Nick: When the two arguments are from different classes (even class/subclass, or registered with the same ABC), I don't see how it would be safe to allow equality to return True. In fact, I would even raise an exception, rather than return False. I don't know why Python's built-in classes just return False - isn't it dangerous?

After all, even (1, 2, 3) != [1,2,3]. So equality must imply really similar behavior, not just similar contents. And clearly, two different classes do not, normally, behave the same. Furthermore, if you allow comparison between different classes, how do you ensure the commutative property (even with subclasses of the same class)?

What are the use cases of equality between different classes?

As to duck typing, I never fully bought into the idea, since name collisions (like the ones in your example) are so unpredictable and so hard to debug. And even with duck typing, shouldn't it be for the methods only, not attributes?
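
The commutativity concern can be made concrete with a small sketch in Python 3 syntax. `Duck` and `Strict` are hypothetical classes: one compares by attribute name alone, the other insists on its own type.

```python
class Duck:
    """Hypothetical class that accepts any object with an `a` attribute."""

    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        return self.a == other.a


class Strict:
    """Hypothetical class that only ever equals other Strict instances."""

    def __init__(self, a):
        self.a = a

    def __eq__(self, other):
        if not isinstance(other, Strict):
            return False
        return self.a == other.a


d, s = Duck(1), Strict(1)
forward = d == s   # Duck.__eq__ answers first
backward = s == d  # Strict.__eq__ answers first
print(forward, backward)  # True False

print((1, 2, 3) == [1, 2, 3])  # False
```

Because the left operand's __eq__ is consulted first for unrelated classes, the two orderings disagree.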
