Late initialization with mypy

Sunday 12 February 2023

Let’s say you have a complex class with a number of attributes. The class is used in a few different ways, so sometimes the attributes are available, but sometimes they haven’t been initialized yet. Because of global knowledge about how the class is used, we know which paths are certain to have the attributes, and which might not have them.

[UPDATE: I’ve changed my mind: Late initialization, reconsidered]

(If you are interested, the real code I’m thinking about is from coverage.py, but this post has toy examples for clarity.)

Before static type checking, I’d initialize these attributes to None. In the certain-to-exist code paths, I’d just use the attributes. In the uncertain code paths, I’d check if an attribute was None before using it:

# Original untyped code.
class Complicated:
    def __init__(self):
        self.other = None

    def make_other(self):
        self.other = OtherThing()

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        if self.other is not None:
            self.other.do_something()

How should I add type annotations to a situation like this? The most obvious approach is to declare the attribute as Optional. But that means adding asserts to the certain paths. Without them, the type checker will warn us that the attribute might be None, because type checkers don’t have the global understanding that makes us certain the attribute is available on those paths. Now we need extra code for both certain and uncertain paths: asserts for one and run-time checks for the other:

# Simple Optional typing.
from typing import Optional

class Complicated:
    def __init__(self):
        self.other: Optional[OtherThing] = None

    def make_other(self):
        self.other = OtherThing()

    def certain_path(self):
        assert self.other is not None
        self.other.do_something()

    def uncertain_path(self):
        if self.other is not None:
            self.other.do_something()

This is a pain if there are many certain paths, or many of these attributes to deal with. It just adds clutter.
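
To make the trade-off concrete, here is a runnable sketch of the Optional version. The OtherThing stand-in with its call counter and the demo_optional helper are made up for illustration; they are not from coverage.py. It shows the two kinds of extra code at work: the assert fails loudly if a "certain" path is reached too early, and the run-time check quietly skips the call.

```python
from typing import Optional, Tuple


class OtherThing:
    """Hypothetical stand-in for the real collaborator."""

    def __init__(self) -> None:
        self.calls = 0

    def do_something(self) -> None:
        self.calls += 1


class Complicated:
    def __init__(self) -> None:
        self.other: Optional[OtherThing] = None

    def make_other(self) -> None:
        self.other = OtherThing()

    def certain_path(self) -> None:
        # The assert narrows Optional[OtherThing] to OtherThing for the checker.
        assert self.other is not None
        self.other.do_something()

    def uncertain_path(self) -> None:
        # The run-time check does the same narrowing, and skips the call if None.
        if self.other is not None:
            self.other.do_something()


def demo_optional() -> Tuple[bool, int]:
    """Return (assert fired before initialization, calls made after it)."""
    obj = Complicated()
    obj.uncertain_path()  # does nothing: other is still None
    try:
        obj.certain_path()
        fired = False
    except AssertionError:
        fired = True
    obj.make_other()
    obj.certain_path()
    obj.uncertain_path()
    assert obj.other is not None
    return fired, obj.other.calls
```

One thing to keep in mind: asserts are stripped when Python runs with -O, so in that mode the early call would fail with an AttributeError on None instead of an AssertionError.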

A second option is to have the attribute exist or not exist rather than be None or not None. We can type these ghostly attributes as definitely not None, but then we have to check whether they exist in the uncertain paths:

# Ghost: attribute exists or doesn't exist.
class Complicated:
    def __init__(self):
        # declared but not defined:
        self.other: OtherThing

    def make_other(self):
        self.other = OtherThing()

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        if hasattr(self, "other"):
            self.other.do_something()

This is strange: you don’t often see a class that doesn’t know in its own code whether attributes exist or not. This is how I first adjusted the coverage.py code with type annotations: six attributes declared but not defined. But it didn’t sit right with me, so I kept experimenting.
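
Here is a small runnable sketch of what "declared but not defined" means at run time. The OtherThing body and the return values are invented for the demo. The bare annotation in __init__ tells the type checker about the attribute but creates nothing on the instance, so hasattr is False until make_other runs, and a certain path called too early raises AttributeError.

```python
from typing import Optional


class OtherThing:
    """Hypothetical stand-in for the real collaborator."""

    def do_something(self) -> str:
        return "did something"


class Complicated:
    def __init__(self) -> None:
        # Annotation only: no attribute is created on the instance.
        self.other: OtherThing

    def make_other(self) -> None:
        self.other = OtherThing()

    def certain_path(self) -> str:
        return self.other.do_something()

    def uncertain_path(self) -> Optional[str]:
        if hasattr(self, "other"):
            return self.other.do_something()
        return None


obj = Complicated()
exists_before = hasattr(obj, "other")
skipped = obj.uncertain_path()  # None: the ghost attribute isn't there yet

try:
    obj.certain_path()
    raised = False
except AttributeError:
    raised = True

obj.make_other()
exists_after = hasattr(obj, "other")
result = obj.certain_path()
```

The type checker sees a plain OtherThing everywhere, so it can't help if a certain path turns out to be less certain than we thought.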

A third option is to use two attributes for the same value: one is typed Optional and one is not. This lets us avoid asserts on the certain paths, but is really weird and confusing:

# Two attributes for the same value.
from typing import Optional

class Complicated:
    def __init__(self):
        self.other: OtherThing
        self.other_maybe: Optional[OtherThing] = None

    def make_other(self):
        self.other = self.other_maybe = OtherThing()

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        if self.other_maybe is not None:
            self.other_maybe.do_something()

But if we’re going to use two attributes in the place of one, why not make it the value and a boolean?

# Value and boolean.
class Complicated:
    def __init__(self):
        self.other: OtherThing
        self.other_exists: bool = False

    def make_other(self):
        self.other = OtherThing()
        self.other_exists = True

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        if self.other_exists:
            self.other.do_something()

This is about the same as “exists or doesn’t exist”, but with a second nearly-useless attribute, so what’s the point?

Another option: the attribute always exists, and is never None, but is sometimes a placebo implementation that does nothing for those times when we don’t want it:

# Placebo

class OtherPlacebo(OtherThing):
    def do_something(self):
        pass

class Complicated:
    def __init__(self):
        self.other: OtherThing = OtherPlacebo()

    def make_other(self):
        self.other = OtherThing()

    def certain_path(self):
        self.other.do_something()

    def uncertain_path(self):
        self.other.do_something()

A philosophical quandary about placebos: should they implement all the base class methods, or only those that we know will be invoked in the uncertain code paths? Type checkers are fine with either, and run-time is of course fine with only the subset.
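
One wrinkle worth seeing in code, as a sketch under assumptions: the second method do_more and the log list are hypothetical, added only to make the behavior observable. Because the placebo here subclasses OtherThing, any method it doesn't override isn't missing, it's inherited, so calling it on the placebo runs the real implementation.

```python
from typing import List


class OtherThing:
    """Hypothetical collaborator with two methods."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def do_something(self) -> None:
        self.log.append("something")

    def do_more(self) -> None:  # hypothetical second method
        self.log.append("more")


class OtherPlacebo(OtherThing):
    """Bare-bones placebo: overrides only what the uncertain paths call."""

    def do_something(self) -> None:
        pass


class Complicated:
    def __init__(self) -> None:
        self.other: OtherThing = OtherPlacebo()

    def make_other(self) -> None:
        self.other = OtherThing()

    def uncertain_path(self) -> None:
        self.other.do_something()


obj = Complicated()
obj.uncertain_path()
placebo_log = list(obj.other.log)    # the placebo did nothing

obj.other.do_more()                  # not overridden, so the real method runs
inherited_log = list(obj.other.log)

obj.make_other()
obj.uncertain_path()
real_log = list(obj.other.log)       # the real object did its work
```

So with a subclass-based placebo, "only the needed methods" means the other methods quietly fall through to the real code rather than failing.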

In the end, I liked the placebo strategy best: it removes the need for any checking or asserts. I implemented the placebos as bare-bones with only the needed methods. It can make the logic a bit harder to understand at a glance, but I think I mostly don’t need to know whether it’s a placebo or not in any given spot. Maybe six months from now I’ll be confused by the switcheroos happening, but it looks good right now.
