Sunday 30 March 2008
I’ve noticed a useful pattern in some Python polymorphic functions. The bulk of the function is written assuming the input is of one preferred type. But other types are also accepted by first converting them (evolving them) up to the preferred type.
A common example is a function that operates on an open file, but will also accept a file name:
def doit(f):
    # If f is a string, it's a filename to be opened
    if isinstance(f, basestring):
        f = open(f)
    # .. do stuff with the file ..
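The snippet above is Python 2 (basestring no longer exists in Python 3). As a rough sketch of the same idea in Python 3, assuming we also want to accept os.PathLike objects and to close only the files we opened ourselves, it might look like this. The name count_lines and the line-counting body are made up for illustration; they stand in for "do stuff with the file".

```python
import io
import os


def count_lines(f):
    """Count lines in an open text file, or in the file named by a path.

    Hypothetical example: the body stands in for "do stuff with the file".
    """
    opened_here = False
    # Evolve a path (str or os.PathLike) up to the preferred type: an open file.
    if isinstance(f, (str, os.PathLike)):
        f = open(f, encoding="utf-8")  # assumption: text files are UTF-8
        opened_here = True
    try:
        # From here on, the code only ever deals with a file object.
        return sum(1 for _ in f)
    finally:
        # Close only what this function opened; the caller owns its own files.
        if opened_here:
            f.close()


# Any file-like object works, not just real files.
print(count_lines(io.StringIO("one\ntwo\n")))  # prints 2
```

One wrinkle the short version glosses over: once the function may open the file itself, it also has to decide who closes it. Tracking that with a flag keeps the caller's file objects untouched.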
A richer example comes from Jared Kuolt’s StaticGenerator. Here the goal is to build a list of URLs, but the function accepts strings as well as a variety of Django ORM objects:
extracted = []
for resource in resources:
    # A URL string
    if isinstance(resource, (str, unicode, Promise)):
        extracted.append(str(resource))
        continue
    # A model instance; requires get_absolute_url method
    if isinstance(resource, Model):
        extracted.append(resource.get_absolute_url())
        continue
    # If it's a Model, we get the base Manager
    if isinstance(resource, ModelBase):
        resource = resource._default_manager
    # If it's a Manager, we get the QuerySet
    if isinstance(resource, Manager):
        resource = resource.all()
    # Append all paths from obj.get_absolute_url() to list
    if isinstance(resource, QuerySet):
        extracted += [obj.get_absolute_url() for obj in resource]
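Here, two cases branch off and are handled completely (a URL string and a model instance), and three different types evolve up through each other: a model class becomes its manager, a manager becomes a query set, and the query set is what finally gets processed.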
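To see the shape of that loop without Django installed, here is a minimal sketch with stand-in classes. Article, Shelf, and Library are hypothetical names invented for this example, playing the roles of a model instance, a manager, and a model class. They are not part of Django or StaticGenerator.

```python
class Article:
    """Stand-in for a model instance (hypothetical)."""

    def __init__(self, slug):
        self.slug = slug

    def get_absolute_url(self):
        return f"/articles/{self.slug}/"


class Shelf:
    """Stand-in for a manager: it can produce all of its articles."""

    def __init__(self, articles):
        self._articles = list(articles)

    def all(self):
        return list(self._articles)


class Library:
    """Stand-in for a model class: it has a default shelf."""

    def __init__(self, shelf):
        self.default_shelf = shelf


def extract_urls(resources):
    extracted = []
    for resource in resources:
        # Two cases are handled completely and skip the rest of the chain.
        if isinstance(resource, str):
            extracted.append(resource)
            continue
        if isinstance(resource, Article):
            extracted.append(resource.get_absolute_url())
            continue
        # The remaining cases evolve: Library -> Shelf -> list of articles.
        # There is no elif here, so each step falls through to the next one.
        if isinstance(resource, Library):
            resource = resource.default_shelf
        if isinstance(resource, Shelf):
            resource = resource.all()
        if isinstance(resource, list):
            extracted.extend(a.get_absolute_url() for a in resource)
    return extracted


shelf = Shelf([Article("a"), Article("b")])
print(extract_urls(["/home/", Article("x"), Library(shelf)]))
# prints ['/home/', '/articles/x/', '/articles/a/', '/articles/b/']
```

The detail that makes the evolution work is the run of plain if statements. With elif, a Library would be turned into a Shelf and then dropped on the floor.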
There’s something very satisfying about this style of code. The polymorphism comes not from switching on the type and doing different actions for different types, but from evolving types one to the next until you have the type needed to operate on.
Statically typed languages can use a similar design, where overloaded functions adapt their arguments and call each other. The dynamic language pattern is more flexible in that different kinds of adaptation can be used, not just type-based. Here’s a hypothetical example:
def doit_with_html(h):
    if h.startswith('http:'):
        # Turn a URL into its HTML data.
        h = urllib2.urlopen(h).read()
    #.. do something with the HTML ..
Here we aren’t distinguishing between two different types, but between two different structures of string. If the string smells like a URL, we use it to fetch HTML to operate on, otherwise we assume the string is HTML to begin with.
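A Python 3 take on the same hypothetical function might look like the sketch below. The name title_of, the title-extracting body, and the fetch parameter are all inventions for illustration. Passing the fetcher in is just one way to try the function without touching the network, and the sketch assumes the response body can be decoded as UTF-8.

```python
from urllib.request import urlopen


def _fetch(url):
    # Assumption: the response body is UTF-8; undecodable bytes are replaced.
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def title_of(h, fetch=_fetch):
    """Return the text of the <title> element, or None if there isn't one.

    h may be HTML text or a URL to fetch the HTML from.
    """
    # Structure-based adaptation: a string that smells like a URL
    # evolves into the HTML found at that URL.
    if h.startswith(("http://", "https://")):
        h = fetch(h)
    # From here on, h is assumed to be HTML text.
    start = h.find("<title>")
    if start == -1:
        return None
    end = h.find("</title>", start)
    if end == -1:
        return None
    return h[start + len("<title>"):end].strip()


print(title_of("<html><title> Hello </title></html>"))  # prints Hello
```

Because the check looks at the string's shape rather than its type, it is a guess. HTML that happens to begin with "http://" would be misread as a URL, which is the price of this kind of convenience.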
This style of coding is a great way to make functions more flexible and useful to callers. It’s not only concise, but it gives you a natural way to extend the function to operate on more types. Often a new kind of input can be accommodated simply by adding an adaptation clause to the top of the existing function, rather than proliferating similar functions with similar (but different) names.
Comments
I think you are saying you prefer style 3, is that right?
In this case, I prefer the explicit check for the types you are looking for.
The general concept is that you have some fundamental type — file — and a broader type — file or basestring — whose members all denote some file (are normalized into some file) under the "file designator" interpretation.
Another Python example would be the dict constructor, which could be viewed as accepting a dict designator:
>>> dict([(1,2),(3,4)])
{1: 2, 3: 4}
>>> dict({1:2, 3:4})
{1: 2, 3: 4}
(The page I linked is mostly setting out the rules for how "X is a y designator" in the specification is to be interpreted.)
http://en.wikipedia.org/wiki/Generic_function
but generic functions require some additional infrastructure to implement
http://www.python.org/dev/peps/pep-3124/
The latter example is a
http://en.wikipedia.org/wiki/Type_system#Dependent_types
and I think it is currently a hot topic in type research.
I felt it important to accept as many types of arguments as possible, and as easily as possible. The evolution of the object will basically always filter into the same type of object which we can then use in one way, which is the point of course.
2) It occurs to me that this pattern might be called "semantic typing", as follows: the REAL type in your example is "file"; whether the user passes in a Python type of string or file handle is exactly the sort of detail that libraries should manage without troubling programmers. That of course suggests that the pattern is useful only insofar as a given "semantic type" maps cleanly to 2 or more language types ...