Sunday 22 October 2006 — This is more than 18 years old. Be careful.
Here’s a debate that arose recently about how far dynamic typing can be taken, and at what point it goes too far. We have a function that takes a list of ids, but it can also be used in a way that gets the ids from another well-known place. It was originally coded like this:
def insert_ids(ids):
    """ Insert the ids, or the global ids if ids is 'global'.
    """
    if ids == 'global':
        ids = get_global_ids()
    for id in ids:
        # blah blah blah

# Now we can insert ids two different ways:
insert_ids([1,2,17,23])
insert_ids('global')
ids is an argument that can either be a list of ids, or the string ‘global’, meaning go off and get a list of ids from somewhere else. But this use of the same argument as either a string or a list felt funny, so we changed it to this:
def insert_ids(ids=None, use_global=False):
    """ Insert the ids, or the global ids if use_global is True.
    """
    if use_global:
        ids = get_global_ids()
    for id in ids:
        # blah blah blah

# Now we can insert ids two different ways:
insert_ids([1,2,17,23])
insert_ids(use_global=True)
But now we have two arguments, both of which have to be defaultable, making it possible to call the function with no arguments, which is not a valid form of the function. Am I being too squeamish about the dynamic nature of the first form? Although Python doesn’t mind, it feels strange to me for a variable to sometimes be a string and sometimes be a list. Is this pythonic? Or just a confusing abuse of power?
Comments
But the first form in Python buries the argument type handling in the body of the function. So for that reason alone I might say the second form is superior, because it more effectively expresses the valid input arguments for the function, which is something that needs to be communicated regardless.
Lately, I've been looking at multimethods as an alternative way of implementing polymorphic behavior.
-Sw.
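For readers curious what that might look like here: Python's standard library offers functools.singledispatch, which dispatches on the type of the first argument only (so it is single dispatch, not full multimethods). The sketch below is an illustration, not the code under discussion; get_global_ids is a hypothetical stand-in and the "insertion" just collects the ids into a list.

```python
from functools import singledispatch


def get_global_ids():
    # Hypothetical stand-in for the real global lookup.
    return [100, 200]


@singledispatch
def insert_ids(ids):
    """Default case: treat ids as an iterable of ids."""
    # Stand-in for the real insertion: just collect the ids.
    return list(ids)


@insert_ids.register(str)
def _(ids):
    """String case: only the special name 'global' is accepted."""
    if ids != 'global':
        raise ValueError("unknown id source: %r" % ids)
    return insert_ids(get_global_ids())
```

The type handling moves out of the function body and into the registrations, but the argument is still sometimes a string and sometimes a list, so this does not settle the original question.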
def insert_ids(ids=None):
    if ids is None: ids = get_global_ids()
    ...
if you later have globals2 (another special case), then that won't work.
in common lisp, this idiom is kinda used:
(defun test (&key a b)
;; one should be non-NIL
(when (and (null a) (null b)) (error .....))
I don't know that it's the wrong way to go, but it's got its own downsides...
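A rough Python analogue of that Lisp idiom, applied to the two-argument form from the post, might look like the sketch below. It is only an illustration: get_global_ids is a hypothetical stand-in, the loop body is a placeholder, and making use_global keyword-only is an assumption rather than part of the original code.

```python
def get_global_ids():
    # Hypothetical stand-in for the real global lookup.
    return [100, 200]


def insert_ids(ids=None, *, use_global=False):
    """Insert the ids, or the global ids if use_global is True."""
    # Exactly one of the two must be supplied: reject both and neither.
    if (ids is None) == (not use_global):
        raise TypeError("pass either ids or use_global=True, but not both")
    if use_global:
        ids = get_global_ids()
    inserted = []
    for id_ in ids:
        inserted.append(id_)  # placeholder for the real insertion
    return inserted
```

The no-argument call now fails loudly, though only at run time, and the check grows with every new special case.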
And re: the 7 argument thing, it sounds like these parameters are almost worthy of being refactored out into a class, instead of writing "an explosion of combinations". If it's so complex, wouldn't it be the thing to do to encapsulate the complex behavior?
* do as sri said above, or
* have ids be the only arg and interpret an empty ids as "use the global ids"
Good luck. I'm interested as to what you end up deciding.
In addition, using a global is something that I think should probably be an either/or proposition. Either the function always uses globals in some respect and always documents it, or it delegates that to the caller in an unambiguous way.
I would create a function that does not have the argument be default-able, and if the caller of that function requires a global argument, then the caller should themselves call the function like:
insert_ids(get_global_ids())
This way it is obvious, at the calling point, that global resources are being used here. Most of the time, it is bad to add verbosity to the point in the program where a function is invoked. But in the case of the use of global variables, I think that this verbosity is justified because of the potential pitfalls of not seeing (or not having a static analyzer see) where the uses are.
If a similar situation arises independently of global considerations, then I would favor an alternative where a separate function (as above) specifies that a special default value is used. Or, if there were a more complex set of cases, I would pass a set of flags to a configuration object, which would automatically set up defaults according to the flags, and the user of the configuration object would set the rest of the fields manually.
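One way to picture the configuration-object idea is the minimal sketch below. Everything in it is assumed for illustration: the InsertConfig name, its fields, and get_global_ids are hypothetical, and the insertion itself is reduced to returning the ids.

```python
from dataclasses import dataclass, field


def get_global_ids():
    # Hypothetical stand-in for the real global lookup.
    return [100, 200]


@dataclass
class InsertConfig:
    """Hypothetical configuration object: flags in, defaults filled out."""
    use_global: bool = False
    ids: list = field(default_factory=list)

    def __post_init__(self):
        # Fill in the default only when the flag asks for it
        # and the caller has not supplied ids explicitly.
        if self.use_global and not self.ids:
            self.ids = get_global_ids()


def insert_ids(config):
    # The function itself no longer has any special cases.
    return list(config.ids)
```

A caller would write insert_ids(InsertConfig(use_global=True)) or insert_ids(InsertConfig(ids=[1, 2, 17, 23])), so the special-case logic lives in one place instead of in every function that takes ids.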
I would also go with insert_ids(get_global_ids()). I don't see this as a case of dynamic vs. static. I see it as introducing extra code paths in the name of questionable convenience. You talked about multiple defaults needing an explosion of combinations - well, in a single function, one might argue that the explosion is still there for testing and understanding. Certainly the cyclomatic complexity is higher, and I believe minimizing it is an important goal for reliable, testable software.
I'm inclined to think that a seven-argument function for which you're even tempted to add these sorts of alternatives is too complex altogether. But this discussion is too abstract for me to say so confidently.
GLOBAL = object()

def insert_ids(ids=GLOBAL):
    if ids is GLOBAL:
        ids = get_global_ids()
    ...
Using separate methods may be a better solution for some cases, but that depends on how this API is used, not what it does on the inside. Good API design is about usage patterns, not implementation details.
insert_ids(get_global_ids)
Kind of like re.sub, where you can pass either a string or a function as the replacement argument.
Of course, you've said that the real function is more complex, and that makes all the difference...
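For what it's worth, the pass-a-function variant above could be sketched roughly like this. It is an assumption-laden illustration: get_global_ids is a hypothetical stand-in, and the function simply returns the ids it would insert.

```python
def get_global_ids():
    # Hypothetical stand-in for the real global lookup.
    return [100, 200]


def insert_ids(ids):
    """Insert ids; ids may be a list or a zero-argument callable returning one."""
    if callable(ids):
        # Defer the lookup until the function actually needs the ids.
        ids = ids()
    return list(ids)  # placeholder for the real insertion
```

Both insert_ids([1, 2, 17, 23]) and insert_ids(get_global_ids) work, and the call site still names where the ids come from.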
The time when I think going more dynamic is cool is when you want to use a parameter in a similar way regardless of its type. For instance, say you have a function or method that uses a list for something... it would be good if it could also intelligently take a dictionary or a list of lists and handle it appropriately...
insert_ids should take ANY type as an argument and insert it. String, list, dictionary. If the argument is the arbitrary string "global", the first form won't work, but you might want to insert that exact string.
Now you might argue that IDs are a type of some sort, and a string is not valid. Maybe that is true today, but will it be tomorrow? Maybe not, so better not risk it: those who are assigned to make strings a valid id type will have enough work without having to refactor every place where "global" is passed in as a string.
insert_ids(True) should be read as either "Turn on inserting IDs" or "Insert the ID True", if that makes sense. No programmer would read it as "insert global IDs". There should be a different function, insert_global_ids, which makes it clear when you read the code what is going to happen. Readability is good.
I would give careful consideration to the other comment that you might really want a class here, not a function. Without knowing your problem we cannot say for sure, but it sounds reasonable.
def insert_ids(ids):
    ...

def insert_global_ids():
    ids = get_global_ids()
    return insert_ids(ids)

def insert_other_flag_ids():
    ids = get_other_flag_ids()
    return insert_ids(ids)
def insert_ids(ids):
    """ Insert the ids.
    """
    for id in ids:
        # blah blah blah

# Now we can insert ids two different ways:
insert_ids([1,2,17,23])
insert_ids(get_global_ids())
is just as functional and is less error-prone
First, do you really need a special case, or are you just being paranoid about type safety? Let the caller take care of whether they mean what they say, and make your function do *one* job clearly and simply. (This supports the 'insert_ids(get_global_ids())' idea earlier.)
Second, do you *really* need a special case, or are you making your function too complex? Be very suspicious of functions that are written to do two different things depending on their input, and split them so that both are simple and the caller can be explicit about what they want. This doesn't preclude factoring out the code that's common to both of them, of course.
Third, if you actually need a special case, can it be None? This is the idiomatic Python "sentinel value", and it looks like the code posted by 'sri' above. Note that if you're squeamish about using None, but don't have a specific reason not to use it, use it; other programmers will thank you for following convention.
Fourth, if you have decided that a magic sentinel value is called for but None is already taken for some other purpose, don't use a string. Use a unique do-nothing object, defined at the module level so callers can easily get at it, like 'Dmitry Vasiliev' showed. You won't accidentally use it, because it's defined only in one place (you're comparing by 'is', remember) and it's not used for anything except indicating the special case.
Fifth, there is no fifth. If you've come to the end and think it's too complex, it probably is. Start at the top again.
class Resource(object):
    def __init__(self, node):
        attributes = self.ATTRIBUTES
        # Normalize a single string into a one-element tuple.
        if not isinstance(attributes, (tuple, list)):
            attributes = (attributes,)
        for attribute in attributes:
            # do stuff

class TranslateTransform(Resource):
    ATTRIBUTES = "translate"

class RotateTransform(Resource):
    ATTRIBUTES = "axis", "angle"
As you can see, ATTRIBUTES can be either a string or a sequence.
While this goes against most stuff from OOP books, I feel it is nicer.