Running a Python file as main

Tuesday 7 April 2009

This is almost 16 years old. Be careful.

I tried running a set of unit tests from work under the latest version of coverage.py, and was surprised to see that the tests failed. Digging into it, I found that the value of __file__ was wrong for my main program. The test program used that value to find the expected output files to compare results against, and since the value was wrong, it didn’t find any expected output files, so the tests failed.
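As a rough illustration of why a wrong __file__ breaks this kind of lookup, here is a minimal sketch. The helper expected_output_path and every file and directory name in it are hypothetical; the real test code is not shown in this post.

```python
import os

def expected_output_path(main_file, name):
    """Return the path of a data file stored next to main_file.

    Hypothetical helper: main_file stands in for the value of __file__.
    """
    here = os.path.dirname(os.path.abspath(main_file))
    return os.path.join(here, name)

# With the correct __file__, the lookup lands beside the test file ...
good = expected_output_path(os.path.join("tests", "test_report.py"), "expected.txt")
# ... but with the runner's path, it lands beside the runner instead.
bad = expected_output_path(os.path.join("tools", "coverage.py"), "expected.txt")

print(os.path.basename(os.path.dirname(good)))  # tests
print(os.path.basename(os.path.dirname(bad)))   # tools
```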

Consider this main program, myprog.py:

print "__file__ is", __file__
print "__name__ is", __name__

When run from the command line as “python myprog.py”, it says:

__file__ is myprog.py
__name__ is __main__

The way coverage.py ran the main program could be boiled down to this:

# runmain1.py: run its argument as a Python main program.
import sys
import __main__

mainfile = sys.argv[1]
execfile(mainfile, __main__.__dict__)

Running “python runmain1.py myprog.py” produces:

__file__ is runmain1.py
__name__ is __main__

Because we imported __main__, and used its globals as myprog’s globals, myprog thinks it is runmain1.py instead of myprog.py. That’s why my unit tests failed: they tried to find data files alongside coverage.py, rather than alongside the unit test files.
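The same effect can be reproduced on Python 3, where execfile no longer exists. The sketch below is an approximation, not coverage.py's code: it builds a stand-in module object for the runner rather than importing the real __main__, and run_in_borrowed_globals is a hypothetical replacement for execfile.

```python
import os
import tempfile
import types

def run_in_borrowed_globals(path, runner_globals):
    """Python 3 stand-in for execfile(path, runner_globals)."""
    with open(path) as src:
        code = compile(src.read(), path, "exec")
    exec(code, runner_globals)

# Stand-in for the runner's own __main__ module (assumed file name).
runner = types.ModuleType("__main__")
runner.__file__ = "runmain1.py"

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "myprog.py")
    with open(target, "w") as f:
        # The target records what it sees instead of printing it.
        f.write("seen_file = __file__\nseen_name = __name__\n")
    run_in_borrowed_globals(target, runner.__dict__)

print(runner.seen_file)  # runmain1.py, the runner's path, not myprog.py
print(runner.seen_name)  # __main__
```

The target's code runs with the runner's globals, so it inherits the runner's __file__ along with its __name__.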

This is a better way to do it:

# runmain2.py: run its argument as a Python main program.
import imp, sys

mainfile = sys.argv[1]

src = open(mainfile)
try:
    imp.load_module('__main__', src, mainfile, (".py", "r", imp.PY_SOURCE))
finally:
    src.close()

This imports the target file as a real __main__ module, giving it the proper __file__ value.
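The imp module has since been removed from Python (as of 3.12). A rough modern counterpart can be sketched with importlib. This is an approximation of the idea, not the code coverage.py uses; load_as_main is a hypothetical name, and the sketch restores the original __main__ afterwards only so that it is safe to run as a demonstration.

```python
import importlib.util
import os
import sys
import tempfile

def load_as_main(path):
    """Load the file at path as a module named __main__ and return it."""
    spec = importlib.util.spec_from_file_location("__main__", path)
    module = importlib.util.module_from_spec(spec)
    saved = sys.modules.get("__main__")
    sys.modules["__main__"] = module  # so "import __main__" finds the target
    try:
        spec.loader.exec_module(module)
    finally:
        if saved is not None:
            sys.modules["__main__"] = saved  # put the real main module back
    return module

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "myprog.py")
    with open(target, "w") as f:
        f.write("seen_file = __file__\nseen_name = __name__\n")
    main_module = load_as_main(target)

print(os.path.basename(main_module.seen_file))  # myprog.py
print(main_module.seen_name)                    # __main__
```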

The old execfile __main__ technique is used in lots of tools that offer to run your Python main files for you, and I’m not sure why more people don’t have problems with it. Probably because __file__ manipulation is uncommon. I’ve updated coverage.py to use the new technique. I hope there isn’t a gotcha I’m overlooking that means it’s a bad way to do this.

Updated: I found the gotcha: it creates a compiled file.
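On Python versions that have it (2.7 and 3.2 onward), the standard library's runpy.run_path is another option for running a file as __main__. A minimal sketch, assuming a current Python 3; the temporary myprog.py below is only a stand-in for a real main program:

```python
import os
import runpy
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "myprog.py")
    with open(target, "w") as f:
        f.write("seen_file = __file__\nseen_name = __name__\n")
    # run_path returns the globals dictionary the script ran with.
    result = runpy.run_path(target, run_name="__main__")

print(os.path.basename(result["seen_file"]))  # myprog.py
print(result["seen_name"])                    # __main__
```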

Comments

[gravatar]
Wouldn't inserting
__main__.__file__ = mainfile
before the call to execfile() be an easier fix for this problem?
[gravatar]
Wouldn't the runpy stdlib module help, or is it designed for a completely different use case?
[gravatar]
@Fredrik: I started down that path, but had further problems, for example, if my program imports __main__, it gets the wrong module. imp.load_module neatly solves all of them at once.

@Marius: this is exactly the reason I write posts like this! I had no idea runpy existed. In my case, I'm trying to support 2.3 and up, so I can't use it as is, but it will certainly help me in understanding possible twistiness in my approach.
[gravatar]
So Ned,
what words or phrases need to be in the doc entry for runpy so that a search by most people with the same problem would find runpy?

- Paddy.
[gravatar]
@Paddy, excellent question. At first my answer was going to be that I should do targeted searches of the std lib to find these things, but it turns out to be really difficult. Searching for "python standard library run module main" doesn't find it for me.

"site:docs.python.org run module main" doesn't find it in the first page either.

Also, when I look at standard modules that have the problem I was trying to solve, such as profile and trace, none of them use runpy to solve it.
