A modulefinder Replacement

[This is part of Installer release 5. It can also be downloaded separately.]

Module mf is modelled after iu (well, actually I wrote this one first, and it worked so well I created a real importer from it).

It also uses ImportDirectors and Owners to partition the import name space. Except for the fact that these return Module instances instead of real module objects, they are identical to their iu counterparts.

Instead of an ImportManager, mf has an ImportTracker managing things.

ImportTracker

ImportTracker can be called in two ways: analyze_one(name, importername=None) or analyze_r(name, importername=None). The second method does what modulefinder does - it recursively finds all the module names that importing name would cause to appear in sys.modules. The first method is non-recursive. This is useful, because it is the only way of answering the question "Who imports name?" But since it is somewhat unrealistic (very few real imports do not involve recursion), it deserves some explanation.
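To make the difference between a single step and the recursive closure concrete, here is a minimal sketch over a hand-written import graph. The names TOY_GRAPH, direct_imports and transitive_imports are hypothetical and are not part of mf; the real analyze_r works from scanned code rather than a prepared dict, and the order of its results may differ.

```python
from collections import deque

# Hypothetical, hand-written import graph: module -> names it imports.
TOY_GRAPH = {
    "os": ["sys", "posixpath"],
    "posixpath": ["stat", "os"],
    "stat": [],
    "sys": [],
}


def direct_imports(name, graph):
    """One non-recursive step: only what `name` itself imports."""
    return list(graph.get(name, []))


def transitive_imports(name, graph):
    """Recursive closure: every module that importing `name` pulls in."""
    seen = []
    pending = deque([name])
    while pending:
        current = pending.popleft()
        if current in seen:
            continue  # already visited; this also breaks import cycles
        seen.append(current)
        pending.extend(direct_imports(current, graph))
    return seen


closure = transitive_imports("os", TOY_GRAPH)
```

The one-step function is what makes "who imports name?" answerable: the closure alone tells you that a module ends up loaded, not which module asked for it.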

analyze_one()

When a name is imported, there are structural and dynamic effects. The dynamic effects are due to the execution of the top-level code in the module (or modules) that get imported. The structural effects have to do with whether the import is relative or absolute, and whether the name is a dotted name (if there are N dots in the name, then N+1 modules will be imported even without any code running).

The analyze_one method determines the structural effects, and defers the dynamic effects. For example, analyze_one("B.C", "A") could return ["B", "B.C"] or ["A.B", "A.B.C"] depending on whether the import turns out to be absolute or relative. In addition, ImportTracker's modules dict will have Module instances for them.
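The structural part can be sketched without running any code. The helper names below (dotted_prefixes, structural_candidates) are hypothetical, and the sketch assumes the importer argument names the package that a relative import would be resolved against. It only enumerates the two candidate answers; deciding between them requires looking at what actually exists on disk, which is the tracker's job.

```python
def dotted_prefixes(name):
    """A name with N dots implies N+1 modules: 'B.C' -> ['B', 'B.C']."""
    parts = name.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def structural_candidates(name, importer_package=None):
    """Return the absolute and (if applicable) relative readings of an import.

    Assumption: `importer_package` is the package the importing module lives in.
    """
    absolute = dotted_prefixes(name)
    candidates = {"absolute": absolute}
    if importer_package:
        candidates["relative"] = [importer_package + "." + n for n in absolute]
    return candidates


candidates = structural_candidates("B.C", "A")
```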

Module Classes

There are Module subclasses for builtins, extensions, packages and (normal) modules. Besides the normal module object attributes, they have an attribute imports. For packages and normal modules, imports is a list populated by scanning the code object (and therefore the names in this list may be relative or absolute names; we don't know until they have been analyzed).

The highly astute will notice that there is a hole in analyze_one() here. The first thing that happens when B.C is being imported is that B is imported and its top-level code executed. That top-level code can do various things so that when the import of B.C finally occurs, something completely different happens (from what a structural analysis would predict). But mf can handle this through its hooks mechanism.

Code Scanning

Like modulefinder, mf scans the byte code of a module, looking for imports. In addition, mf will pick out a module's __all__ attribute, if it is built as a list of constant names. This means that if a package declares an __all__ list as a list of names, ImportTracker will track those names if asked to analyze package.*. The code scan also notes the occurrence of __import__, exec and eval, and can issue warnings when they're found.
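As an illustration of the general technique (not mf's own scanner, which predates it), byte code can be scanned with the standard dis module on a current Python 3. The function name scan_code and the SUSPICIOUS set are hypothetical. The sketch collects the operands of IMPORT_NAME instructions and notes references to __import__, exec and eval, recursing into nested code objects.

```python
import dis
import types

# Names whose presence suggests imports the scan cannot see (assumed set).
SUSPICIOUS = {"__import__", "exec", "eval"}


def scan_code(code):
    """Return (imported names, suspicious names) found in a code object."""
    imports, suspicious = [], []
    for instr in dis.get_instructions(code):
        if instr.opname == "IMPORT_NAME":
            imports.append(instr.argval)
        elif instr.opname in ("LOAD_NAME", "LOAD_GLOBAL") and instr.argval in SUSPICIOUS:
            suspicious.append(instr.argval)
    # Function and class bodies are stored as nested code objects.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            sub_imports, sub_suspicious = scan_code(const)
            imports.extend(sub_imports)
            suspicious.extend(sub_suspicious)
    return imports, suspicious


demo_source = (
    "import os\n"
    "from a.b import c\n"
    "def f(x):\n"
    "    import json\n"
    "    return eval(x)\n"
)
demo_imports, demo_suspicious = scan_code(compile(demo_source, "<demo>", "exec"))
```

Nothing is executed here: the source is only compiled, so the scan is safe to run on modules whose top-level code has side effects.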

The code scanning also keeps track (as well as it can) of the context of an import. It recognizes when imports are found at the top-level, and when they are found inside definitions (deferred imports). Within that, it also tracks whether the import is inside a condition (conditional imports).
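Context tracking is easier to show on the syntax tree than on byte code, so the sketch below swaps in the standard ast module; mf itself works on byte code. It assumes that "inside a definition" means inside a function body and that "inside a condition" means inside an if or try statement. The class and function names are hypothetical. The result uses the same (name, delayed, conditional) tuple shape that appears in the Usage section.

```python
import ast


class ImportContextScanner(ast.NodeVisitor):
    """Collect (name, delayed, conditional) tuples from a syntax tree."""

    def __init__(self):
        self.found = []
        self._delayed = 0      # nesting depth inside function bodies
        self._conditional = 0  # nesting depth inside if/try statements

    def _record(self, name):
        self.found.append((name, int(self._delayed > 0), int(self._conditional > 0)))

    def visit_Import(self, node):
        for alias in node.names:
            self._record(alias.name)

    def visit_ImportFrom(self, node):
        # Leading dots mark an explicit relative import.
        self._record("." * node.level + (node.module or ""))

    def _visit_definition(self, node):
        self._delayed += 1
        self.generic_visit(node)
        self._delayed -= 1

    visit_FunctionDef = _visit_definition
    visit_AsyncFunctionDef = _visit_definition

    def _visit_condition(self, node):
        self._conditional += 1
        self.generic_visit(node)
        self._conditional -= 1

    visit_If = _visit_condition
    visit_Try = _visit_condition


def scan_imports(source):
    scanner = ImportContextScanner()
    scanner.visit(ast.parse(source))
    return scanner.found


demo = scan_imports(
    "import os\n"
    "try:\n"
    "    import pwd\n"
    "except ImportError:\n"
    "    pwd = None\n"
    "def f():\n"
    "    import json\n"
    "    if json:\n"
    "        from re import compile\n"
)
```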

Hooks

In modulefinder, scanning the code takes the place of executing the code object. mf goes further and allows a module to be hooked (after it has been scanned, but before analyze_one is done with it). A hook is a module named hook-fullyqualifiedname in the hooks package. These modules should have one or more of the following three global names defined:

hiddenimports
    a list of module names (relative or absolute) that the module imports in some untrackable way.
attrs
    a list of (name, value) pairs (where value is normally meaningless).
hook(mod)
    a function taking a Module instance and returning a Module instance (so it can modify or replace it).

The first hook (hiddenimports) extends the list created by scanning the code. ExtensionModules, of course, don't get scanned, so this is the only way of recording any imports they do.

The second hook (attrs) exists mainly so that ImportTracker won't issue spurious warnings when the rightmost node in a dotted name turns out to be an attribute in a package module, instead of a missing submodule.

The callable hook exists for things like dynamic modification of a package's __path__ or perverse situations, like xml.__init__ replacing itself in sys.modules with _xmlplus.__init__. (It takes nine hook modules to properly trace through PyXML-using code, and I can't believe that it's any easier for the poor programmer using that package). As of Installer 5b5, the hook(mod) (if it exists) is called before looking at the others - that way it can, for example, test sys.version and adjust what's in hiddenimports.
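A hook module using all three names might look like the sketch below. Everything specific in it is hypothetical: the package mypkg, its submodules, the version attribute and the platform test are invented for illustration and do not come from the example hooks package. The file would be named hook-mypkg.py inside the hooks package.

```python
# Hypothetical contents of hooks/hook-mypkg.py
import sys

# Imports that mypkg performs in ways the code scan cannot see.
hiddenimports = ["mypkg._backend"]

# Names that are attributes of the package rather than submodules,
# listed so that "mypkg.version" is not reported as a missing module.
attrs = [("version", None)]


def hook(mod):
    """Called first, so it may still adjust hiddenimports.

    Receives a Module instance and must return one (the same or a replacement).
    """
    if sys.platform.startswith("win"):
        hiddenimports.append("mypkg._win32")
    return mod
```

Because hook(mod) runs before the other two names are consulted, the list it appends to is the one the tracker will read.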

[Download an example hooks package (as a zip or tar.gz file) - Installer 5 already contains these files.]

Warnings

ImportTracker has a getwarnings() method that returns all the warnings accumulated by the instance, and by the Module instances in its modules dict. Generally, it is ImportTracker who will accumulate the warnings generated during the structural phase, and Modules that will get the warnings generated during the code scan.
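The aggregation can be pictured roughly as follows. This is a guess at the shape, not mf's code: ModuleStub and TrackerStub are hypothetical stand-ins, and the way a module's name and file are appended to its warnings is only inferred from the sample output in the Usage section.

```python
class ModuleStub:
    """Hypothetical stand-in for an mf Module instance."""

    def __init__(self, name, filename):
        self.name = name
        self.filename = filename
        self.warnings = []  # filled during the code scan


class TrackerStub:
    """Hypothetical stand-in for ImportTracker."""

    def __init__(self):
        self.modules = {}   # fully qualified name -> ModuleStub
        self.warnings = []  # filled during the structural phase

    def getwarnings(self):
        # Tracker-level warnings first, then each module's own warnings,
        # annotated with where they came from (assumed format).
        collected = list(self.warnings)
        for mod in self.modules.values():
            for text in mod.warnings:
                collected.append("%s - %s (%s)" % (text, mod.name, mod.filename))
        return collected


tracker = TrackerStub()
tracker.warnings.append("W: no module named pwd")
tracker.modules["string"] = ModuleStub("string", "string.py")
tracker.modules["string"].warnings.append("W: eval hack detected")
all_warnings = tracker.getwarnings()
```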

Note that by using a hook module, you can silence some particularly tiresome warnings, but not all of them.

Cross Reference

Once a full analysis (that is, an analyze_r) has been done, you can get a cross reference by using getxref(). This returns a list of tuples. Each tuple is (modulename, importers), where importers is a list of the (fully qualified) names of the modules importing modulename. Both the returned list and the importers list are sorted.
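The shape of that result can be reproduced from any mapping of modules to the names they import. The sketch below is an inversion of such a mapping, not the getxref() implementation; build_xref and its input dict are hypothetical, and it assumes that modules nobody imports still appear, with an empty importers list.

```python
def build_xref(imports_by_module):
    """Invert {module: [imported names]} into sorted (modulename, importers)."""
    importers = {}
    for importer, imported_names in imports_by_module.items():
        # Assumption: every known module gets an entry, even if unimported.
        importers.setdefault(importer, set())
        for name in imported_names:
            importers.setdefault(name, set()).add(importer)
    return [(name, sorted(importers[name])) for name in sorted(importers)]


xref = build_xref({
    "os": ["sys", "posixpath"],
    "posixpath": ["stat", "sys"],
    "stat": [],
    "sys": [],
})
```

Both levels come out sorted, matching the description above: the outer list by module name, and each importers list alphabetically.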

Usage

A simple example follows:

      >>> import mf
      >>> a = mf.ImportTracker()
      >>> a.analyze_r("os")
      ['os', 'sys', 'posixpath', 'nt', 'stat', 'string', 'strop', 
      're', 'pcre', 'ntpath', 'dospath', 'macpath', 'win32api', 
      'UserDict', 'copy', 'types', 'repr', 'tempfile'] 
      >>> a.analyze_one("os")
      ['os']
      >>> a.modules['string'].imports
      [('strop', 0, 0), ('strop.*', 0, 0), ('re', 1, 1)]
      >>>
      

The tuples in the imports list are (name, delayed, conditional).

      >>> for w in a.modules['string'].warnings: print w
      ...
      W: delayed  eval hack detected at line 359
      W: delayed  eval hack detected at line 389
      W: delayed  eval hack detected at line 418
      >>> for w in a.getwarnings(): print w
      ...
      W: no module named pwd (delayed, conditional import by posixpath)
      W: no module named dos (conditional import by os)
      W: no module named os2 (conditional import by os)
      W: no module named posix (conditional import by os)
      W: no module named mac (conditional import by os)
      W: no module named MACFS (delayed, conditional import by tempfile)
      W: no module named macfs (delayed, conditional import by tempfile)
      W: top-level conditional exec statment detected at line 47 
         - os (C:\Program Files\Python\Lib\os.py)
      W: delayed  eval hack detected at line 359 
         - string (C:\Program Files\Python\Lib\string.py)
      W: delayed  eval hack detected at line 389 
         - string (C:\Program Files\Python\Lib\string.py)
      W: delayed  eval hack detected at line 418 
         - string (C:\Program Files\Python\Lib\string.py)
      >>>
      

(The historically minded will note the antiquity of the Python used to demonstrate this.)

copyright 1999-2002
McMillan Enterprises, Inc.