Skip to main content

Imports

import is a Python protocol implemented mostly in Python. importlib does the work; the C side is a thin entry that boots importlib from a frozen module and exposes a handful of hooks. The protocol is built around two abstractions. finders locate a module by name; loaders turn the found module into a populated ModuleType. The whole stack is configurable.

Where the code lives

FileRole
Python/import.cC-level entry points; sys.modules; the frozen importlib boot.
Lib/importlib/_bootstrap.pyCore import machinery. Frozen.
Lib/importlib/_bootstrap_external.pyFilesystem-backed finders and loaders. Frozen.
Python/frozen_modules/Pre-compiled .pyc images of bootstrap modules.
Modules/_importlib.h (generated)The frozen _bootstrap and _bootstrap_external bytecode.

The protocol

The Python-level entry is importlib.__import__. Its job, given a fully qualified name:

  1. Check sys.modules. If the name is present (and not None), return that.
  2. Walk sys.meta_path. Ask each meta-path finder find_spec(name, path, target). The first non-None result wins.
  3. The returned ModuleSpec carries a loader. Call loader.create_module(spec) (often returns None, meaning "default new module"), then loader.exec_module(module), which runs the module's code in the module's namespace.
  4. Install the module in sys.modules before executing its body so circular imports see a partially populated object.

sys.modules is the canonical cache; nothing else needs to be consulted on a hit.

Finders

sys.meta_path = [
BuiltinImporter, # built-in modules linked into the binary
FrozenImporter, # frozen modules (importlib boot, stdlib subset)
PathFinder, # filesystem and other path-based locations
]

Each is a class with find_spec(name, path, target).

  • BuiltinImporter checks the static table of built-in modules compiled into the interpreter (_io, sys, builtins, ...).
  • FrozenImporter checks the frozen-module table (everything in Python/frozen_modules/).
  • PathFinder walks sys.path (for top-level imports) or the parent package's __path__ (for submodules). It consults sys.path_hooks to find a path entry finder for each path.

PathFinder plus the default FileFinder is what handles 99% of imports.

Loaders

A loader is anything implementing exec_module(module) (and optionally create_module). The standard set:

LoaderSource
SourceFileLoader.py files (with .pyc caching).
SourcelessFileLoader.pyc files only.
ExtensionFileLoader.so / .pyd extension modules.
BuiltinImporterBuilt-in modules.
FrozenImporterFrozen modules.

SourceFileLoader.exec_module does:

  1. Read the .py source (or load .pyc cache if fresh).
  2. Compile to bytecode if no fresh .pyc exists.
  3. Execute the bytecode in module.__dict__.

The .pyc cache is keyed on a header (magic, flags, source mtime or hash, source size). A mismatch triggers a recompile.

The bootstrap problem

importlib._bootstrap is itself a Python module. It cannot be imported by the import system before the import system is running. The solution is frozen modules: at CPython build time, Tools/build/freeze_modules.py compiles a fixed set of Python modules to bytecode and embeds the result in the binary as C arrays. At startup:

  1. Python/import.c::_PyImport_BootstrapImp reads the embedded bytecode for _bootstrap.
  2. The frozen _bootstrap module is executed, defining __import__, ModuleSpec, BuiltinImporter, FrozenImporter.
  3. _bootstrap_external (also frozen) is loaded similarly to install PathFinder and the file-based loaders.
  4. From this point, import is fully functional and the C side delegates to Python.

The frozen subset includes the stdlib modules os, posixpath, io, zipimport, and a handful of others, because they are needed during startup before PathFinder can load anything from disk.

sys.modules

A dict mapping fully qualified names to module objects. The import system reads it first, populates it during a load, and removes failed entries on exec_module failure. Setting sys.modules[name] = None is the standard way to mark a name as deliberately unimportable.

Packages

A package is a module whose __path__ is a sequence of directories. Submodules are imported by PathFinder walking parent.__path__. There are two flavours:

  • Regular packages. A directory containing __init__.py. The init module runs when the package is first imported.
  • Namespace packages (PEP 420). A directory without __init__.py. Multiple directories can contribute submodules to the same package name; __path__ is a special _NamespacePath that updates as more locations are added to sys.path.

ModuleType and ModuleSpec

/* Objects/moduleobject.c */
typedef struct {
PyObject_HEAD
PyObject *md_dict; /* __dict__ */
struct PyModuleDef *md_def; /* PyModuleDef for C extensions */
void *md_state; /* per-module state for C extensions */
PyObject *md_weaklist;
PyObject *md_name;
} PyModuleObject;

ModuleSpec (Python-level) carries the metadata needed to load the module: name, loader, origin, parent, submodule search locations. After loading, the spec is attached to the module as module.__spec__.

Extension modules

C extensions can be single-phase or multi-phase:

  • Single-phase init. The traditional API. PyInit_foo() returns a fully populated PyModuleObject. Limitation: the module is shared across subinterpreters and cannot be safely reloaded.
  • Multi-phase init (PEP 489). PyInit_foo() returns a PyModuleDef with slots; the import system calls the slots in order to create and execute the module. Multi-phase modules can be loaded into per-interpreter state cleanly.

PEP 489 modules are the right path for subinterpreter support (PEP 684) and for free-threaded extensions (PEP 703).

Path-based finder caches

sys.path_importer_cache caches the (PathFinder-chosen) entry finder for each sys.path element. Repeated imports walk the cache, not the filesystem.

When sys.path or a package's __path__ changes, the affected cache entries become stale; the cache is invalidated lazily on import miss.

Hooks

  • sys.path_hooks: callables that turn a path-string into a path-entry finder.
  • sys.audit("import", ...): PEP 578 audit hook; fired on every import.
  • sys.modules is itself observable: assigning into it short-circuits future imports of that name.

CPython 3.14 changes

  • Lazy imports. A new opt-in mode (PEP 690 was withdrawn; 3.14 implements a narrower facility) defers execution of module bodies until first attribute access. The compiler emits marker opcodes; the import machinery installs a thunk module.
  • Frozen-module table expanded to cover more of the stdlib hot path, shaving startup time.
  • Multi-phase init mandatory for free-threaded extensions. Single-phase extensions force re-enable of the GIL on import unless they declare Py_mod_gil = Py_MOD_GIL_NOT_USED.

PEP touchpoints

  • PEP 328. Absolute and relative imports.
  • PEP 420. Namespace packages.
  • PEP 489. Multi-phase extension module init.
  • PEP 552. Hash-based .pyc files.
  • PEP 578. Runtime audit hooks (sys.audit).
  • PEP 703. Py_mod_gil opt-in for free-threaded compatibility.

Reference

  • Python/import.c, Lib/importlib/_bootstrap.py, Lib/importlib/_bootstrap_external.py, Python/frozen_modules/.
  • PEP 328, 420, 489, 552, 578.