Imports
import is a Python protocol implemented mostly in Python.
importlib does the work; the C side is a thin entry that boots
importlib from a frozen module and exposes a handful of hooks.
The protocol is built around two abstractions. finders locate
a module by name; loaders turn the found module into a
populated ModuleType. The whole stack is configurable.
Where the code lives
| File | Role |
|---|---|
Python/import.c | C-level entry points; sys.modules; the frozen importlib boot. |
Lib/importlib/_bootstrap.py | Core import machinery. Frozen. |
Lib/importlib/_bootstrap_external.py | Filesystem-backed finders and loaders. Frozen. |
Python/frozen_modules/ | Pre-compiled .pyc images of bootstrap modules. |
Modules/_importlib.h (generated) | The frozen _bootstrap and _bootstrap_external bytecode. |
The protocol
The Python-level entry is importlib.__import__. Its job, given
a fully qualified name:
- Check
sys.modules. If the name is present (and not None), return that. - Walk
sys.meta_path. Ask each meta-path finderfind_spec(name, path, target). The first non-None result wins. - The returned
ModuleSpeccarries a loader. Callloader.create_module(spec)(often returns None, meaning "default new module"), thenloader.exec_module(module), which runs the module's code in the module's namespace. - Install the module in
sys.modulesbefore executing its body so circular imports see a partially populated object.
sys.modules is the canonical cache; nothing else needs to be
consulted on a hit.
Finders
sys.meta_path = [
BuiltinImporter, # built-in modules linked into the binary
FrozenImporter, # frozen modules (importlib boot, stdlib subset)
PathFinder, # filesystem and other path-based locations
]
Each is a class with find_spec(name, path, target).
BuiltinImporterchecks the static table of built-in modules compiled into the interpreter (_io,sys,builtins, ...).FrozenImporterchecks the frozen-module table (everything inPython/frozen_modules/).PathFinderwalkssys.path(for top-level imports) or the parent package's__path__(for submodules). It consultssys.path_hooksto find a path entry finder for each path.
PathFinder plus the default FileFinder is what handles 99%
of imports.
Loaders
A loader is anything implementing exec_module(module) (and
optionally create_module). The standard set:
| Loader | Source |
|---|---|
SourceFileLoader | .py files (with .pyc caching). |
SourcelessFileLoader | .pyc files only. |
ExtensionFileLoader | .so / .pyd extension modules. |
BuiltinImporter | Built-in modules. |
FrozenImporter | Frozen modules. |
SourceFileLoader.exec_module does:
- Read the
.pysource (or load.pyccache if fresh). - Compile to bytecode if no fresh
.pycexists. - Execute the bytecode in
module.__dict__.
The .pyc cache is keyed on a header (magic, flags, source
mtime or hash, source size). A mismatch triggers a recompile.
The bootstrap problem
importlib._bootstrap is itself a Python module. It cannot be
imported by the import system before the import system is
running. The solution is frozen modules: at CPython build time,
Tools/build/freeze_modules.py compiles a fixed set of Python
modules to bytecode and embeds the result in the binary as C
arrays. At startup:
Python/import.c::_PyImport_BootstrapImpreads the embedded bytecode for_bootstrap.- The frozen
_bootstrapmodule is executed, defining__import__,ModuleSpec,BuiltinImporter,FrozenImporter. _bootstrap_external(also frozen) is loaded similarly to installPathFinderand the file-based loaders.- From this point,
importis fully functional and the C side delegates to Python.
The frozen subset includes the stdlib modules os, posixpath,
io, zipimport, and a handful of others, because they are
needed during startup before PathFinder can load anything from
disk.
sys.modules
A dict mapping fully qualified names to module objects. The
import system reads it first, populates it during a load, and
removes failed entries on exec_module failure. Setting
sys.modules[name] = None is the standard way to mark a name as
deliberately unimportable.
Packages
A package is a module whose __path__ is a sequence of
directories. Submodules are imported by PathFinder walking
parent.__path__. There are two flavours:
- Regular packages. A directory containing
__init__.py. The init module runs when the package is first imported. - Namespace packages (PEP 420). A directory without
__init__.py. Multiple directories can contribute submodules to the same package name;__path__is a special_NamespacePaththat updates as more locations are added tosys.path.
ModuleType and ModuleSpec
/* Objects/moduleobject.c */
typedef struct {
PyObject_HEAD
PyObject *md_dict; /* __dict__ */
struct PyModuleDef *md_def; /* PyModuleDef for C extensions */
void *md_state; /* per-module state for C extensions */
PyObject *md_weaklist;
PyObject *md_name;
} PyModuleObject;
ModuleSpec (Python-level) carries the metadata needed to load
the module: name, loader, origin, parent, submodule search
locations. After loading, the spec is attached to the module as
module.__spec__.
Extension modules
C extensions can be single-phase or multi-phase:
- Single-phase init. The traditional API.
PyInit_foo()returns a fully populatedPyModuleObject. Limitation: the module is shared across subinterpreters and cannot be safely reloaded. - Multi-phase init (PEP 489).
PyInit_foo()returns aPyModuleDefwith slots; the import system calls the slots in order to create and execute the module. Multi-phase modules can be loaded into per-interpreter state cleanly.
PEP 489 modules are the right path for subinterpreter support (PEP 684) and for free-threaded extensions (PEP 703).
Path-based finder caches
sys.path_importer_cache caches the (PathFinder-chosen) entry
finder for each sys.path element. Repeated imports walk the
cache, not the filesystem.
When sys.path or a package's __path__ changes, the affected
cache entries become stale; the cache is invalidated lazily on
import miss.
Hooks
sys.path_hooks: callables that turn a path-string into a path-entry finder.sys.audit("import", ...): PEP 578 audit hook; fired on every import.sys.modulesis itself observable: assigning into it short-circuits future imports of that name.
CPython 3.14 changes
- Lazy imports. A new opt-in mode (PEP 690 was withdrawn; 3.14 implements a narrower facility) defers execution of module bodies until first attribute access. The compiler emits marker opcodes; the import machinery installs a thunk module.
- Frozen-module table expanded to cover more of the stdlib hot path, shaving startup time.
- Multi-phase init mandatory for free-threaded extensions.
Single-phase extensions force re-enable of the GIL on import
unless they declare
Py_mod_gil = Py_MOD_GIL_NOT_USED.
PEP touchpoints
- PEP 328. Absolute and relative imports.
- PEP 420. Namespace packages.
- PEP 489. Multi-phase extension module init.
- PEP 552. Hash-based
.pycfiles. - PEP 578. Runtime audit hooks (
sys.audit). - PEP 703.
Py_mod_gilopt-in for free-threaded compatibility.
Reference
Python/import.c,Lib/importlib/_bootstrap.py,Lib/importlib/_bootstrap_external.py,Python/frozen_modules/.- PEP 328, 420, 489, 552, 578.