Symbol table
Before the compiler can emit a single opcode for a name, it has to
know what kind of name it is: LOAD_FAST for a function local;
LOAD_DEREF for a closure cell or free variable; LOAD_GLOBAL for a
global referenced from inside a function; LOAD_NAME for module-level
code (including the REPL) and class bodies, where the lookup goes
through the namespace dict at run time. The symbol table is the answer
to "what kind". Each scope (module, function, class, lambda,
comprehension, PEP 695 type-parameter scope) becomes one
PySTEntryObject, and every name that appears in that scope is classified.
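The mapping from name kind to opcode can be observed from Python with the dis module. The following is a minimal sketch, not part of the compiler itself: the source string and the helper names find_code and opnames_for are made up for illustration. Exact opcode names vary between CPython versions (LOAD_FAST has several specialised variants), so the sketch prints what it finds instead of assuming a fixed spelling.

```python
import dis
import types

SRC = '''
g = 1
h = g
def outer():
    cell = 2
    def inner(param):
        return param + cell + g
    return inner
'''

def find_code(code, name):
    """Depth-first search of co_consts for the nested code object called `name`."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            if const.co_name == name:
                return const
            found = find_code(const, name)
            if found is not None:
                return found
    return None

def opnames_for(code, name):
    """Opcode names in `code` whose argument refers to `name`."""
    names = set()
    for ins in dis.get_instructions(code):
        arg = ins.argval
        # Some versions fuse two loads into one instruction with a tuple argument.
        if arg == name or (isinstance(arg, tuple) and name in arg):
            names.add(ins.opname)
    return names

module_code = compile(SRC, "<demo>", "exec")
inner_code = find_code(module_code, "inner")

for name in ("param", "cell", "g"):
    print(name, sorted(opnames_for(inner_code, name)))
print("g at module level", sorted(opnames_for(module_code, "g")))
```

Inside inner, the parameter is read with a LOAD_FAST variant, the captured variable with LOAD_DEREF, and the global with LOAD_GLOBAL; the same global read at module level goes through LOAD_NAME.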
Where the code lives
| File | Role | Entry points |
|---|---|---|
| Python/symtable.c | The whole symbol table. Two passes: collect, then analyse. | _PySymtable_Build, symtable_analyze |
| Include/internal/pycore_symtable.h | symtable, PySTEntryObject, scope and symbol flags. | struct symtable, PySTEntryObject |
The driver:
/* Python/symtable.c:413 _PySymtable_Build */
struct symtable *
_PySymtable_Build(mod_ty mod, PyObject *filename, _PyFutureFeatures *future);
_PySymtable_Build allocates the table, runs the collect pass, runs
the analyse pass, and returns the populated struct symtable. The
compiler reads from it once per scope through
compiler_enter_scope.
The two passes
The collect pass walks the AST top-down. For each scope it pushes
a fresh PySTEntryObject and recurses into the body. Each name
the scope mentions gets a bitmask of uses recorded in
ste->ste_symbols. A name on the left of = gets DEF_LOCAL; a
global x statement adds DEF_GLOBAL; a nonlocal x adds
DEF_NONLOCAL; an import x adds DEF_IMPORT; reading a name
adds USE.
/* Python/symtable.c symtable_visit_stmt */
static int symtable_visit_stmt(struct symtable *st, stmt_ty s);
static int symtable_visit_expr(struct symtable *st, expr_ty e);
The collect pass does not resolve names. It only records which names appear where and how each one is referenced.
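What the collect pass records can be inspected through the public symtable module, which wraps the same entries. This is a sketch under assumptions: the source string and the helpers find_child and recorded are hypothetical, and is_declared_global reflects the later analyse pass, not the raw DEF_GLOBAL bit.

```python
import symtable

SRC = '''
def f(p):
    global counter
    import os
    total = p
    counter = total
    return os
'''

def find_child(table, name):
    """Return the child scope called `name` (looked up by name, not position)."""
    for child in table.get_children():
        if child.get_name() == name:
            return child
    raise LookupError(name)

top = symtable.symtable(SRC, "<demo>", "exec")
f_table = find_child(top, "f")

def recorded(name):
    """Summarise how `name` is used inside f."""
    sym = f_table.lookup(name)
    return {
        "parameter": sym.is_parameter(),
        "assigned": sym.is_assigned(),
        "imported": sym.is_imported(),
        "referenced": sym.is_referenced(),
        "declared_global": sym.is_declared_global(),
    }

for name in ("p", "os", "total", "counter"):
    print(name, recorded(name))
```

Each predicate corresponds to one kind of use: p is a parameter that is read, os is bound by an import, total is assigned and read, and counter is assigned under a global declaration but never read.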
The analyse pass runs after collection:
/* Python/symtable.c:1369 symtable_analyze */
static int symtable_analyze(struct symtable *st);
Analyse walks the scope tree from the outside in. For each scope it determines, for every recorded name, whether the name is local (defined here), free (used here but defined in an enclosing function), cell (defined here and used by an inner function), explicit global, or implicit global. The classification is encoded in the upper bits of the symbol value:
symbol bits 0..11: DEF_* flags from the collect pass
symbol bits 12..15: scope (LOCAL, GLOBAL_EXPLICIT, GLOBAL_IMPLICIT, FREE, CELL)
SYMBOL_TO_SCOPE(symbol) extracts the scope; _PyST_GetScope is
the convenience entry point.
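The packed symbol value can be decoded from Python as well. The sketch below assumes the private _symtable module, which is a CPython implementation detail and may change between releases; it reads the shift and mask from the module's own SCOPE_OFF and SCOPE_MASK constants instead of hard-coding them. The helper names scope_of and child are hypothetical.

```python
import _symtable  # private CPython module; an implementation detail

SRC = '''
g = 0
def outer():
    captured = 1
    plain = 2
    def inner():
        return captured + g
    return inner, plain
'''

SCOPE_NAMES = {
    _symtable.LOCAL: "LOCAL",
    _symtable.GLOBAL_EXPLICIT: "GLOBAL_EXPLICIT",
    _symtable.GLOBAL_IMPLICIT: "GLOBAL_IMPLICIT",
    _symtable.FREE: "FREE",
    _symtable.CELL: "CELL",
}

def scope_of(value):
    """Extract the scope field from a packed symbol value."""
    return SCOPE_NAMES[(value >> _symtable.SCOPE_OFF) & _symtable.SCOPE_MASK]

def child(entry, name):
    """Find a child entry by name."""
    return next(c for c in entry.children if c.name == name)

top = _symtable.symtable(SRC, "<demo>", "exec")
outer = child(top, "outer")
inner = child(outer, "inner")

for entry in (outer, inner):
    for name, value in sorted(entry.symbols.items()):
        print(entry.name, name, scope_of(value))
```

In this example captured comes out as CELL in outer and FREE in inner, plain stays LOCAL, and g is GLOBAL_IMPLICIT inside inner.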
The entry object
/* Include/internal/pycore_symtable.h:88 PySTEntryObject */
typedef struct _symtable_entry {
    PyObject_HEAD
    PyObject *ste_id;
    PyObject *ste_symbols;   /* dict: name -> int(flags|scope) */
    PyObject *ste_name;
    PyObject *ste_varnames;  /* list: formal parameter names */
    PyObject *ste_children;  /* list: child PySTEntryObjects */
    _Py_block_ty ste_type;   /* ModuleBlock, FunctionBlock, ClassBlock, ... */
    unsigned ste_generator : 1;
    unsigned ste_coroutine : 1;
    unsigned ste_nested : 1;
    unsigned ste_comp_inlined : 1;
    unsigned ste_method : 1;
    unsigned ste_has_docstring : 1;
    unsigned ste_returns_value : 1;
    unsigned ste_varargs : 1;
    unsigned ste_varkeywords : 1;
    _Py_comprehension_ty ste_comprehension;
    /* ... */
} PySTEntryObject;
Each entry is itself a refcounted PyObject because the compiler needs
to keep them around per scope: a parent owns its children through
ste_children, and the same objects back the Python-level symtable module.
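The parent-child structure of the entries is visible through the public symtable module. The following sketch walks the tree for a small made-up source; the walk helper is hypothetical, and str() is applied to get_type() because newer versions return an enum whose string value is the block type.

```python
import symtable

SRC = '''
class Shape:
    def area(self):
        def helper():
            return 0
        return helper()

def main():
    return Shape().area()
'''

def walk(table, depth=0):
    """Yield (depth, block type, name, is_nested) for a table and its descendants."""
    yield depth, str(table.get_type()), table.get_name(), table.is_nested()
    for child in table.get_children():
        yield from walk(child, depth + 1)

rows = list(walk(symtable.symtable(SRC, "<demo>", "exec")))
for depth, kind, name, nested in rows:
    print("  " * depth + f"{kind} {name} nested={nested}")
```

Only helper reports nested=True here, because it is the only scope that sits inside a function; depending on the CPython version, additional implicit scopes may also show up in the listing.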
Symbol flags
/* Include/internal/pycore_symtable.h:157 */
#define DEF_GLOBAL 1
#define DEF_LOCAL 2
#define DEF_PARAM (2<<1)
#define DEF_NONLOCAL (2<<2)
#define USE (2<<3)
#define DEF_FREE_CLASS (2<<5)
#define DEF_IMPORT (2<<6)
#define DEF_ANNOT (2<<7)
#define DEF_COMP_ITER (2<<8)
#define DEF_TYPE_PARAM (2<<9)
#define DEF_COMP_CELL (2<<10)
The flags compose. A name annotated and assigned in the same scope
gets DEF_LOCAL | DEF_ANNOT; USE is added only if the name is also
read there. The analyse pass uses these bits to choose a final scope:
a name with DEF_LOCAL is a local unless DEF_GLOBAL or DEF_NONLOCAL is
also set; a name with USE but no DEF_* is either free or implicit
global, depending on what enclosing scopes do with it.
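The composition of flags can be checked against a live interpreter. This sketch again assumes the private _symtable module (an implementation detail of CPython) and uses its exported constants; flag_names is a hypothetical helper. In the sample source, x is annotated and assigned but never read, while y is also read.

```python
import _symtable  # private CPython module; an implementation detail

SRC = '''
def f():
    x: int = 1
    y: int = 2
    return y
'''

FLAGS = {
    "DEF_GLOBAL": _symtable.DEF_GLOBAL,
    "DEF_LOCAL": _symtable.DEF_LOCAL,
    "DEF_PARAM": _symtable.DEF_PARAM,
    "DEF_NONLOCAL": _symtable.DEF_NONLOCAL,
    "USE": _symtable.USE,
    "DEF_IMPORT": _symtable.DEF_IMPORT,
    "DEF_ANNOT": _symtable.DEF_ANNOT,
}

def flag_names(value):
    """Names of the collect-pass flags set in a packed symbol value."""
    low_bits = value & ((1 << _symtable.SCOPE_OFF) - 1)  # drop the scope field
    return {name for name, bit in FLAGS.items() if low_bits & bit}

top = _symtable.symtable(SRC, "<demo>", "exec")
f_entry = next(c for c in top.children if c.name == "f")

for name in ("x", "y"):
    print(name, sorted(flag_names(f_entry.symbols[name])))
```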
Block types
/* Include/internal/pycore_symtable.h _Py_block_ty */
typedef enum _block_type {
    FunctionBlock,
    ClassBlock,
    ModuleBlock,
    AnnotationBlock,
    TypeVariableBlock,    /* PEP 695 */
    TypeAliasBlock,       /* PEP 695 */
    TypeParametersBlock,  /* PEP 695 */
} _Py_block_ty;
Class blocks are the odd one out: a class scope is not a function, so a name defined in a class body is not visible to nested functions defined inside it. The analyse pass enforces this by treating class blocks as opaque for free-variable resolution; a name used in a method is resolved against the enclosing function scopes (if any) and then the module, never against the enclosing class.
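The effect of skipping class scopes is easy to see at run time. In this small illustration (all names are made up), the method's bare reference to label bypasses the class body entirely.

```python
label = "module"

class Widget:
    label = "class"

    def describe(self):
        # The class scope is skipped during resolution, so this name
        # resolves to the binding outside the class.
        return label

    def describe_explicit(self):
        # Reaching the class-level binding needs an attribute access.
        return self.label

print(Widget().describe())           # module
print(Widget().describe_explicit())  # class
```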
Annotation blocks (PEP 649) and type-parameter blocks (PEP 695) are recent additions. Each gets its own analyse path because the visibility rules differ:
- Annotation blocks see the enclosing scope at evaluation time; they do not see other annotations in the same class body.
- Type-parameter blocks introduce a fresh scope that wraps the function or class; that wrapping scope holds the generic parameters as locals.
Resolution algorithm
The analyse pass walks the scope tree. For each scope it builds a
comp_free set (names used and not bound locally) and walks its
children with that set, then walks the children themselves.
The decision for a name n in scope s is roughly:
if DEF_GLOBAL in flags[n]:
    scope = GLOBAL_EXPLICIT
elif DEF_NONLOCAL in flags[n]:
    require enclosing function with a local binding for n
    scope = FREE
elif DEF_LOCAL in flags[n]:
    if n is used by some inner function:
        scope = CELL
    else:
        scope = LOCAL
elif USE in flags[n]:
    walk outwards, skipping class scopes
    if some enclosing function has a binding for n:
        scope = FREE
    else:
        scope = GLOBAL_IMPLICIT
comp_free propagates upwards: when an inner function uses a name
defined in an outer function, the outer function must mark the
binding as a cell so that the inner function can reach it through
LOAD_DEREF. The analyse pass makes a second walk to upgrade LOCAL to
CELL for any name whose definition is referenced as free by an inner
scope.
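The decision procedure can be turned into a small executable model. The sketch below is a toy, not the code in Python/symtable.c: the Scope class, the helper names, and the flag values are all assumptions chosen for illustration, and it folds the cell upgrade into a single recursive check instead of a second walk. It also only promotes to CELL in function scopes, which the pseudocode leaves implicit.

```python
# Illustrative flag values; only their distinctness matters in this model.
DEF_GLOBAL, DEF_LOCAL, DEF_NONLOCAL, USE = 1, 2, 8, 16

class Scope:
    """Toy stand-in for a symbol table entry."""
    def __init__(self, kind, flags, parent=None):
        self.kind = kind            # "module", "function" or "class"
        self.flags = flags          # name -> bitmask of the flags above
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def enclosing_binding(scope, name):
    """Nearest enclosing function that binds `name`, skipping class scopes."""
    outer = scope.parent
    while outer is not None:
        bits = outer.flags.get(name, 0)
        if outer.kind == "function":
            if bits & DEF_GLOBAL:
                return None         # an explicit global ends the search
            if bits & DEF_LOCAL:
                return outer
        outer = outer.parent
    return None

def captured_below(scope, name):
    """True if some descendant refers to `name` without rebinding it."""
    for child in scope.children:
        bits = child.flags.get(name, 0)
        rebinds = bits & (DEF_LOCAL | DEF_GLOBAL)
        if child.kind == "function" and rebinds:
            continue                # shadowed for this whole subtree
        if bits and not rebinds:
            return True
        if captured_below(child, name):
            return True
    return False

def resolve(scope, name):
    bits = scope.flags[name]
    if bits & DEF_GLOBAL:
        return "GLOBAL_EXPLICIT"
    if bits & DEF_NONLOCAL:
        if enclosing_binding(scope, name) is None:
            raise SyntaxError(f"no binding for nonlocal '{name}' found")
        return "FREE"
    if bits & DEF_LOCAL:
        if scope.kind == "function" and captured_below(scope, name):
            return "CELL"
        return "LOCAL"
    if enclosing_binding(scope, name) is not None:
        return "FREE"
    return "GLOBAL_IMPLICIT"

module = Scope("module", {"g": DEF_LOCAL})
outer = Scope("function", {"x": DEF_LOCAL, "g": USE}, module)
klass = Scope("class", {"x": DEF_LOCAL}, outer)
method = Scope("function", {"x": USE, "g": USE}, klass)

print(resolve(outer, "x"))   # CELL
print(resolve(klass, "x"))   # LOCAL
print(resolve(method, "x"))  # FREE
print(resolve(method, "g"))  # GLOBAL_IMPLICIT
```

The method's x resolves to the binding in outer, not the one in the class body, which is the class-skipping rule in action.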
Comprehensions
PEP 709 inlined comprehensions in 3.12. The symbol table still
treats a comprehension as a nested scope for purposes of name
resolution (so the iteration variable does not leak), but the
compiler emits the body inline rather than as a nested code
object. The ste_comp_inlined flag on the comprehension's own entry
records that it was inlined. Names that would have been cells of the
comprehension's own scope instead become cells in the parent, marked
DEF_COMP_CELL.
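Both halves of this can be checked from Python: the symbol table still reports a nested scope for the comprehension, and the iteration variable does not leak at run time. This is a sketch with made-up names (find_child, build); it avoids asserting anything about inlining itself, since that depends on the interpreter version.

```python
import symtable

SRC = '''
def build():
    squares = [i * i for i in range(3)]
    return squares
'''

def find_child(table, name):
    """Return the child scope called `name`."""
    for child in table.get_children():
        if child.get_name() == name:
            return child
    raise LookupError(name)

top = symtable.symtable(SRC, "<demo>", "exec")
build_table = find_child(top, "build")
comp_table = find_child(build_table, "listcomp")
print(comp_table.get_name(), sorted(comp_table.get_identifiers()))

def build():
    squares = [i * i for i in range(3)]
    # The iteration variable is not bound in the function afterwards.
    return squares, "i" in locals()

print(build())
```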
PEP 695 type parameters
PEP 695 added a new generic syntax:
def foo[T: int](x: T) -> T: ...

class Container[T]:
    items: list[T]

type Alias[T] = list[T]
Each generic introduces a TypeParametersBlock. The block holds
the type parameter names as locals; the function or class body is
a child block that sees the type parameters as free variables.
TypeVariableBlock is a further nested scope used for the bound
or constraint expressions on a type parameter, which need to see
the type parameters declared earlier in the same list.
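The extra scope can be observed with the symtable module on interpreters that accept PEP 695 syntax (3.12 and later). The sketch below keeps the generic code in a string so the file still parses on older versions; find_with_parent and the sample function tag are hypothetical names.

```python
import sys
import symtable

SRC = '''
def tag[T](value: T):
    return (T, value)
'''

def find_with_parent(table, name, parent=None):
    """Return (scope called `name`, its parent scope), searching depth-first."""
    if table.get_name() == name:
        return table, parent
    for child in table.get_children():
        found = find_with_parent(child, name, table)
        if found is not None:
            return found
    return None

if sys.version_info >= (3, 12):
    top = symtable.symtable(SRC, "<demo>", "exec")
    tag_table, params_table = find_with_parent(top, "tag")
    print("wrapping scope:", params_table.get_name())
    print("T local in wrapping scope:", params_table.lookup("T").is_local())
    print("T free in function body:", tag_table.lookup("T").is_free())
```

The function's parent is the type-parameter scope, not the module, and T is bound there while the function body sees it as a free variable.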
Class-level free variables
Methods that reference their own class (usually through super() or
__class__) are handled with a special __class__ cell. The class-body
scope marks __class__ as a cell at the end of the analyse pass; every
method that uses super() or __class__ then holds a free reference to
that cell. This is the mechanism that makes argumentless super() work.
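The __class__ cell shows up directly on the compiled methods. In this small illustration (class names are made up), only the method that calls super() carries the free variable.

```python
class Base:
    def who(self):
        return "base"

class Child(Base):
    def who(self):
        # Argumentless super() makes the compiler close over __class__.
        return "child+" + super().who()

    def plain(self):
        return 1

print(Child.who.__code__.co_freevars)                   # ('__class__',)
print(Child.plain.__code__.co_freevars)                 # ()
print(Child.who.__closure__[0].cell_contents is Child)  # True
```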
The compiler's view
The compiler reads from the symbol table in two places:
- Scope selection. compile_lookup_arg chooses the right opcode for a name (LOAD_FAST, LOAD_GLOBAL, LOAD_DEREF, LOAD_NAME, LOAD_FROM_DICT_OR_DEREF, LOAD_FROM_DICT_OR_GLOBALS).
- Slot layout. The compiler reads the entry's ste_varnames and the analysed scope information to determine how many fast locals, cells, and free slots the code object needs. This drives the co_nlocalsplus field on the PyCodeObject.
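The resulting slot layout is visible on code objects from Python. co_nlocalsplus itself is a C-level field, so this sketch sums the three public tuples as an approximation; that sum is an assumption that holds here because no parameter is also a cell. The function names are made up.

```python
def outer(a, b):
    captured = a + b
    scratch = captured * 2
    def inner():
        return captured
    return inner, scratch

outer_code = outer.__code__
inner_code = outer(1, 2)[0].__code__

print("fast locals:", outer_code.co_varnames)
print("cells:", outer_code.co_cellvars)
print("free in inner:", inner_code.co_freevars)

# Approximate total slot count for outer: fast locals + cells + frees.
total_slots = (len(outer_code.co_varnames)
               + len(outer_code.co_cellvars)
               + len(outer_code.co_freevars))
print("slots:", total_slots)
```

captured lands in the cell section, not among the fast locals, because the symbol table classified it as CELL.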
PEP touchpoints
- PEP 526. Annotated assignments produce DEF_ANNOT.
- PEP 649. Annotation blocks defer evaluation; the symbol table introduces AnnotationBlock for the lazy wrapper.
- PEP 695. TypeVariableBlock, TypeAliasBlock, TypeParametersBlock.
- PEP 709. Inlined comprehensions; ste_comp_inlined and DEF_COMP_CELL.
Reference
- Python/symtable.c, Include/internal/pycore_symtable.h.
- PEP 227. Statically nested scopes.
- PEP 3104. Access to names in outer scopes.
- PEP 695. Type parameter syntax.
- PEP 709. Inlined comprehensions.