Python/ast.c
cpython 3.14 @ ab2d84fe1023/Python/ast.c
AST validation and docstring extraction. _PyAST_Validate is called
in debug builds after a successful parse to check invariants that the
PEG grammar cannot express: legal constant types, valid expression
contexts, unique keyword names in match patterns, and structural
constraints on comprehensions, walrus targets, and starred expressions.
_PyAST_GetDocString extracts the leading string literal from a
function, class, or module body.
The bulk of the file is a set of validate_* functions, one per AST
node family. They recurse through the tree and return 0 on the first
violation. The validator has no effect on correct parses; it exists to
catch hand-built or malformed ASTs early, with a clear
error message instead of a crash in the compiler.
Map
| Lines | Symbol | Role | gopy |
|---|---|---|---|
| 14-51 | recursion guard macro, forward declarations | validate_stmts, validate_exprs, validate_patterns, validate_type_params forward decls plus recursion depth check. | compile/ast_validate.go |
| 52-156 | validate_name, validate_comprehension, validate_keywords, validate_args, validate_arguments | Structural validators for sub-nodes shared by multiple parent kinds. | compile/ast_validate.go |
| 157-208 | validate_constant | Verify that Constant.value is one of the legal Python constant types. | compile/ast_validate.go:validateConstant |
| 210-416 | validate_expr | Switch over every expr_ty kind; checks context, targets, and structural invariants. | compile/ast_validate.go:validateExpr |
| 417-538 | ensure_literal_* helpers | Validate that match-case constant patterns contain only literal values. | compile/ast_validate.go |
| 539-701 | validate_capture, validate_pattern, validate_pattern_match_value | Match-statement pattern tree validation, including star_ok threading. | compile/ast_validate.go:validatePattern |
| 702-953 | validate_assignlist, validate_body, validate_stmt | Statement validation: switch over all statement kinds, check targets, bodies, and handlers. | compile/ast_validate.go:validateStmt |
| 954-1045 | validate_stmts, validate_exprs, validate_patterns, validate_typeparam, validate_type_params | Sequence validators and PEP 695 type parameter validation. | compile/ast_validate.go |
| 1047-1075 | _PyAST_Validate | Public entry. Dispatches on mod kind and drives the recursive walk. | compile/ast_validate.go:Validate |
| 1077-1091 | _PyAST_GetDocString | Extract body[0] if it is Expr(Constant(str)). | compile/codegen_stmt.go:getDocString |
Reading
validate_constant (lines 157 to 208)
cpython 3.14 @ ab2d84fe1023/Python/ast.c#L157-208
static int
validate_constant(struct validator *state, PyObject *value)
{
    if (value == Py_None || value == Py_Ellipsis) {
        return 1;
    }
    if (PyBool_Check(value)) {
        return 1;
    }
    if (PyLong_CheckExact(value) || PyFloat_CheckExact(value) ||
        PyComplex_CheckExact(value) || PyUnicode_CheckExact(value) ||
        PyBytes_CheckExact(value)) {
        return 1;
    }
    if (PyTuple_CheckExact(value)) {
        Py_ssize_t i;
        for (i = 0; i < PyTuple_GET_SIZE(value); i++) {
            if (!validate_constant(state, PyTuple_GET_ITEM(value, i))) {
                return 0;
            }
        }
        return 1;
    }
    if (PyFrozenSet_Check(value)) {
        ...
        return 1;
    }
    PyErr_Format(PyExc_TypeError,
                 "got an invalid type in Constant: %s",
                 _PyType_Name(Py_TYPE(value)));
    return 0;
}
The legal constant types are None, Ellipsis, bool (checked
separately because the exact int check does not accept the bool
subclass), int, float, complex, str, bytes, tuple (validated
recursively), and frozenset. Any other type raises TypeError naming
the offending type. In practice this fires only when code constructs an
ast.Constant node by hand with an illegal value; the PEG parser never
emits a bad constant.
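The rules described above can be modelled in a few lines of Python. This is only a sketch, not the C implementation: is_legal_constant is a hypothetical helper name, and because the excerpt elides the frozenset branch, the sketch assumes frozenset members are checked the same way as tuple items.

```python
# Exact-type matches, mirroring the *_CheckExact calls in the excerpt.
_EXACT_SCALARS = (int, float, complex, str, bytes)


def is_legal_constant(value):
    """Return True if value looks like an acceptable Constant payload (sketch)."""
    if value is None or value is Ellipsis:
        return True
    if isinstance(value, bool):
        return True
    if type(value) in _EXACT_SCALARS:
        return True
    if type(value) is tuple:
        # Tuples are legal only if every item is legal, recursively.
        return all(is_legal_constant(item) for item in value)
    if isinstance(value, frozenset):
        # Assumption: members are validated like tuple items.
        return all(is_legal_constant(item) for item in value)
    return False


print(is_legal_constant((1, 2.0, ("a", b"b", None))))  # True
print(is_legal_constant((1, [2])))                     # False: list inside tuple
```

Using type(value) rather than isinstance for the scalar types keeps the sketch as strict as the exact checks in the excerpt, so a user-defined int subclass is not accepted.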
validate_expr (lines 210 to 416)
cpython 3.14 @ ab2d84fe1023/Python/ast.c#L210-416
The switch covers all expr_ty kinds. Key invariants enforced here:
- Starred expressions carry their own ctx, which must match the context the parent expects; the wrapped value is validated in that same context.
- NamedExpr (walrus) targets must have Store context; the target must be a plain Name.
- Yield and YieldFrom are structural; the function checks they appear in a context that allows them (the context comes from the enclosing validate_stmt call passing through the recursion).
- Lambda bodies must be a single expression without a return annotation; validate_arguments checks the argument defaults.
- IfExp (ternary) validates all three sub-expressions.
- Comprehension generators are checked via validate_comprehension, which verifies the is_async flag and nested if guards.
Most validation is structural rather than semantic. Type errors, undefined names, and scope issues are caught at runtime or by the symtable.
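A context check of this kind can be observed from Python. The sketch below assumes that CPython's compile() runs the validator on AST objects passed to it and reports a context mismatch as ValueError or TypeError; compile_rejects is an illustrative helper name, not part of any API.

```python
import ast


def compile_rejects(tree):
    """Return True if compile() raises ValueError or TypeError for the tree."""
    try:
        compile(tree, "<ast>", "exec")
    except (ValueError, TypeError):
        return True
    return False


# A tree straight from the parser passes validation.
good = ast.parse("x")

# Same source, but the Name in a Load position is given a Store context.
bad = ast.parse("x")
bad.body[0].value.ctx = ast.Store()

print(compile_rejects(good))  # False
print(compile_rejects(bad))   # True
```

Mutating a parsed tree rather than building one from scratch keeps the location attributes intact, so the only defect the validator can see is the context.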
validate_pattern (lines 539 to 701)
cpython 3.14 @ ab2d84fe1023/Python/ast.c#L539-701
The match-statement pattern tree has its own validator separate from
validate_expr. Notable checks:
- MatchMapping patterns reject **rest where the rest variable is named _ (the anonymous wildcard). The keys list must not be empty and must contain only literal or attribute-access patterns.
- MatchClass rejects duplicate keyword argument names in the pattern. For example, case Foo(x=1, x=2) is caught here.
- MatchSequence threads a star_ok flag through its element list to ensure at most one MatchStar appears; a second star raises SyntaxError.
- MatchAs with a None pattern is the wildcard (case _); any other MatchAs must have a capture name.
_PyAST_GetDocString (lines 1077 to 1091)
cpython 3.14 @ ab2d84fe1023/Python/ast.c#L1077-1091
PyObject *
_PyAST_GetDocString(asdl_stmt_seq *body)
{
    if (!asdl_seq_LEN(body)) {
        return NULL;
    }
    stmt_ty st = asdl_seq_GET(body, 0);
    if (st->kind != Expr_kind) {
        return NULL;
    }
    expr_ty e = st->v.Expr.value;
    if (e->kind == Constant_kind && PyUnicode_CheckExact(e->v.Constant.value)) {
        return e->v.Constant.value;
    }
    return NULL;
}
Checks that body[0] is an Expr statement containing a Constant
with a str value. Returns the string object (borrowed reference) or
NULL. Called from compile.c when entering a function or class body.
If a docstring is found, the compiler emits it as the first LOAD_CONST
and it becomes co_consts[0]. The compile.c caller is responsible
for the _PyCompile_CleanDoc call that strips leading indentation.
In the gopy port, getDocString lives in
compile/codegen_stmt.go and is called from the function and class
body emit path.
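The same three checks translate directly to the Python-level ast module. The sketch below uses a hypothetical helper, leading_docstring, and compares its raw result with the stdlib ast.get_docstring, which cleans indentation by default; it illustrates why the cleaning step is a separate responsibility of the caller.

```python
import ast


def leading_docstring(body):
    """Return the str held by a leading Expr(Constant(str)), else None."""
    if not body:
        return None
    first = body[0]
    if not isinstance(first, ast.Expr):
        return None
    value = first.value
    if isinstance(value, ast.Constant) and type(value.value) is str:
        return value.value
    return None


src = 'def f():\n    """Doc line.\n\n    Indented detail.\n    """\n    return 1\n'
func = ast.parse(src).body[0]

raw = leading_docstring(func.body)     # indentation still present
cleaned = ast.get_docstring(func)      # indentation stripped

print(repr(raw))
print(repr(cleaned))
```

A body whose first statement is anything other than a bare string expression, such as an assignment or an f-string, yields None here, matching the NULL returns in the C function.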
_PyAST_Validate (lines 1047 to 1075)
cpython 3.14 @ ab2d84fe1023/Python/ast.c#L1047-1075
The public entry. Initialises a struct validator (holds the current
recursion depth and a reference to the thread state), then dispatches
on mod->kind:
- Module_kind: calls validate_stmts on the body.
- Interactive_kind: calls validate_stmts on the body.
- Expression_kind: calls validate_expr on the single expression.
- FunctionType_kind: validates argument types and the return annotation.
Any validate_* call returning 0 leaves an exception set by that
call, usually ValueError or TypeError, with SystemError reserved for
node kinds the validator does not recognise. The recursion guard inside
each validate_* function raises RecursionError if the depth exceeds the
interpreter's limit, matching the behaviour of the compiler and eval loop.
In gopy, Validate in compile/ast_validate.go is gated by a build
tag so production builds skip the pass entirely.
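The four mod kinds correspond to the mode argument of ast.parse, which makes the dispatch easy to explore from Python. The snippet below is a small illustration; root_kind is a hypothetical helper and the source strings are arbitrary.

```python
import ast


def root_kind(text, mode):
    """Return the class name of the root node ast.parse produces for a mode."""
    return type(ast.parse(text, mode=mode)).__name__


sources = {
    "exec": "x = 1\n",
    "single": "x = 1\n",
    "eval": "x + 1",
    "func_type": "(int, str) -> bool",
}

for mode, text in sources.items():
    print(mode, "->", root_kind(text, mode))
```

On a FunctionType root, the argument types and the return annotation mentioned above are exposed as the argtypes and returns fields.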
Notes for the gopy mirror
- compile/ast_validate.go is the direct port of the validate_* family. The file is compiled only under the debugvalidate build tag.
- compile/codegen_stmt.go:getDocString mirrors _PyAST_GetDocString. It returns the *ast.Constant value rather than a PyObject *.
- The ensure_literal_* helpers are inlined into the pattern validator in gopy rather than kept as separate functions.
- validate_typeparam and validate_type_params (PEP 695) are included in the port but only exercised when type-alias and generic-function ASTs are present.
CPython 3.14 changes worth noting
- PEP 695 (validate_typeparam, validate_type_params, lines 1003-1045) was added in 3.12 and extended in 3.13. It validates TypeVar, ParamSpec, and TypeVarTuple nodes inside TypeAlias and generic function/class definitions.
- The MatchClass duplicate-keyword check (inside validate_pattern) was added as a bug fix in 3.12 and is present in all later versions.
- validate_constant gained PyFrozenSet_Check support in 3.10 alongside the match statement; the check is unchanged through 3.14.