pycore_global_strings.h
Pre-interned string storage for CPython's runtime. Every identifier or
special literal that the interpreter needs repeatedly (dunder names, keyword
strings, codec names) is stored once in
_PyRuntime.static_objects.singletons.strings and accessed through the
_Py_ID() or _Py_STR() macros. No allocation happens at call sites; callers
write &_Py_ID(NAME) to borrow a reference to a statically allocated
singleton that lives for the lifetime of the runtime.
Map
| Lines | Symbol | Role |
|---|---|---|
| 18–26 | STRUCT_FOR_ASCII_STR, STRUCT_FOR_STR, STRUCT_FOR_ID | Layout macros that embed a PyASCIIObject plus inline data |
| 31–823 | struct _Py_global_strings | Aggregate of literals and identifiers sub-structs, auto-generated |
| 33–56 | literals | Named special strings: <module>, utf-8, <lambda>, etc. |
| 58–814 | identifiers | Dunder names and other identifiers: __init__ through zstd_dict |
| 815–823 | ascii[128], latin1[128] | Fast single-character string table |
| 830–831 | _Py_ID(NAME) | Macro naming the singleton PyObject for a known identifier; callers borrow it as &_Py_ID(NAME) |
| 832–833 | _Py_STR(NAME) | Macro naming the singleton PyObject for a known literal; callers borrow it as &_Py_STR(NAME) |
| 834–837 | _Py_LATIN1_CHR(CH) | Macro returning a pre-interned single-character string |
| 849 | _Py_DECLARE_STR(name, str) | Declaration marker read by the generator script; expands to nothing at compile time |
Reading
Struct layout macros (lines 18–26)
Each string is stored as an anonymous struct embedding the full
PyASCIIObject header followed by inline character data. Because both are
members of one struct, the compiler lays out the string body immediately
after its header in static storage, with no heap allocation.
// CPython: Include/internal/pycore_global_strings.h:18 STRUCT_FOR_ASCII_STR
#define STRUCT_FOR_ASCII_STR(LITERAL) \
    struct { \
        PyASCIIObject _ascii; \
        uint8_t _data[sizeof(LITERAL)]; \
    }
#define STRUCT_FOR_STR(NAME, LITERAL) \
    STRUCT_FOR_ASCII_STR(LITERAL) _py_ ## NAME;
#define STRUCT_FOR_ID(NAME) \
    STRUCT_FOR_ASCII_STR(#NAME) _py_ ## NAME;
STRUCT_FOR_ID stringifies NAME so the struct's _data array is
sized exactly to the identifier text including the NUL terminator.
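The sizing rule can be checked in isolation. The following is a minimal sketch, not the CPython definitions: demo_header is a hypothetical stand-in for PyASCIIObject, and the DEMO_* macros and demo_ids are names invented for illustration. It mirrors the header-plus-inline-bytes shape and initializes two entries statically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for PyASCIIObject; the real header has more fields. */
typedef struct {
    size_t length;              /* character count, excluding the NUL */
} demo_header;

/* Same shape as STRUCT_FOR_ASCII_STR: a header followed by inline bytes. */
#define DEMO_STRUCT_FOR_ASCII_STR(LITERAL) \
    struct { \
        demo_header _ascii; \
        uint8_t _data[sizeof(LITERAL)]; \
    }

/* #NAME stringifies the token; ## pastes it onto the _py_ field prefix. */
#define DEMO_STRUCT_FOR_ID(NAME) \
    DEMO_STRUCT_FOR_ASCII_STR(#NAME) _py_ ## NAME;

/* Static initializer for one entry (an assumption of this sketch). */
#define DEMO_INIT_ID(NAME) \
    ._py_ ## NAME = { { sizeof(#NAME) - 1 }, #NAME },

static struct demo_identifiers {
    DEMO_STRUCT_FOR_ID(__init__)
    DEMO_STRUCT_FOR_ID(co_name)
} demo_ids = {
    DEMO_INIT_ID(__init__)
    DEMO_INIT_ID(co_name)
};

/* Offset of the inline bytes from the start of the __init__ entry. */
static size_t demo_init_data_offset(void) {
    return (size_t)((const char *)demo_ids._py___init__._data
                    - (const char *)&demo_ids._py___init__);
}
```

In this sketch sizeof(demo_ids._py___init__._data) is 9 (eight characters plus the NUL), and demo_init_data_offset() equals sizeof(demo_header), so the bytes sit directly behind the header.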
Access macros (lines 830–837)
// CPython: Include/internal/pycore_global_strings.h:830 _Py_ID
#define _Py_ID(NAME) \
    (_Py_SINGLETON(strings.identifiers._py_ ## NAME._ascii.ob_base))
#define _Py_STR(NAME) \
    (_Py_SINGLETON(strings.literals._py_ ## NAME._ascii.ob_base))
#define _Py_LATIN1_CHR(CH) \
    ((CH) < 128 \
     ? (PyObject*)&_Py_SINGLETON(strings).ascii[(CH)] \
     : (PyObject*)&_Py_SINGLETON(strings).latin1[(CH) - 128])
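_Py_SINGLETON(NAME) expands to _PyRuntime.static_objects.singletons.NAME, so
_Py_ID(__init__) names a field of the global runtime struct rather than the
result of a hash-table lookup. The macro yields the PyObject itself, not a
pointer, so call sites write &_Py_ID(__init__).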
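The field-access pattern can be reproduced with a toy runtime struct. This is a sketch under stated assumptions: demo_object, demo_ascii, demo_rt, DEMO_SINGLETON, and DEMO_ID are hypothetical names, and the nesting is simplified compared to the real _PyRuntime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for PyObject and PyASCIIObject. */
typedef struct { size_t refcnt; } demo_object;
typedef struct { demo_object ob_base; size_t length; } demo_ascii;

/* Simplified runtime: one statically allocated struct holds every string. */
struct demo_runtime {
    struct {
        struct {
            struct {
                struct { demo_ascii _ascii; unsigned char _data[9]; } _py___init__;
                struct { demo_ascii _ascii; unsigned char _data[8]; } _py___len__;
            } identifiers;
        } strings;
    } singletons;
};

static struct demo_runtime demo_rt = {
    .singletons.strings.identifiers = {
        ._py___init__ = { { { 1 }, 8 }, "__init__" },
        ._py___len__  = { { { 1 }, 7 }, "__len__" },
    },
};

/* Both macros are pure path expansion; nothing is evaluated at run time. */
#define DEMO_SINGLETON(NAME) demo_rt.singletons.NAME
#define DEMO_ID(NAME) \
    (DEMO_SINGLETON(strings.identifiers._py_ ## NAME._ascii.ob_base))

/* A call site borrows the object by taking the address of the macro result. */
static demo_object *demo_borrow_init(void) {
    return &DEMO_ID(__init__);
}
```

Every call to demo_borrow_init() returns the same address, which is why the design suits identity comparison of names: two uses of the same identifier share one object by construction.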
Identifier list structure (lines 58–814)
The identifiers sub-struct is generated by
Tools/build/generate_global_objects.py. Each entry uses STRUCT_FOR_ID,
which stringifies the C token to produce the Python identifier string.
The list spans every dunder method (__abs__ through __xor__), all
co_* code-object attribute names, and common keyword-argument names.
Single-character tables (lines 815–823)
// CPython: Include/internal/pycore_global_strings.h:815 ascii
struct {
    PyASCIIObject _ascii;
    uint8_t _data[2];
} ascii[128];
struct {
    PyCompactUnicodeObject _latin1;
    uint8_t _data[2];
} latin1[128];
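Characters 0–127 use PyASCIIObject; characters 128–255 use
PyCompactUnicodeObject because non-ASCII compact strings carry the extra
UTF-8 cache fields that the ASCII-only header omits. _Py_LATIN1_CHR selects
between the two arrays with a conditional expression: when CH is a
compile-time constant the compiler can fold the branch away, otherwise the
test runs at the call site.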
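The two-table dispatch can be sketched as follows. The header types, demo_chars, DEMO_LATIN1_CHR, and the run-time demo_chars_init() are assumptions made for illustration; CPython initializes its tables statically rather than in a loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical headers: the compact one carries an extra UTF-8 length field. */
typedef struct { uint8_t is_ascii; } demo_ascii_hdr;
typedef struct { demo_ascii_hdr _base; size_t utf8_length; } demo_compact_hdr;

static struct {
    struct { demo_ascii_hdr _ascii; uint8_t _data[2]; } ascii[128];
    struct { demo_compact_hdr _latin1; uint8_t _data[2]; } latin1[128];
} demo_chars;

/* Same dispatch shape as _Py_LATIN1_CHR: index one of two arrays. */
#define DEMO_LATIN1_CHR(CH) \
    ((CH) < 128 \
     ? (void *)&demo_chars.ascii[(CH)] \
     : (void *)&demo_chars.latin1[(CH) - 128])

/* Fill both tables; _data[1] stays 0 from static zero-initialization. */
static void demo_chars_init(void) {
    for (int i = 0; i < 128; i++) {
        demo_chars.ascii[i]._ascii.is_ascii = 1;
        demo_chars.ascii[i]._data[0] = (uint8_t)i;
        demo_chars.latin1[i]._latin1._base.is_ascii = 0;
        demo_chars.latin1[i]._data[0] = (uint8_t)(i + 128);
    }
}

/* Return the inline bytes for a code point in the range 0..255. */
static const uint8_t *demo_char_data(unsigned ch) {
    return ch < 128 ? demo_chars.ascii[ch]._data
                    : demo_chars.latin1[ch - 128]._data;
}
```

After demo_chars_init(), DEMO_LATIN1_CHR('A') points at demo_chars.ascii[65] and DEMO_LATIN1_CHR(0xE9) points at demo_chars.latin1[0xE9 - 128]. Splitting the table keeps the 128 ASCII entries on the smaller header.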
gopy notes
gopy stores interned strings in objects/str.go using a Go sync.Map
keyed by string content. The _Py_ID pattern maps naturally to a
package-level var holding a pre-interned *StrObject, initialized in
an init() function. The single-character Latin-1 table corresponds to
the latin1 array in objects/str.go.
The _Py_DECLARE_STR macro (line 849) expands to nothing in CPython; it only
marks string literals for the generator script. gopy has no equivalent and
does not need one.
CPython 3.14 changes
Python 3.11 introduced this header as a replacement for the older
_Py_Identifier linked-list mechanism. By 3.14 the identifier list had been
extended with __annotate__, __annotate_func__,
__annotations_cache__, __conditional_annotations__,
__firstlineno__, __static_attributes__, _strptime_datetime_date,
_strptime_datetime_time, and zstd_dict, reflecting new language
features and stdlib additions in 3.12 through 3.14.