1600. gopy: Porting CPython's Python/ to Go
Goal
gopy is a fresh re-implementation of CPython's interpreter core in Go. The
target is 100% behavioural compatibility with the upstream CPython 3.14-era
sources at $HOME/github/python/cpython. That means same data structures,
same models, same code logic, same wire formats, same error messages. The
only change is naming and surface API style, which adopts Go-idiomatic
conventions modelled on the Go standard library.
This is not a clean-room reimagining. It is a line-by-line port. When
behaviour deviates from CPython, the bug is in the port, not in CPython.
The CPython source-of-truth folder is cpython/Python/ (about 138k lines of
C across 91 .c files plus ~30 .h files). The Go target is tamnd/gopy,
currently at v0.9.0; v0.10 is in flight on feat/v0.10.0-gc.
Non-goals
- No new features. No improved API. No "better" GC.
- No Python 2 support.
- No alternative implementations (we are not PyPy / Cinder / GraalPython).
- No partial Python. The goal is to run unmodified CPython 3.14 stdlib.
- C extension compatibility (PyObject* ABI) is out of scope. We will not load
  .so modules. C extensions are reimplemented in Go on demand.
Sources of truth
| Concern | Source |
| --- | --- |
| Runtime semantics | cpython/Python/*.c, cpython/Include/internal/*.h |
| Object semantics | cpython/Objects/*.c |
| Parser / lexer | cpython/Parser/* |
| Stdlib | cpython/Lib/* |
| Tests | cpython/Lib/test/* |
| Spec authority for ambiguity | the C source, not the docs |
This 1600-series covers cpython/Python/, cpython/Objects/, and
cpython/Parser/. The Objects port lives in the 1670-1689 sub-block
(formerly numbered 1700-series, renumbered to keep one folder).
The Parser port lives in the 1640-1645 sub-block. Stdlib ports
are tracked in their own spec series.
High-level architecture

                 ┌────────────────────────────────────────────────┐
                 │                 gopy/cmd/gopy                  │
                 │          (entry point, like python.c)          │
                 └────────────────────┬───────────────────────────┘
                                      │
                 ┌────────────────────▼───────────────────────────┐
                 │                 gopy/lifecycle                 │
                 │       Initialize / Finalize / NewInterp        │
                 └─┬──────────────┬──────────────┬────────────────┘
                   │              │              │
         ┌─────────▼──┐    ┌──────▼──────┐  ┌────▼────────┐
         │ initconfig │    │ imp         │  │ pythonrun   │
         │ preconfig  │    │ importlib   │  │ REPL/eval   │
         │ pathconfig │    │ marshal     │  │ pyc rd/wr   │
         └────────────┘    └──────┬──────┘  └────┬────────┘
                                  │              │
                 ┌────────────────▼──────────────▼─────────────────┐
                 │                   gopy/state                    │
                 │  Runtime · Interpreter · Thread · CrossInterp   │
                 └─┬──────────────┬─────────────────┬──────────────┘
                   │              │                 │
        ┌──────────▼─┐    ┌───────▼──────┐   ┌──────▼─────────┐
        │ gopy/      │    │ gopy/        │   │ gopy/          │
        │ vm         │    │ compile      │   │ gc             │
        │ (ceval,    │    │ ast/sym/     │   │ cycle coll,    │
        │  frame,    │    │ codegen/     │   │ refcount,      │
        │  uops)     │    │ flowgraph/   │   │ arena, brc,    │
        │            │    │ assemble     │   │ qsbr, weakref  │
        └──┬─────────┘    └──┬───────────┘   └────────────────┘
           │                 │
    ┌──────▼────────┐  ┌─────▼──────────┐
    │ gopy/         │  │ gopy/          │
    │ specialize    │  │ tokenize       │
    │ optimizer     │  │ parser (sep.)  │
    │ jit (deferred)│  └────────────────┘
    │ monitor       │
    └───────────────┘

Cross-cutting: gopy/pysync, gopy/hash, gopy/pytime,
gopy/format, gopy/pystrconv, gopy/codecs,
gopy/errors, gopy/traceback, gopy/warnings,
gopy/contextvar, gopy/hamt, gopy/hashtable,
gopy/intrinsics, gopy/structmember, gopy/getargs,
gopy/modsupport, gopy/builtin, gopy/sysmod,
gopy/tracemalloc, gopy/audit, gopy/monitor
Spec files in this series
Implemented (spec written and code shipped)
| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1600 | 1600_gopy_overview.md | This file | meta |
| 1601 | 1601_gopy_naming.md | Naming conventions: C to Go translation rules | meta |
| 1602 | 1602_gopy_filemap.md | C source to Go package mapping (every file) | meta |
| 1603 | 1603_gopy_roadmap.md | Phased milestone plan v0.1 to v1.0 | meta |
| 1630 | 1630_gopy_vm_overview.md | VM block overview (Tier-1 interpreter) | meta |
| 1640 | 1640_gopy_parser_overview.md | Parser block overview | meta |
v0.1 — arena and sync
| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1604 | 1604_gopy_arena.md | pyarena.c port | v0.1 |
| 1605 | 1605_gopy_pythread.md | thread.c cross-platform port | v0.1 |
| 1606 | 1606_gopy_pysync.md | lock.c, parking_lot.c, critical_section.c | v0.1 |
| 1607 | 1607_gopy_hashsecret.md | bootstrap_hash.c seed init | v0.1 |
v0.3 — errors and traceback
| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1611 | 1611_gopy_errors.md | errors.c plus the BaseException gating subset | v0.3 |
v0.4 — strings, numbers, hash
| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1660 | 1660_gopy_strings_numbers.md | pyctype, pystrcmp, mystrtoul, pystrtod, dtoa, pystrhex, pymath, pyfpe, formatter_unicode | v0.4 |
| 1661 | 1661_gopy_hash.md | pyhash.c (SipHash-1-3, FNV-1a) | v0.4 |
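
As a rough illustration of the SipHash-1-3 row above, the sketch below implements the generic SipHash-c-d construction and calls it with c=1, d=3. It assumes that PYTHONHASHSEED=0 corresponds to an all-zero 128-bit key; the way CPython derives the key from other seeds and feeds str/bytes payloads into the function is not shown, and `sipHash` is a hypothetical name, not the gopy API.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// sipRound applies one SipRound permutation to the four state words.
func sipRound(v0, v1, v2, v3 uint64) (uint64, uint64, uint64, uint64) {
	v0 += v1
	v1 = bits.RotateLeft64(v1, 13)
	v1 ^= v0
	v0 = bits.RotateLeft64(v0, 32)
	v2 += v3
	v3 = bits.RotateLeft64(v3, 16)
	v3 ^= v2
	v0 += v3
	v3 = bits.RotateLeft64(v3, 21)
	v3 ^= v0
	v2 += v1
	v1 = bits.RotateLeft64(v1, 17)
	v1 ^= v2
	v2 = bits.RotateLeft64(v2, 32)
	return v0, v1, v2, v3
}

// sipHash computes SipHash-c-d of msg under the 128-bit key (k0, k1):
// c compression rounds per 8-byte word, d finalization rounds.
func sipHash(c, d int, k0, k1 uint64, msg []byte) uint64 {
	v0 := k0 ^ 0x736f6d6570736575
	v1 := k1 ^ 0x646f72616e646f6d
	v2 := k0 ^ 0x6c7967656e657261
	v3 := k1 ^ 0x7465646279746573

	n := len(msg)
	for ; len(msg) >= 8; msg = msg[8:] {
		m := binary.LittleEndian.Uint64(msg)
		v3 ^= m
		for i := 0; i < c; i++ {
			v0, v1, v2, v3 = sipRound(v0, v1, v2, v3)
		}
		v0 ^= m
	}

	// Last word: leftover bytes in little-endian order, with the message
	// length (mod 256) in the top byte.
	b := uint64(n) << 56
	for i, x := range msg {
		b |= uint64(x) << (8 * uint(i))
	}
	v3 ^= b
	for i := 0; i < c; i++ {
		v0, v1, v2, v3 = sipRound(v0, v1, v2, v3)
	}
	v0 ^= b

	v2 ^= 0xff
	for i := 0; i < d; i++ {
		v0, v1, v2, v3 = sipRound(v0, v1, v2, v3)
	}
	return v0 ^ v1 ^ v2 ^ v3
}

func main() {
	// Assumption: an all-zero key stands in for PYTHONHASHSEED=0.
	fmt.Printf("%016x\n", sipHash(1, 3, 0, 0, []byte("abc")))
}
```

Keeping c and d as parameters lets the same routine be checked against the published SipHash-2-4 reference vectors before it is trusted in its 1-3 configuration.
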
v0.5 / v0.5.5 — compiler and parser
| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1620 | 1620_gopy_compile_pipeline.md | ast, asdl, future, symtable, codegen, flowgraph, assemble, compile, instruction_sequence, ast_preprocess, ast_unparse | v0.5 |
| 1625 | 1625_gopy_compile_testing.md | Per-checkbox test plan for 1620 and 1665 | v0.5 |
| 1626 | 1626_gopy_codegen.md | codegen.c port detail | v0.5 |
| 1627 | 1627_gopy_flowgraph.md | flowgraph.c port detail (CFG, passes; stackdepth + super-instr deferred) | v0.5 |
| 1628 | 1628_gopy_assemble.md | assemble.c port detail | v0.5 |
| 1629 | 1629_gopy_compile_goldens.md | Disassembly golden corpus for v05test | v0.5 |
| 1641 | 1641_gopy_lexer_tokenizer.md | Parser/lexer/, Parser/tokenizer/ | v0.5.5 |
| 1642 | 1642_gopy_pegen.md | pegen.c, parser.c, generated PEG runtime | v0.5.5 |
| 1643 | 1643_gopy_parser_errors.md | pegen_errors.c, action_helpers.c, peg_api.c, token.c | v0.5.5 |
| 1644 | 1644_gopy_string_parser.md | string_parser.c (f-string, t-string, bytes) | v0.5.5 |
v0.6 — VM Tier-1
| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1621 | 1621_gopy_bytecodes_dsl.md | bytecodes.c DSL parser + Go-emitting generator | v0.6 |
| 1635 | 1635_gopy_intrinsics.md | intrinsics.c (CALL_INTRINSIC_1 / 2 dispatch) | v0.6 |
| 1636 | 1636_gopy_eval_loop.md | ceval.c, ceval_macros.h, opcode dispatch loop | v0.6 |
| 1637 | 1637_gopy_frame.md | frame.c, frame layout, locals, generator state | v0.6 |
| 1638 | 1638_gopy_stackref.md | stackrefs.c, tagged stack values | v0.6 |
| 1639 | 1639_gopy_eval_gil.md | ceval_gil.c, GIL, eval breaker, signal bridge | v0.6 |
v0.7 — lifecycle, sys, builtins, warnings
| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1622 | 1622_gopy_lifecycle.md | pylifecycle, preconfig, initconfig, pathconfig | v0.7 |
| 1624 | 1624_gopy_pythonrun.md | RunString / RunFile / REPL | v0.7 |
| 1651 | 1651_gopy_modules.md | builtins, sys, _warnings subsets | v0.7 |
v0.8 — marshal, import, codecs; Module and set objects
| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1681 | 1681_gopy_set.md | setobject.c (set, frozenset) | v0.8 |
| 1686 | 1686_gopy_exceptions.md | exceptions.c: ImportError / ModuleNotFoundError hierarchy | v0.8 |
| 1688 | 1688_gopy_module_misc.md | moduleobject.c (Module type, name / doc / file / loader / spec) | v0.8 |
| 1690 | 1690_gopy_marshal.md | marshal.c: TYPE_LONG, FLAG_REF, TYPE_CODE, TYPE_SET, TYPE_DICT, TYPE_COMPLEX, .pyc header (PEP 552) | v0.8 |
| 1691 | 1691_gopy_import.md | import.c, frozen.c: sys.modules cache, inittab, frozen table, ExecCodeModule, source/.pyc loaders, ImportModuleLevel, IMPORT_NAME/FROM | v0.8 |
| 1692 | 1692_gopy_codecs.md | codecs.c: registry, error handlers, built-in utf-8 / ascii / latin-1 codecs | v0.8 |
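
To make the ".pyc header (PEP 552)" entry concrete, here is a minimal sketch of parsing the 16-byte header that PEP 552 describes: a 4-byte magic, a 4-byte little-endian flags word, then either mtime plus source size (timestamp-based) or an 8-byte source hash (hash-based). The type and function names are hypothetical and the magic bytes in `main` are placeholders, not the real 3.14 magic.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// pycHeader models the 16-byte .pyc prefix laid out by PEP 552.
// Names here are illustrative only, not the gopy API.
type pycHeader struct {
	Magic      [4]byte // version magic; last two bytes are "\r\n"
	Flags      uint32  // bit 0: hash-based, bit 1: check_source
	Mtime      uint32  // set for timestamp-based pycs only
	SourceSize uint32  // set for timestamp-based pycs only
	SourceHash [8]byte // set for hash-based pycs only
}

// HashBased reports whether the pyc is validated by source hash.
func (h pycHeader) HashBased() bool { return h.Flags&0x1 != 0 }

// CheckSource reports whether a hash-based pyc must be checked against source.
func (h pycHeader) CheckSource() bool { return h.Flags&0x2 != 0 }

// parsePycHeader decodes the first 16 bytes of b.
func parsePycHeader(b []byte) (pycHeader, error) {
	var h pycHeader
	if len(b) < 16 {
		return h, errors.New("pyc header: need at least 16 bytes")
	}
	copy(h.Magic[:], b[0:4])
	h.Flags = binary.LittleEndian.Uint32(b[4:8])
	if h.Flags&^0x3 != 0 {
		return h, errors.New("pyc header: unknown flag bits set")
	}
	if h.HashBased() {
		copy(h.SourceHash[:], b[8:16])
	} else {
		h.Mtime = binary.LittleEndian.Uint32(b[8:12])
		h.SourceSize = binary.LittleEndian.Uint32(b[12:16])
	}
	return h, nil
}

func main() {
	// Placeholder magic bytes; the real value is version-specific.
	raw := []byte{0x01, 0x02, '\r', '\n'}
	raw = binary.LittleEndian.AppendUint32(raw, 0)          // flags: timestamp-based
	raw = binary.LittleEndian.AppendUint32(raw, 1700000000) // source mtime
	raw = binary.LittleEndian.AppendUint32(raw, 42)         // source size
	h, err := parsePycHeader(raw)
	fmt.Println(h.Flags, h.Mtime, h.SourceSize, err)
}
```
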
Written, partial scaffold (spec written, some code shipped, full panel pending)
| # | File | Focus | Phase |
| --- | --- | --- | --- |
| 1665 | 1665_gopy_tokenize.md | Python-tokenize.c public iterator surface | v0.5 / v0.9 |
| 1670 | 1670_gopy_objects_overview.md | Objects block overview (1670-1689) | meta |
| 1671 | 1671_gopy_object_protocol.md | Object interface, Header, VarHeader, refcount | v0.2 |
| 1672 | 1672_gopy_type.md | Type, slots, MRO, lookup | v0.2 |
| 1683 | 1683_gopy_abstract.md | abstract.c subset (PyObject_, PyNumber_) | v0.2+ |
Written, pending implementation
v0.9 — contextvars, time, remaining VM bytecodes, runtime helpers (shipped)
Tag v0.9.0 published 2026-05-06. Tracker rows kept here for the
file-by-file map; full release notes live in changelog/v0.9.0.md.
| # | File | Focus | Status | Phase |
| --- | --- | --- | --- | --- |
| 1634 | 1634_gopy_monitor.md | sys.monitoring + sys.settrace / setprofile | W | v0.9+ |
| 1645 | 1645_gopy_myreadline.md | myreadline.c, interactive readline editing | W | v0.9+ |
| 1662 | 1662_gopy_hamt.md | hamt.c, HAMT backing store for contextvars | S | v0.9 |
| 1663 | 1663_gopy_context.md | context.c, _contextvars.c, PEP 567 contextvars | S | v0.9 |
| 1664 | 1664_gopy_time.md | pytime.c, monotonic clock, conversions, deadline math | S | v0.9 |
| 1668 | 1668_gopy_runtime_helpers.md | getopt.c CLI option parser plus hashtable.c generic table | S | v0.9 |
| 1693 | 1693_gopy_vm_remaining.md | IMPORT_, RETURN_GENERATOR / YIELD / SEND, MATCH_, WITH_EXCEPT_START, BUILD_SET / SET_ADD | S | v0.9 |
v0.10 — cycle GC, weakrefs, finalizers (in flight)
Branch feat/v0.10.0-gc. Spec status legend: W = spec written,
no code. C = code shipped, tests pending. S = code + tests shipped.
| # | File | Focus | Status | Phase |
| --- | --- | --- | --- | --- |
| 1613 | 1613_gopy_gc.md | gc.c full collector (generations, weakref clearing, finalizer queue) plus gc_gil.c, object_stack.c | W | v0.10 |
| 1666 | 1666_gopy_tracemalloc.md | allocation tracing | W | v0.10 |
| 1689 | 1689_gopy_obj_misc.md | weakrefobject.c rows pulled forward to feed cycle clearing | W | v0.10 |
v0.11+ — specialization, optimizer, debug
| # | File | Focus | Phase |
| --- | --- | --- | --- |
| 1631 | 1631_gopy_specialize.md | PEP 659 adaptive specialization | v0.11 |
| 1632 | 1632_gopy_optimizer.md | Tier-2 trace projector + abstract interp | v0.12 |
| 1633 | 1633_gopy_jit.md | Copy-and-patch JIT (deferred) | post-v1.0 |
| 1667 | 1667_gopy_remote_debug.md | remote debugging hooks | v0.13 |
Objects block — pending (code lands incrementally v0.2-v0.9)
| # | File | Focus | Phase |
| --- | --- | --- | --- |
| 1673 | 1673_gopy_long.md | longobject.c (PyLong, small-int cache) | v0.2 |
| 1674 | 1674_gopy_float_complex.md | floatobject.c (v0.2), complexobject.c (v0.6) | v0.2 / v0.6 |
| 1675 | 1675_gopy_bool_none.md | boolobject.c, None, NotImplemented, Ellipsis | v0.2 |
| 1676 | 1676_gopy_bytes.md | bytesobject.c, bytearrayobject.c, bytes_methods.c | v0.4 |
| 1677 | 1677_gopy_unicode.md | unicodeobject.c, unicodectype.c | v0.4 |
| 1678 | 1678_gopy_tuple.md | tupleobject.c, empty-tuple singleton | v0.2 |
| 1679 | 1679_gopy_list.md | listobject.c, list_resize curve, Timsort | v0.2 |
| 1680 | 1680_gopy_dict.md | dictobject.c, odictobject.c | v0.2 |
| 1682 | 1682_gopy_slice_range.md | sliceobject.c, rangeobject.c | v0.2 |
| 1684 | 1684_gopy_call.md | call.c, vectorcall | v0.6 |
| 1685 | 1685_gopy_descr_method.md | descrobject.c, methodobject.c, classobject.c, funcobject.c | v0.4 / v0.6 |
| 1687 | 1687_gopy_code_frame_gen.md | codeobject.c, frameobject.c, genobject.c, cellobject.c | v0.5.5 / v0.6 |
| 1689 | 1689_gopy_obj_misc.md | weakref, memoryview, typevar, union, GenericAlias, Interpolation, Template, obmalloc | v0.9+ |
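
The "list_resize curve" entry is a good example of what "same code logic" means in practice. The sketch below reproduces the over-allocation arithmetic as it appears in recent upstream listobject.c, to the best of our reading; the exact expression should be checked against the pinned CPython sources. `listGrow` and `appendCapacities` are hypothetical names, not gopy identifiers.

```go
package main

import "fmt"

// listGrow returns the capacity to allocate when a list of oldSize elements
// must hold newSize elements. It mirrors the mild over-allocation formula in
// CPython's list_resize (assumed here; verify against the pinned sources).
func listGrow(oldSize, newSize int) int {
	if newSize == 0 {
		return 0
	}
	// Grow by roughly 1/8 plus a small constant, rounded down to a multiple of 4.
	newAlloc := (newSize + (newSize >> 3) + 6) &^ 3
	// If the jump in size is larger than the slack just computed, do not
	// over-allocate beyond rounding newSize up to a multiple of 4.
	if newSize-oldSize > newAlloc-newSize {
		newAlloc = (newSize + 3) &^ 3
	}
	return newAlloc
}

// appendCapacities simulates n single-element appends to an empty list and
// records each capacity chosen when a reallocation is needed.
func appendCapacities(n int) []int {
	var caps []int
	size, allocated := 0, 0
	for i := 0; i < n; i++ {
		if size+1 > allocated {
			allocated = listGrow(size, size+1)
			caps = append(caps, allocated)
		}
		size++
	}
	return caps
}

func main() {
	fmt.Println(appendCapacities(65)) // [4 8 16 24 32 40 52 64 76]
}
```

A port that substituted Go's own slice growth policy here would still be a correct list, but allocation-sensitive observations such as sys.getsizeof would drift from CPython.
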
Reserved (spec not yet written)
| # | File (planned) | Focus | Phase |
| --- | --- | --- | --- |
| 1612 | 1612_gopy_traceback.md | traceback.c data and formatting | v0.3 (retro) |
| 1614 | 1614_gopy_brc.md | brc.c biased refcount field layout | v0.3+ |
| 1615 | 1615_gopy_state.md | pystate.c Runtime / Interpreter / Thread | v0.3+ |
| 1698 | 1698_gopy_quirks.md | Cross-cutting quirks the porter must preserve | meta |
| 1699 | 1699_gopy_glossary.md | Glossary: C term to Go term mapping | meta |
Compatibility floors (what "100% compatible" means in practice)
The port is graded on the following observable surfaces. Each must match
CPython byte-for-byte, except where noted:
- Bytecode: same opcode numbers, same oparg encoding, same EXTENDED_ARG
  widening, same exception table format, same line-number table
  (co_linetable) format, same cache layout. dis.dis(f) output identical.
- Marshal: marshal.dumps(obj) produces identical bytes for the same
  object graph. .pyc files produced by gopy are loadable by CPython and
  vice versa, including version-magic-number compatibility.
- Hash: SipHash-1-3 with the same key-derivation from the seed,
  producing identical hash(x) for str/bytes/numeric. (PYTHONHASHSEED=0
  gives deterministic match.)
- Eval semantics: every observable behaviour of eval + exec matches:
  exception types, exception messages (string-equal), traceback frame order,
  __cause__/__context__ chains, generator state, async iteration order.
- Built-in module attributes: sys.flags, sys.implementation.cache_tag
  (gopy uses its own cache tag, see Quirks), sys.version_info, and
  sys.path semantics.
- Import: importlib._bootstrap runs to completion. import foo finds
  modules by the same rules. __pycache__ layout is identical.
- Repr / format: repr(obj) and format(obj, spec) produce identical
  strings for builtins. Float repr uses shortest-roundtrip dtoa.
- Error messages: exception constructors produce identical str(exc)
  for identical inputs. (This is a high bar but a non-negotiable test
  target.)
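
To illustrate the "oparg encoding" and "EXTENDED_ARG widening" floor: CPython bytecode is a sequence of 2-byte code units (opcode, 8-bit oparg), and an oparg wider than 8 bits is carried by up to three EXTENDED_ARG prefix units, most significant byte first. The sketch below encodes and decodes a single instruction under that rule. The opcode numbers are placeholders (the real ones come from the opcode table of the targeted CPython version), inline cache entries are not modelled, and the function names are hypothetical.

```go
package main

import "fmt"

// Placeholder opcode numbers for illustration only; they are NOT the
// CPython 3.14 values.
const (
	opExtendedArg byte = 0xFE
	opLoadConst   byte = 0x01
)

// encodeInstr appends one instruction to dst as 2-byte code units, emitting
// EXTENDED_ARG prefixes for every byte of oparg above the low 8 bits.
func encodeInstr(dst []byte, opcode byte, oparg uint32) []byte {
	for shift := uint(24); shift >= 8; shift -= 8 {
		// Once a higher byte has been emitted, all lower prefix bytes
		// (even zero ones) must follow, which this condition guarantees.
		if oparg>>shift != 0 {
			dst = append(dst, opExtendedArg, byte(oparg>>shift))
		}
	}
	return append(dst, opcode, byte(oparg))
}

// decodeInstr reads one instruction from the start of code, folding any
// EXTENDED_ARG prefixes into oparg. n is the number of bytes consumed.
func decodeInstr(code []byte) (opcode byte, oparg uint32, n int, ok bool) {
	for n+1 < len(code) {
		op, arg := code[n], code[n+1]
		n += 2
		oparg = oparg<<8 | uint32(arg)
		if op != opExtendedArg {
			return op, oparg, n, true
		}
	}
	return 0, 0, n, false // truncated instruction
}

func main() {
	code := encodeInstr(nil, opLoadConst, 0x1234)
	fmt.Printf("% x\n", code) // fe 12 01 34
	op, arg, n, ok := decodeInstr(code)
	fmt.Println(op, arg, n, ok)
}
```
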
Items where we intentionally diverge (recorded in 1698_gopy_quirks.md):
- sys.implementation.name is "gopy", not "cpython".
- sys.implementation.cache_tag is "gopy-3140" so .pyc files do not collide.
- gc.is_finalized and friends behave per CPython, but the underlying
  mechanism uses Go's GC plus an emulated refcount/cycle layer (see 1613).
- C extension loading (importlib.machinery.ExtensionFileLoader) is disabled
  by default; only Go-native extension modules load.
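
The cache-tag divergence decides where compiled files land inside __pycache__. The sketch below shows the naming rule in the style of importlib.util.cache_from_source, ignoring optimization suffixes and the pycache-prefix option. `cacheFromSource` is a hypothetical helper, and the forward-slash `path` package is used only to keep the example platform-independent.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// cacheFromSource maps a source path to its bytecode cache path:
// <dir>/__pycache__/<stem>.<tag>.pyc
func cacheFromSource(src, tag string) string {
	dir, file := path.Split(src)
	stem := strings.TrimSuffix(file, path.Ext(file))
	return path.Join(dir, "__pycache__", stem+"."+tag+".pyc")
}

func main() {
	// Same directory, different tags, so the two runtimes never overwrite
	// each other's cached bytecode.
	fmt.Println(cacheFromSource("/srv/app/foo.py", "gopy-3140"))
	// /srv/app/__pycache__/foo.gopy-3140.pyc
	fmt.Println(cacheFromSource("/srv/app/foo.py", "cpython-314"))
	// /srv/app/__pycache__/foo.cpython-314.pyc
}
```
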
Test strategy
- The CPython test suite (Lib/test/) is the reference oracle.
- Phase 0 ships a "smoke" subset: test_grammar, test_builtin, test_dis,
  test_marshal, test_compile, test_dict, test_list, test_int,
  test_str, test_exceptions. Once these pass, broaden.
- Bytecode-level tests: dis(f) round-trip equivalence between gopy and
  reference CPython, executed in CI.
- Hash-stability tests with PYTHONHASHSEED=0.
- A compat/ subdirectory at the gopy root holds CPython-cross tests that run
  the same Python source under both runtimes and diff outputs.
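
One possible shape for a compat/ cross test is sketched below: run one script under both interpreters, capture the combined output, and report the first differing line. The binary names (`python3.14`, `gopy`) and the script path are assumptions made for illustration, and the real harness may look quite different.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// firstDiff returns the 1-based number of the first line at which a and b
// differ, or 0 when the two outputs are identical.
func firstDiff(a, b []byte) int {
	la := bytes.Split(a, []byte("\n"))
	lb := bytes.Split(b, []byte("\n"))
	for i := 0; i < len(la) || i < len(lb); i++ {
		if i >= len(la) || i >= len(lb) || !bytes.Equal(la[i], lb[i]) {
			return i + 1
		}
	}
	return 0
}

// runScript runs script under the given interpreter binary and returns
// stdout and stderr interleaved. A non-zero exit status is reported as err
// alongside whatever output was produced.
func runScript(interp, script string) ([]byte, error) {
	return exec.Command(interp, script).CombinedOutput()
}

func main() {
	// Hypothetical binary names and script path.
	const script = "compat/cases/hello.py"
	want, errRef := runScript("python3.14", script)
	got, errGot := runScript("gopy", script)
	if errRef != nil || errGot != nil {
		fmt.Println("run failed:", errRef, errGot)
		return
	}
	if line := firstDiff(want, got); line != 0 {
		fmt.Printf("%s: outputs differ at line %d\n", script, line)
		return
	}
	fmt.Printf("%s: outputs match\n", script)
}
```

Comparing combined output keeps traceback text in scope, which matters because exception messages are part of the compatibility floor.
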