89. Making a CPython Patch
GitHub workflow: forking, branching, opening a PR, responding to review, and the CLA requirement.
A CPython patch is a coordinated change to the interpreter, runtime, standard library, tests, documentation, and development metadata. Even a small fix may affect object lifetime, bytecode behavior, platform compatibility, import state, or public APIs.
Making a good CPython patch requires more than changing code. The patch must preserve runtime invariants, include regression tests, follow repository conventions, build correctly across platforms, and explain user-visible behavior clearly.
89.1 Understand the Existing Behavior First
Before changing code, reproduce and understand the current behavior.
For a bug:
reproduce the failure
reduce to a minimal case
identify expected behavior
find where the incorrect behavior begins
For a feature:
understand current semantics
read related implementation paths
inspect tests
inspect documentation
identify compatibility constraints
A patch written before the surrounding subsystem is understood often introduces secondary bugs.
89.2 Build CPython Locally
Always work from a local build.
Typical setup:
git clone https://github.com/python/cpython.git
cd cpython
./configure --with-pydebug
make -j8
Run the freshly built interpreter (the binary is named python.exe on macOS):
./python
Do not use the system Python while modifying CPython internals.
The local interpreter ensures:
correct runtime
correct stdlib
correct extension modules
correct ABI
correct bytecode
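One way to confirm that a session is using the local build rather than the system Python is to ask the interpreter itself. The sketch below is only an illustration: `describe_interpreter` is a hypothetical helper name, and it assumes that `sys.gettotalrefcount` is normally present only in builds configured with `--with-pydebug`.

```python
import sys
import sysconfig


def describe_interpreter():
    """Return a few facts that identify the running interpreter."""
    return {
        # Path of the binary that is executing this code.
        "executable": sys.executable,
        "version": sys.version.split()[0],
        # sys.gettotalrefcount normally exists only in debug builds.
        "debug_build": hasattr(sys, "gettotalrefcount"),
        # Py_DEBUG is recorded at configure time; it may be None on
        # some platforms, so it is normalized to a bool here.
        "py_debug_config": bool(sysconfig.get_config_var("Py_DEBUG")),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running this file with `./python` should report an executable inside the checkout; if it reports a path such as `/usr/bin/python3`, the system interpreter is being used by mistake.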
89.3 Create a Branch
Use a dedicated branch.
Example:
git checkout -b fix-dict-resize
Good branch names are short and descriptive.
Examples:
fix-gc-cycle-crash
optimize-vectorcall-path
improve-import-error-message
add-test-for-recursion-limit
Avoid generic names such as:
patch
changes
work
update
89.4 Find the Relevant Code
CPython is large. Start by locating the relevant subsystem.
Useful directories:
| Directory | Purpose |
|---|---|
| Objects/ | Built-in object implementations |
| Python/ | Interpreter and compiler |
| Parser/ | Parsing |
| Modules/ | Built-in and extension modules |
| Lib/ | Standard library |
| Include/ | Public and internal headers |
| Doc/ | Documentation |
| Lib/test/ | Tests |
Examples:
| Problem | Likely area |
|---|---|
| Dict behavior | Objects/dictobject.c |
| List behavior | Objects/listobject.c |
| Bytecode execution | Python/ceval.c, Python/bytecodes.c (3.12 and later) |
| Compiler behavior | Python/compile.c |
| Import logic | Python/import.c, Lib/importlib/ |
| GC behavior | Modules/gcmodule.c (Python/gc.c in 3.13 and later) |
| Unicode internals | Objects/unicodeobject.c |
Use:
git grep keyword
Example:
git grep PyObject_GC_Track
89.5 Read Existing Tests First
Before writing new code, inspect existing tests.
Example:
grep -n "resize" Lib/test/test_dict.py
Existing tests show:
expected behavior
edge cases
historical regressions
platform assumptions
testing conventions
Often the correct patch location becomes clearer after reading tests.
89.6 Write the Test Before the Fix
For regressions, write the failing test first.
Example workflow:
reproduce bug
write minimal failing test
confirm failure
implement fix
confirm success
A regression test should fail before the patch and pass after it.
Example:
def test_resize_preserves_items(self):
    d = {}
    for i in range(1000):
        d[i] = i
    for i in range(1000):
        self.assertEqual(d[i], i)
Keep the test minimal. It should isolate the broken behavior.
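As a rough sketch of how such a method sits inside a runnable test file, the block below wraps it in a `unittest.TestCase`. The class name `DictResizeTests` and the helper `run_regression_tests` are made up for this illustration; in the CPython tree the method would instead be added to an existing class in the relevant file under Lib/test/.

```python
import unittest


class DictResizeTests(unittest.TestCase):
    """Hypothetical container for the regression test shown above."""

    def test_resize_preserves_items(self):
        d = {}
        # Insert enough keys to force several internal resizes.
        for i in range(1000):
            d[i] = i
        self.assertEqual(len(d), 1000)
        # Every key must still map to its original value.
        for i in range(1000):
            self.assertEqual(d[i], i)


def run_regression_tests():
    """Run only this test case and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(DictResizeTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_regression_tests()
```

On an interpreter that contains the bug, `run_regression_tests().wasSuccessful()` should be false; after the fix it should be true.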
89.7 Make Small Changes First
Prefer the smallest correct change.
Bad approach:
rewrite large subsystem
refactor unrelated code
rename many symbols
change formatting everywhere
fix bug simultaneously
Better:
small targeted fix
minimal supporting cleanup
focused regression test
Small patches are easier to:
review
debug
backport
bisect
revert
reason about
Large unrelated cleanup should usually be separate.
89.8 Preserve Existing Invariants
CPython internals depend on many invariants.
Examples:
valid reference counts
correct GC tracking state
exception set on NULL return
borrowed references remain valid
matching alloc/free domains
stable frame state during execution
When modifying runtime code, explicitly ask:
Who owns this reference?
Can this object be collected here?
Can this API fail?
What happens on error cleanup?
Is this object tracked by GC?
Can another thread observe this state?
Many runtime bugs come from violating one of these implicit assumptions.
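Two of these questions, who owns a reference and whether the GC tracks an object, can be explored from Python before reading any C. The following is a small sketch, not a debugging technique used by CPython itself; `refcount_delta_when_stored` is a hypothetical helper, and the absolute values returned by `sys.getrefcount` vary between versions, so only the difference is examined.

```python
import gc
import sys


def refcount_delta_when_stored(obj):
    """Return how much obj's refcount grows when a list holds it."""
    before = sys.getrefcount(obj)
    holder = [obj]  # the list now owns one strong reference to obj
    after = sys.getrefcount(obj)
    del holder      # dropping the list releases that reference again
    return after - before


if __name__ == "__main__":
    # A fresh object() is used because some objects (None, small ints)
    # are immortal in recent versions and do not change their count.
    print("delta:", refcount_delta_when_stored(object()))
    # Containers that can form cycles are tracked by the GC;
    # atomic objects such as ints and strings are not.
    print("list tracked:", gc.is_tracked([]))
    print("int tracked:", gc.is_tracked(42))
    print("str tracked:", gc.is_tracked("abc"))
```

The same reasoning applies in C: storing an object in a container adds an owner, and only objects that may participate in cycles need GC tracking.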
89.9 Rebuild Frequently
After changing C code:
make -j8
Then run focused tests immediately.
Do not make many unrelated changes before rebuilding. Early failures are easier to diagnose.
89.10 Run Focused Tests First
After a small change:
./python -m test -v test_dict
or:
./python -m test -v test_gc
Use -x to stop on the first failure:
./python -m test -v -x test_gc
Fast iteration matters more than full-suite execution during early development.
89.11 Run Related Tests
After focused tests pass, run nearby tests.
Example:
./python -m test -v test_dict test_set test_collections
Subsystem interactions matter.
A dict change may affect:
keyword arguments
class namespaces
globals
attribute dictionaries
import machinery
dataclasses
JSON behavior
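A short illustration of why the blast radius is so wide: several of the structures listed above are plain `dict` objects at runtime. The names `collect_keywords`, `Point`, and `dict_backed_structures` below are invented for this sketch.

```python
def collect_keywords(**kwargs):
    """Return the mapping the interpreter builds for keyword arguments."""
    return kwargs


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


def dict_backed_structures():
    """Collect several runtime structures that are ordinary dicts."""
    point = Point(1, 2)
    return {
        "keyword arguments": collect_keywords(b=1, a=2),
        # The default namespace used while a class body executes.
        "class namespace": type.__prepare__("Example", ()),
        "module globals": globals(),
        "instance attributes": point.__dict__,
    }


if __name__ == "__main__":
    for name, mapping in dict_backed_structures().items():
        print(f"{name}: {type(mapping).__name__}")
```

Because keyword arguments also rely on insertion order, a change to dict ordering or resizing can surface in code that never mentions `dict` explicitly.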
89.12 Run Reference Leak Tests
If touching object lifetime or C code:
./python -m test -R 3:3 test_name
Typical leak causes:
missing Py_DECREF
incorrect error cleanup
cached references
reference cycles among objects that lack GC support
assuming a call steals a reference when it does not
A patch that introduces leaks is incomplete.
89.13 Use a Debug Build
Always test changes to internals under a debug build:
./configure --with-pydebug
Debug builds expose:
assertion failures
GC inconsistencies
negative refcounts
allocator misuse
invalid object state
Release builds may hide these problems temporarily.
89.14 Use Sanitizers for Memory Bugs
For suspicious memory behavior:
./configure --with-pydebug \
CFLAGS="-O1 -g -fsanitize=address,undefined" \
LDFLAGS="-fsanitize=address,undefined"
Run:
ASAN_OPTIONS=abort_on_error=1:symbolize=1 \
./python -m test -v test_name
Sanitizers catch:
use-after-free
buffer overflow
invalid memory access
undefined behavior
89.15 Add Documentation Changes
User-visible behavior changes require documentation updates.
Examples:
| Change | Documentation |
|---|---|
| New stdlib behavior | Doc/library/ |
| New syntax | Doc/reference/ |
| New C API | Doc/c-api/ |
| Changed CLI flag | Doc/using/cmdline.rst |
| Important feature | Doc/whatsnew/ |
Build documentation locally:
make -C Doc html
make -C Doc check
Documentation is part of the patch, not a later cleanup step.
89.16 Add a News Entry
Most user-visible changes need a news entry.
Typical command:
blurb add
Good entry:
Fix ``dict.update()`` incorrectly overwriting values when the source mapping mutates during iteration.
Weak entry:
Fix bug in dict.
The entry should describe the visible effect.
89.17 Keep Style Consistent
Follow surrounding style.
Examples:
indentation
brace placement
error handling patterns
goto cleanup conventions
naming
comment style
macro usage
Do not rewrite style in unrelated code.
CPython code favors consistency over personal preference.
89.18 Error Cleanup Patterns
Many CPython C functions use structured cleanup.
Example:
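PyObject *x = NULL;
PyObject *y = NULL;
x = make_x();
if (x == NULL) {
    goto error;
}
y = make_y();
if (y == NULL) {
    goto error;
}
/* x is only a temporary here: release it before returning y. */
Py_DECREF(x);
return y;
error:
    /* Py_XDECREF is safe on NULL, so one label covers every failure. */
    Py_XDECREF(x);
    Py_XDECREF(y);
    return NULL;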
This pattern centralizes cleanup and reduces leak risk.
Avoid duplicated cleanup logic spread across many returns.
89.19 Do Not Ignore Failure Paths
Almost any allocation or C API call that returns an object can fail.
Examples:
PyLong_FromLong
PyUnicode_FromString
PyObject_Call
PyList_New
PyDict_New
PyTuple_New
Always check:
if (obj == NULL) {
    return NULL;
}
A patch that handles only the success path is incomplete.
89.20 Commit Messages
A good commit message is concise and descriptive.
Good:
Fix reference leak in dict merge error path
Weak:
fix stuff
The commit message should describe the semantic change, not the editing activity.
89.21 Run Broader Validation Before Submission
Before opening a pull request:
./python -m test -j0
If the patch touches sensitive runtime paths:
imports
GC
frames
dicts
compiler
interpreter loop
memory allocators
threading
run the full suite plus the relevant leak checks (-R 3:3) and, where practical, a sanitizer build.
A patch that passes only one focused test may still break unrelated behavior.
89.22 Read the Diff Carefully
Before submission:
git diff
Check for:
debug prints
temporary instrumentation
commented-out code
accidental whitespace changes
unrelated formatting
forgotten test edits
generated files
A clean diff is easier to review.
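Part of this checklist can be automated. The following sketch scans unified diff text for added lines that look like leftovers; the patterns are deliberately crude assumptions (a legitimate `print(` call will be flagged too), and `find_suspicious_additions` is a hypothetical helper, not a tool shipped with CPython.

```python
import re

# Heuristic patterns only; expect false positives.
SUSPICIOUS_PATTERNS = {
    "debug print": re.compile(r"\b(printf|print)\s*\("),
    "breakpoint": re.compile(r"\b(breakpoint|pdb\.set_trace)\s*\("),
    "trailing whitespace": re.compile(r"[ \t]+$"),
}


def find_suspicious_additions(diff_text):
    """Return (file, label, line) tuples for questionable added lines."""
    findings = []
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # Header naming the new version of the file, e.g. "+++ b/x.c".
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
            continue
        if not line.startswith("+"):
            continue  # only lines added by the patch are inspected
        added = line[1:]
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(added):
                findings.append((current_file, label, added.strip()))
    return findings


SAMPLE_DIFF = """\
--- a/Objects/dictobject.c
+++ b/Objects/dictobject.c
@@ -1,3 +1,5 @@
 static int
+    printf("resize %zd\\n", n);
+    return 0;
"""

if __name__ == "__main__":
    for filename, label, text in find_suspicious_additions(SAMPLE_DIFF):
        print(f"{filename}: {label}: {text}")
```

In practice the input would be the output of `git diff`, for example captured with `subprocess.run(["git", "diff"], capture_output=True, text=True).stdout`. A script like this supplements a manual read of the diff; it does not replace it.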
89.23 Open the Pull Request
A pull request should explain:
what the problem is
why the current behavior is wrong
what the patch changes
how it was tested
whether compatibility changes exist
Good PR descriptions reduce reviewer guesswork.
For regressions, include a minimal reproducer.
For performance patches, include benchmarks.
For semantic changes, include rationale.
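For a performance patch, even a small reproducible measurement helps reviewers. The sketch below uses the standard `timeit` module; the `benchmark` helper and the statement being timed are illustrative assumptions, and a dedicated tool such as pyperf generally gives more stable numbers.

```python
import statistics
import timeit


def benchmark(stmt, setup="pass", repeat=5, number=10_000):
    """Time stmt and return per-call seconds as min and median."""
    timings = timeit.repeat(stmt, setup=setup, repeat=repeat, number=number)
    per_call = [total / number for total in timings]
    return {
        # The minimum is the least noisy estimate of the true cost.
        "min": min(per_call),
        "median": statistics.median(per_call),
    }


if __name__ == "__main__":
    result = benchmark(
        "d.update(src)",
        setup="d = {}; src = dict.fromkeys(range(100))",
    )
    print(f"min:    {result['min'] * 1e9:.1f} ns per call")
    print(f"median: {result['median'] * 1e9:.1f} ns per call")
```

Run the same script with the interpreter built from main and with the patched build, on an otherwise idle machine, and report both sets of numbers in the PR. Measure with an optimized build rather than a `--with-pydebug` build, since debug builds are much slower.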
89.24 Responding to Review
Code review is part of development, not a separate obstacle.
Common review requests:
add regression test
simplify logic
handle failure path
improve comments
clarify ownership
update documentation
rename variable
reduce scope
Respond technically and precisely.
Good response:
This path can fail because PyObject_Call may trigger arbitrary Python code. I added cleanup for x before returning NULL.
Weak response:
I think it should work now.
89.25 Backports
Bug fixes may need backports to maintenance branches.
Typical flow:
merge into main
backport to supported branches if appropriate
Compatibility matters during backporting.
A patch safe for the development branch may be too risky for a maintenance release.
89.26 Common Patch Mistakes
| Mistake | Better approach |
|---|---|
| Large unrelated refactor | Small focused patch |
| No regression test | Add minimal reproducer test |
| Ignoring error cleanup | Audit all exits |
| Missing docs | Update docs with behavior |
| Style rewrite in unrelated code | Preserve local style |
| Only testing success path | Test failures too |
| Assuming allocations succeed | Check all API returns |
| Using system Python accidentally | Use local build |
| No leak testing | Run -R for runtime changes |
89.27 Example Patch Workflow
Example end-to-end workflow:
1. Reproduce bug.
2. Reduce to minimal script.
3. Locate implementation.
4. Read existing tests.
5. Add failing regression test.
6. Build debug CPython.
7. Confirm failure.
8. Implement minimal fix.
9. Rebuild.
10. Run focused tests.
11. Run leak tests.
12. Run related tests.
13. Update docs if needed.
14. Add news entry.
15. Read final diff.
16. Open PR.
17. Respond to review.
This workflow scales from small fixes to major runtime work.
89.28 Core Principle
A CPython patch is a change to a living runtime system.
The code, tests, documentation, memory invariants, and public contracts evolve together. A correct patch fixes the problem, preserves surrounding invariants, explains the behavior clearly, and leaves the interpreter easier to trust than before.