Garbage collection

CPython's primary memory reclamation mechanism is reference counting: when an object's refcount drops to zero, the object is freed immediately. Reference cycles defeat refcounting, so CPython adds a cycle collector: a generational, stop-the-world tracing pass that runs periodically to collect garbage cycles. The cycle collector is what lets Python programs cheerfully build doubly linked structures, parent pointers, and observer registrations without leaking.
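
The division of labour between refcounting and the cycle collector can be observed from Python. The sketch below is illustrative only: Node and make_orphaned_cycle are hypothetical names, and a weak reference is used as a probe because it does not keep its target alive.

```python
import gc
import weakref


class Node:
    """Hypothetical container type used only for this demonstration."""

    def __init__(self):
        self.other = None


def make_orphaned_cycle():
    """Build a two-object cycle and return only a weak reference to it."""
    a, b = Node(), Node()
    a.other, b.other = b, a      # a -> b -> a
    return weakref.ref(a)        # a weak reference does not keep `a` alive


gc_was_enabled = gc.isenabled()
gc.disable()                     # keep automatic collections out of the demo
try:
    probe = make_orphaned_cycle()
    alive_before_collect = probe() is not None   # refcounting alone cannot free the cycle
    gc.collect()                                 # explicit full collection
    alive_after_collect = probe() is not None
finally:
    if gc_was_enabled:
        gc.enable()

print(alive_before_collect, alive_after_collect)
```

On CPython the first flag should be true (the cycle survives the function return) and the second false (the collector reclaimed it).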

Where the code lives

File                            Role
Python/gc.c                     The cycle collector. Generations, traversal, finalisation.
Python/gc_free_threading.c      The free-threaded variant.
Include/internal/pycore_gc.h    PyGC_Head, generation thresholds, flags.
Modules/gcmodule.c              The gc module: gc.collect, callbacks, debugging knobs.

Tracked objects

Only objects whose type sets Py_TPFLAGS_HAVE_GC participate in cycle collection. Containers (list, dict, tuple, set, class instances) are tracked; atomic types (int, str, float, bytes) are not, because they hold no references to other objects and so cannot be part of a cycle.
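
gc.is_tracked reports whether a given object is currently tracked, which makes the distinction easy to check. This is a small sketch; Point is a hypothetical class, and dict and tuple are left out on purpose because CPython may untrack simple instances of them as an optimisation, with details that vary by version.

```python
import gc


class Point:
    """Hypothetical user-defined class; its instances can hold references."""

    def __init__(self, x, y):
        self.x, self.y = x, y


atomic_samples = [42, "text", 3.14, b"raw"]
container_samples = [[], [1, 2], set(), Point(1, 2)]

# Atomic objects carry no GC header and are never tracked.
atomic_tracked = [gc.is_tracked(obj) for obj in atomic_samples]

# Lists, sets, and class instances are tracked as soon as they are created.
container_tracked = [gc.is_tracked(obj) for obj in container_samples]

print(atomic_tracked)
print(container_tracked)
```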

Each tracked object carries a PyGC_Head before its PyObject:

/* Include/internal/pycore_gc.h PyGC_Head */
typedef struct {
    uintptr_t _gc_next; /* next in generation list; 0 means the object is untracked */
    uintptr_t _gc_prev; /* prev in generation list; low bits hold flags, and the
                           field holds the refcount copy during a collection */
} PyGC_Head;

Allocation uses _PyObject_GC_New / _PyObject_GC_NewVar, which reserve room for the header in front of the object. The object joins the youngest generation's list once it is tracked (PyObject_GC_Track). The list is doubly linked so the collector can splice nodes in and out in constant time.
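
The header is visible from Python as a size difference: sys.getsizeof adds the collector's per-object overhead to whatever the object's own __sizeof__() reports. The helper name gc_header_overhead below is hypothetical, and the exact overhead depends on the platform and build, so the sketch prints it rather than assuming a value.

```python
import sys


def gc_header_overhead(obj):
    """Bytes that sys.getsizeof adds on top of the object's own __sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()


list_overhead = gc_header_overhead([])     # list is GC-managed
int_overhead = gc_header_overhead(12345)   # int is not GC-managed

print("list:", list_overhead, "int:", int_overhead)
```

For an int the difference is zero, since no header is allocated; for a list it is the size of whatever the build places in front of the object.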

Py_TRASHCAN_BEGIN/Py_TRASHCAN_END in tp_dealloc implementations break long chains of recursive deallocation that would otherwise overflow the C stack: once the nesting gets too deep, the object is parked on a deferred list and deallocated after the stack has unwound.
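
The effect can be seen with a deeply nested structure. In the sketch below, the depth is an arbitrary illustrative value; each list holds the only reference to the one inside it, so dropping the outermost list starts a deallocation chain one level per list.

```python
depth = 100_000  # arbitrary illustrative depth

nested = []
for _ in range(depth):
    nested = [nested]   # each list holds the only reference to the previous one

# Freeing the outermost list cascades through every level. The list
# deallocator is protected against unbounded C recursion, so this is
# expected to complete normally.
del nested
completed = True
print("deallocated", depth, "nested lists")
```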

Generations

Three generations: young (0), middle (1), old (2). Newly allocated tracked objects go into generation 0. After a collection, surviving objects are promoted to the next generation.

Thresholds control when each generation is collected:

gc.get_threshold() # (700, 10, 10)
  • Generation 0 is collected when the number of tracked-object allocations minus deallocations since the last collection exceeds 700.
  • Generation 1 is collected after generation 0 is collected 10 times.
  • Generation 2 is collected after generation 1 is collected 10 times.
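
The thresholds and the running counters can be read and adjusted at runtime. This is a minimal sketch: the new threshold values are arbitrary, and the defaults printed first depend on the CPython version in use.

```python
import gc

original = gc.get_threshold()     # tuple of three thresholds
counts = gc.get_count()           # tuple of three current counters
print("thresholds:", original, "counts:", counts)

# Raise the thresholds so young collections run less often.
# These numbers are illustrative, not a recommendation.
gc.set_threshold(1000, 15, 15)
raised = gc.get_threshold()

# Restore whatever was configured before the experiment.
gc.set_threshold(*original)
restored = gc.get_threshold()
```

Raising the first threshold trades memory held by short-lived cycles for fewer young collections, which tends to matter most in allocation-heavy code.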

The defaults reflect the generational hypothesis: most objects die young, so collecting young objects often is cheap and catches most garbage; old objects rarely become collectable.

The algorithm

To find unreachable cycles in a generation, the collector runs a subtract-refcounts pass:

  1. Copy each tracked object's refcount into a scratch field (_gc_prev).
  2. For each tracked object, call tp_traverse(obj, decref_cb); for every referent that is also tracked in this generation, decrement its scratch refcount. After this pass, the scratch refcount equals the number of references from outside the generation.
  3. Any object whose scratch refcount is zero is referenced only from inside the generation, but it may still be reachable from an object that does have an external reference. Walk the graph from the objects with non-zero scratch refcount, marking everything transitively reachable.
  4. The unreached set is garbage.

/* Python/gc.c (sketch) */
static Py_ssize_t collect_with_callback(int generation) {
    /* update refs, subtract refs, move unreachable, finalise */
}
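
The four steps can be modelled on a toy object graph. The following is a simplified Python model, not CPython's implementation: objects are plain names, refcounts is a hypothetical mapping from each object to its total refcount, and edges is a hypothetical mapping from each object to the objects it references.

```python
def find_garbage(refcounts, edges):
    """Return the set of unreachable objects in a toy generation.

    refcounts: {name: total refcount}, one entry per tracked object
    edges:     {name: [names it references]}
    """
    # Step 1: copy every refcount into a scratch field.
    scratch = dict(refcounts)

    # Step 2: subtract references that come from inside the generation.
    for referents in edges.values():
        for target in referents:
            if target in scratch:
                scratch[target] -= 1

    # Step 3: objects with a positive scratch count have external
    # references; everything reachable from them is alive too.
    reachable = set()
    stack = [name for name, count in scratch.items() if count > 0]
    while stack:
        name = stack.pop()
        if name in reachable:
            continue
        reachable.add(name)
        stack.extend(t for t in edges.get(name, ()) if t in scratch)

    # Step 4: whatever was never reached is garbage.
    return set(scratch) - reachable


# a <-> b is an orphaned cycle. c <-> d is a cycle with one external
# reference to c, and d also references e.
refcounts = {"a": 1, "b": 1, "c": 2, "d": 1, "e": 1}
edges = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c", "e"], "e": []}

garbage = find_garbage(refcounts, edges)
print(sorted(garbage))  # ['a', 'b']
```

Note that d and e both end step 2 with a scratch count of zero, yet they survive because step 3 reaches them from c.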

Finalisation

Before reclaiming the unreached set, the collector runs each object's tp_finalize (the C-level __del__). PEP 442 made this safe: a resurrected object (one that became reachable again during finalisation) is detected and not freed.

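Around finalisation:

  • Weak references to the unreached objects are cleared, and those with callbacks are queued; CPython handles them ahead of the finalisers.
  • Once finalisers have run and resurrected objects have been set aside, the unreached list is broken (tp_clear is called on each object), which lets the refcounting cascade reclaim the objects.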

The final reclamation goes through normal Py_DECREF paths; the collector itself just breaks the cycle.
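
Both PEP 442 behaviours, finalising objects that sit in a cycle and sparing an object that resurrects itself, can be seen from Python. This is a sketch with hypothetical names (Resource, finalized, survivors); __del__ is the Python-level counterpart of tp_finalize.

```python
import gc

finalized = []   # names of objects whose __del__ has run
survivors = []   # objects that resurrected themselves


class Resource:
    """Hypothetical class whose instances always sit in a reference cycle."""

    def __init__(self, name, resurrect=False):
        self.name = name
        self.resurrect = resurrect
        self.me = self               # self-reference: refcounting cannot free this

    def __del__(self):
        finalized.append(self.name)
        if self.resurrect:
            survivors.append(self)   # become reachable again during finalisation


Resource("plain")                    # unreachable as soon as the statement ends
Resource("phoenix", resurrect=True)
gc.collect()

print(sorted(finalized))             # both finalisers ran
print([r.name for r in survivors])   # only the resurrected object is still alive
```

A finaliser runs at most once per object, so a later collection of the resurrected object would not call __del__ again.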

tp_traverse and tp_clear

Every GC-tracked type implements tp_traverse, and mutable containers also implement tp_clear. The two slots have these signatures:

typedef int (*traverseproc)(PyObject *, visitproc, void *);
typedef int (*inquiry)(PyObject *);
  • tp_traverse(self, visit, arg) calls visit(referent, arg) for each Python-object reference the type holds. The collector passes a callback that updates scratch refcounts.
  • tp_clear(self) releases the references that participate in cycles. Implementations set each slot to NULL before dropping the reference (which is what Py_CLEAR does), so code that runs during the decref never sees a dangling pointer.

A common pattern:

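#include <Python.h>

/* Assumed layout for this example: one owned reference. */
typedef struct {
    PyObject_HEAD
    PyObject *referent;
} MyObject;

static int my_traverse(MyObject *self, visitproc visit, void *arg) {
    Py_VISIT(self->referent);   /* report the reference to the collector */
    return 0;
}
static int my_clear(MyObject *self) {
    Py_CLEAR(self->referent);   /* set the slot to NULL, then drop the reference */
    return 0;
}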
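
From Python, gc.get_referents exposes what a type's tp_traverse reports: it returns the objects visited when the traversal slot is called on its arguments. A small sketch using a list, whose traversal visits each element:

```python
import gc

first, second = object(), object()
container = [first, second]

# get_referents drives the list's tp_traverse and collects what it visits.
referents = gc.get_referents(container)

found_first = any(r is first for r in referents)
found_second = any(r is second for r in referents)
print(len(referents), found_first, found_second)
```

This is a convenient way to check that an extension type's tp_traverse reports every reference it owns.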

Permanent generation

gc.freeze() moves every tracked object into a permanent generation that all later collections ignore. This is useful at the end of process startup: most of the imported module graph will live for the life of the process, so excluding it from cycle scans makes later collections faster. gc.unfreeze() moves the frozen objects back into the oldest generation.
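
A minimal sketch of the freeze/unfreeze round trip, using gc.get_freeze_count to observe the size of the permanent generation. In a real program the freeze call would sit at the end of startup and would normally not be undone; the unfreeze here only keeps the example side-effect free.

```python
import gc

gc.collect()                             # drop existing garbage so it is not frozen too
gc.freeze()                              # move every tracked object to the permanent generation
frozen_count = gc.get_freeze_count()     # number of objects now exempt from collection

gc.unfreeze()                            # return them to the oldest generation
after_unfreeze = gc.get_freeze_count()

print("frozen:", frozen_count, "after unfreeze:", after_unfreeze)
```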

Callbacks

gc.callbacks is a list of callables invoked before and after each collection. It is the collector's instrumentation hook: tools that time or log collections attach here. Each callback is called as callback(phase, info), where phase is "start" or "stop" and info is a dict of collection stats that includes the keys generation, collected, and uncollectable.
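
A minimal sketch of attaching a callback, recording what it receives, and detaching it again. The names phases, seen_keys, and record are hypothetical.

```python
import gc

phases = []         # "start" / "stop" in the order received
seen_keys = set()   # keys observed in the info dict


def record(phase, info):
    """Collect the phase and the stats keys passed by the collector."""
    phases.append(phase)
    seen_keys.update(info)


gc.callbacks.append(record)
try:
    gc.collect()                     # triggers one "start" and one "stop" call
finally:
    gc.callbacks.remove(record)      # always detach, even if collection raised

print(phases, sorted(seen_keys))
```

Callbacks run inside the collector, so they should stay short and avoid raising.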

Free-threaded variant

Python/gc_free_threading.c is the no-GIL collector. The major changes:

  • The collector still stops the world, but does so by signalling every thread to pause at a safe point.
  • Refcounts use the split-counter representation (ob_ref_local + ob_ref_shared); the subtract-refcounts pass adds across both fields.
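  • Allocation is per-thread (separate freelists and heaps), reducing contention. The free-threaded collector described in PEP 703 is non-generational, so there are no per-generation lists to maintain.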

Cycle collection cost in the free-threaded build is higher per collection (more bookkeeping) but happens less often because the allocator is per-thread.
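
Which collector a given interpreter uses is a build-time property. The sketch below shows one way to check it from Python, assuming the Py_GIL_DISABLED build variable and sys._is_gil_enabled() that CPython 3.13 and later provide; on older versions both are absent and the code falls back to the default build.

```python
import sys
import sysconfig

# Py_GIL_DISABLED is 1 in free-threaded builds; older versions do not define it.
free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

# sys._is_gil_enabled() exists on 3.13+; assume the GIL is on where it is missing.
gil_probe = getattr(sys, "_is_gil_enabled", None)
gil_enabled = gil_probe() if gil_probe is not None else True

collector_source = (
    "Python/gc_free_threading.c" if free_threaded_build else "Python/gc.c"
)
print(free_threaded_build, gil_enabled, collector_source)
```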

CPython 3.14 changes

  • Incremental collection of the oldest generation. Long-pause collections are split across multiple collection events; pauses stay short even when many old objects need to be scanned.
  • gc.callbacks info now includes counts of objects per generation and the number of permanent-generation entries.
  • Free-threaded GC is no longer experimental.

PEP touchpoints

  • PEP 442. Safe object finalisation; cycles with __del__ are now collectable.
  • PEP 683. Immortal objects (not tracked).
  • PEP 703. Free-threaded collector.

Reference

  • Python/gc.c, Python/gc_free_threading.c, Include/internal/pycore_gc.h, Modules/gcmodule.c.
  • PEP 442. Safe object finalization.
  • Jones, Hosking, Moss. The Garbage Collection Handbook. Generational collection.