61. The Python C API

Public C API overview, Include/ header organization, and the stable vs. internal API distinction.

The Python C API is the native programming interface exposed by CPython. It allows C programs to create Python objects, execute Python code, define extension modules, implement new object types, call Python functions, manipulate interpreter state, and embed the Python runtime inside larger applications.

The C API is one of the central architectural features of CPython. It defines the boundary between the interpreter runtime and native machine code.

Most high-performance Python libraries depend on it directly or indirectly:

Project   Use of C API
NumPy     Array objects and vectorized kernels
pandas    DataFrame internals and fast parsing
lxml      XML parser bindings
Pillow    Image codecs and pixel operations
psycopg   Database driver integration
PyTorch   Tensor runtime and Python bindings

Without the C API, CPython would mainly be an interpreter. With the C API, CPython becomes a systems integration platform.

61.1 What the C API Provides

The API exposes operations for:

creating Python objects
accessing object attributes
calling Python functions
implementing new types
raising exceptions
managing memory
interacting with threads
executing Python code
importing modules
embedding interpreters
extending the runtime

The API is primarily declared in header files under:

Include/

Core headers include:

Header            Purpose
Python.h          Main public entry point
object.h          Core object structures
unicodeobject.h   Unicode APIs
listobject.h      List APIs
dictobject.h      Dictionary APIs
tupleobject.h     Tuple APIs
moduleobject.h    Module APIs
cpython/          CPython-specific public declarations excluded from the limited API

Almost every extension starts with:

#include <Python.h>

This header pulls in the public API surface and platform abstractions.

61.2 CPython’s Runtime Model at the C Level

At the C level, every Python value is represented by a pointer to PyObject.

Conceptually:

typedef struct _object {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
} PyObject;

All objects begin with this header.

This means the C API operates almost entirely on object pointers:

PyObject *

A Python integer:

x = 42

becomes something like:

PyLongObject *

A Python string becomes:

PyUnicodeObject *

A Python list becomes:

PyListObject *

But all can be treated generically as:

PyObject *

This is the foundation of CPython polymorphism.

61.3 The Central Role of PyObject

The API is object-oriented in C.

Almost every operation accepts or returns PyObject *.

Examples:

PyObject *PyLong_FromLong(long v);
PyObject *PyUnicode_FromString(const char *s);
PyObject *PyObject_CallObject(PyObject *callable, PyObject *args);

This style gives the API several properties:

Property            Meaning
Dynamic typing      Object type known at runtime
Uniform interface   Same pointer abstraction everywhere
Extensibility       New types integrate naturally
Runtime dispatch    Behavior controlled by type object

The type object determines behavior:

addition
comparison
attribute lookup
iteration
calling
hashing
buffer support
memory layout

Internally this resembles a manually constructed object system implemented in C structures and function pointers.

61.4 The Include Hierarchy

The public API is divided into layers.

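Public API

Headers intended for extension authors, including everything in the limited API:

Include/

CPython-specific API

Public declarations that are excluded from the limited API and tied more closely to implementation details:

Include/cpython/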

Internal runtime API

Private interpreter internals:

Include/internal/

The distinction matters because many fields and functions are not ABI stable.

Example:

PyObject *

is public.

But direct manipulation of interpreter internals may require:

#include "internal/pycore_runtime.h"

which is unsupported outside CPython itself.

61.5 Building an Extension Module

A native extension is usually a shared library loaded by CPython at runtime.

Typical filenames:

Platform   Extension
Linux      .so
macOS      .so
Windows    .pyd

Minimal module:

#include <Python.h>

static PyObject *
hello(PyObject *self, PyObject *args)
{
    printf("hello from C\n");
    Py_RETURN_NONE;
}

static PyMethodDef Methods[] = {
    {"hello", hello, METH_NOARGS, "Print hello"},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT,
    "demo",
    NULL,
    -1,
    Methods
};

PyMODINIT_FUNC
PyInit_demo(void)
{
    return PyModule_Create(&module);
}

After compilation:

import demo
demo.hello()

calls directly into native machine code.

The interpreter dynamically loads the shared library, resolves PyInit_demo, and registers the module object.

61.6 PyMethodDef

Functions exported to Python are declared using PyMethodDef.

Structure:

typedef struct PyMethodDef {
    const char  *ml_name;
    PyCFunction  ml_meth;
    int          ml_flags;
    const char  *ml_doc;
} PyMethodDef;

Fields:

Field      Meaning
ml_name    Python-visible name
ml_meth    C function pointer
ml_flags   Calling convention
ml_doc     Docstring

Example:

{"add", add, METH_VARARGS, "Add two numbers"}

This binds Python-level names to native implementations.

61.7 Calling Conventions

CPython supports multiple calling conventions.

METH_NOARGS

No Python arguments:

static PyObject *
f(PyObject *self, PyObject *unused)

METH_VARARGS

Tuple-based arguments:

static PyObject *
f(PyObject *self, PyObject *args)

METH_VARARGS | METH_KEYWORDS

Positional and keyword arguments (METH_KEYWORDS is only valid in combination with another flag):

static PyObject *
f(PyObject *self,
  PyObject *args,
  PyObject *kwargs)

METH_FASTCALL

Modern fast calling convention.

Positional arguments arrive as a C array plus a count, which avoids temporary tuple creation.

The interpreter heavily optimizes this path in modern CPython.

61.8 Argument Parsing

CPython converts Python arguments into C values using parsing helpers.

Example:

static PyObject *
add(PyObject *self, PyObject *args)
{
    int a;
    int b;

    if (!PyArg_ParseTuple(args, "ii", &a, &b)) {
        return NULL;
    }

    return PyLong_FromLong(a + b);
}

Format string:

"ii"

means:

parse two integers

Common format codes:

Code   Meaning
i      int
l      long
d      double
s      str converted to a NUL-terminated UTF-8 const char *
O      generic Python object (borrowed PyObject *)
p      truth value of any object, stored in an int

If parsing fails:

return NULL

and the interpreter propagates the exception.

61.9 Exceptions in the C API

Exceptions are represented implicitly.

A C function signals failure by:

returning NULL

and setting interpreter exception state.

Example:

PyErr_SetString(PyExc_ValueError,
                "invalid value");

return NULL;

The current thread state stores the active exception.

Conceptually:

thread state
    current exception type
    current exception value
    traceback

This differs from C++ exceptions.

The API uses explicit return-value checking.

Typical pattern:

PyObject *obj = some_call();

if (obj == NULL) {
    return NULL;
}

Failure propagation is manual.

61.10 Reference Counting in the C API

Reference ownership is the most important rule in the C API.

Every object carries a reference count in its PyObject header.

Operations either:

create references
borrow references
steal references
destroy references

Core macros:

Py_INCREF(obj);
Py_DECREF(obj);
Py_XDECREF(obj);

Reference-counting bugs are among the most common sources of extension crashes.

Example

PyObject *x = PyLong_FromLong(42);

returns a new reference.

You own it.

Eventually:

Py_DECREF(x);

must happen.

Otherwise the object leaks.

61.11 Borrowed vs New References

The API distinguishes ownership explicitly.

Type                 Meaning
New reference        Caller owns reference
Borrowed reference   Caller does not own reference
Stolen reference     Ownership transferred

Example:

PyObject *item = PyList_GetItem(list, 0);

returns a borrowed reference.

Do not decref it unless you first incref it.

Example:

PyList_SetItem(list, i, obj);

steals a reference.

The list now owns the object reference.

Misunderstanding ownership rules causes:

memory leaks
double frees
use-after-free
dangling pointers
interpreter corruption

61.12 Type Checking

CPython exposes runtime type checks.

Examples:

PyLong_Check(obj)
PyUnicode_Check(obj)
PyList_Check(obj)
PyDict_Check(obj)

These validate runtime object types before operations.

Example:

if (!PyLong_Check(obj)) {
    PyErr_SetString(PyExc_TypeError,
                    "expected int");
    return NULL;
}

Most APIs assume correct object types.

Incorrect assumptions may crash the interpreter.

61.13 Creating Python Objects

The API provides constructors for built-in types.

Integers

PyObject *x = PyLong_FromLong(123);

Floats

PyObject *x = PyFloat_FromDouble(3.14);

Unicode

PyObject *x = PyUnicode_FromString("hello");

Lists

PyObject *list = PyList_New(0);

Dicts

PyObject *dict = PyDict_New();

All return heap-allocated Python objects.

Reference ownership rules apply immediately.

61.14 Calling Python from C

The API allows native code to invoke Python callables.

Example:

PyObject *result =
    PyObject_CallObject(func, args);

Variants:

Function                Purpose
PyObject_CallObject     Generic call
PyObject_CallFunction   Format-string call
PyObject_Vectorcall     Fast modern call
PyObject_CallMethod     Call named method

This enables hybrid execution:

Python code
    ↓
C extension
    ↓
Python callback
    ↓
more native code

Many scientific libraries use this extensively.

61.15 Attribute Access

Attributes can be manipulated directly.

Get attribute

PyObject *x =
    PyObject_GetAttrString(obj, "name");

Set attribute

PyObject_SetAttrString(obj,
                       "name",
                       value);

Check attribute

PyObject_HasAttrString(obj, "name");

This uses the normal Python attribute lookup system:

instance dict
class dict
descriptors
MRO traversal

The C API interacts with the same semantics as Python code.

61.16 Importing Modules

Modules can be imported from C.

Example:

PyObject *math =
    PyImport_ImportModule("math");

Access function:

PyObject *sqrt =
    PyObject_GetAttrString(math, "sqrt");

Call function:

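/* Py_BuildValue creates the float and the 1-tuple together; PyTuple_Pack
   would not steal a temporary float, which would then leak. */
PyObject *args = Py_BuildValue("(d)", 9.0);

PyObject *result =
    PyObject_CallObject(sqrt, args);

Py_DECREF(args);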

This allows embedded runtimes to drive Python dynamically.

61.17 Embedding Python

The API supports embedding CPython inside native programs.

Initialization:

Py_Initialize();

Execute code:

PyRun_SimpleString(
    "print('hello from embedded python')"
);

Shutdown:

Py_Finalize();

Applications using embedding include:

Category              Example
Game engines          Scripting systems
Scientific software   User automation
Databases             Stored procedures
Editors               Plugin systems
Network appliances    Embedded configuration

Embedding reverses the normal relationship:

normal:
    Python → C extension

embedding:
    C application → embedded Python

61.18 The Global Interpreter Lock in the API

Thread interaction requires GIL management.

Many APIs require the calling thread to hold the GIL.

Acquire:

PyGILState_STATE state = PyGILState_Ensure();

Release:

PyGILState_Release(state);

Long-running native code can release the GIL:

Py_BEGIN_ALLOW_THREADS

long_native_operation();

Py_END_ALLOW_THREADS

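This lets other threads execute Python bytecode while the native operation runs. Code between the two macros must not touch Python objects or call most C API functions, because the thread no longer holds the GIL.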

61.19 Stable ABI and Limited API

CPython exposes two compatibility layers.

Full C API

Direct access to CPython internals.

Highest performance.

Least stable ABI.

Limited API

Restricted API subset.

Stable across Python 3 minor versions.

Enabled by defining, before Python.h is included:

#define Py_LIMITED_API

Extensions targeting the stable ABI can ship one wheel compatible across multiple Python versions.

Tradeoff:

Full API              Limited API
Maximum speed         Greater compatibility
Access to internals   Restricted features
Tighter coupling      ABI stability

61.20 Internal vs Public APIs

Not all APIs are public.

CPython internally uses:

_PyRuntime
_PyInterpreterState_GET
_PyEval_EvalFrameDefault
_PyObject_Vectorcall

Many internal functions begin with:

_Py

These are implementation details.

They may change between releases without compatibility guarantees.

Extension authors should avoid depending on internal APIs unless absolutely necessary.

61.21 Memory Allocators

The API exposes custom memory allocation layers.

General allocator:

PyMem_Malloc
PyMem_Realloc
PyMem_Free

Object allocator:

PyObject_Malloc
PyObject_Free

CPython internally uses specialized allocators such as:

pymalloc
arena allocators
free lists
small-object pools

Memory allocation strategy strongly affects interpreter performance.

61.22 Error Handling Philosophy

The C API uses explicit error handling everywhere.

Most APIs follow this pattern:

Return value   Meaning
non-NULL       success
NULL           exception occurred

or:

Return value   Meaning
0              success
-1             failure

Success is never assumed: callers are expected to check every result.

This makes CPython code verbose but predictable.

Typical pattern:

obj = PyObject_Call(...);

if (obj == NULL) {
    return NULL;
}

61.23 CPython’s API Design Style

The API reflects CPython’s history.

Characteristics:

Property                     Description
Manual memory management     Explicit ownership
C89 compatibility origins    Historical portability
Macro-heavy design           Performance-oriented
Runtime polymorphism         Type-object dispatch
Explicit error propagation   No hidden exceptions
Refcount semantics           Central ownership model

The API prioritizes:

performance
interpreter integration
portability
backward compatibility
incremental evolution

rather than modern language ergonomics.

61.24 Relationship Between Python Semantics and the C API

The API mirrors Python semantics closely.

Python operation:

x + y

maps internally to:

PyNumber_Add

Attribute access:

obj.name

maps to:

PyObject_GetAttr

Function calls:

f(a, b)

map to:

PyObject_Call

The C API is effectively a low-level interface to the interpreter’s object protocol system.

61.25 Why the C API Matters

The C API defines much of CPython’s ecosystem architecture.

It enables:

scientific computing
GPU runtimes
database bindings
cryptography
operating system integration
network libraries
language bindings
high-performance parsers
embedded scripting

It also constrains CPython evolution.

Because many extensions depend on:

reference counts
object layout
GIL behavior
type object structure
calling conventions

major runtime changes require compatibility strategies.

This tension shapes many modern CPython engineering decisions.

61.26 Chapter Summary

The Python C API is the native interface to the CPython runtime. It exposes object manipulation, type systems, memory management, function calls, module creation, interpreter control, and embedding capabilities through a large C-level interface centered on PyObject *.

The API operates through explicit ownership rules, runtime polymorphism, manual reference counting, and explicit error propagation. Extension modules use it to integrate native machine code with Python semantics, while embedded applications use it to host the interpreter inside larger systems.

Understanding the C API is essential for studying extension modules, runtime internals, object implementation, memory management, interpreter execution, and CPython ecosystem architecture.