Truncated Polynomial Algebras
Dual numbers capture first-order derivatives because the infinitesimal element satisfies
Truncated Polynomial Algebras
Dual numbers capture first-order derivatives because the infinitesimal element satisfies
$$ \varepsilon^2 = 0. $$
To compute higher-order derivatives, we extend this idea by allowing higher powers to survive temporarily before truncation.
The resulting structures are called truncated polynomial algebras.
They generalize dual numbers from first-order infinitesimals to finite-order infinitesimal expansions.
From Dual Numbers to Higher Orders
Recall the dual-number algebra:
$$ \mathbb{R}[\varepsilon]/(\varepsilon^2). $$
Every element has the form
$$ a_0 + a_1\varepsilon. $$
The condition
$$ \varepsilon^2 = 0 $$
removes all second-order and higher terms.
Now instead consider:
$$ \mathbb{R}[\varepsilon]/(\varepsilon^{k+1}). $$
Here powers up to $\varepsilon^k$ survive:
$$ a_0 + a_1\varepsilon + a_2\varepsilon^2 + \cdots + a_k\varepsilon^k. $$
Only terms of degree $k+1$ and higher vanish.
This algebra stores derivatives up to order $k$.
Example: Second-Order Algebra
The simplest nontrivial extension is:
$$ \mathbb{R}[\varepsilon]/(\varepsilon^3). $$
Elements have the form:
$$ a + b\varepsilon + c\varepsilon^2. $$
Now:
$$ \varepsilon^3 = 0, $$
but:
$$ \varepsilon^2 \neq 0. $$
Multiplication works by ordinary polynomial multiplication followed by truncation.
For example:
$$ (a+b\varepsilon+c\varepsilon^2) (d+e\varepsilon+f\varepsilon^2) $$
expands to:
$$ ad + (ae+bd)\varepsilon + (af+be+cd)\varepsilon^2 + \text{higher-order terms}. $$
All powers containing $\varepsilon^3$ or above vanish.
Taylor Expansion Interpretation
The key connection comes from Taylor expansion.
For a smooth function:
$$ f(x+h) = f(x) + f'(x)h + \frac12 f''(x)h^2 + \cdots. $$
Substitute a truncated infinitesimal:
$$ h = \varepsilon, \quad \varepsilon^{k+1}=0. $$
Then:
$$ f(x+\varepsilon) = f(x) + f'(x)\varepsilon + \frac12 f''(x)\varepsilon^2 + \cdots + \frac1{k!}f^{(k)}(x)\varepsilon^k. $$
The series terminates exactly.
Higher-order derivatives become coefficients in the algebra.
Second Derivative Example
Let:
$$ f(x)=x^3. $$
Use the algebra:
$$ \varepsilon^3=0. $$
Evaluate:
$$ (x+\varepsilon)^3. $$
Expanding:
$$= x^3 + 3x^2\varepsilon + 3x\varepsilon^2 + \varepsilon^3.$$
Since:
$$ \varepsilon^3=0, $$
we obtain:
$$ x^3 + 3x^2\varepsilon + 3x\varepsilon^2. $$
Compare with Taylor expansion:
$$ f(x+\varepsilon) = f(x) + f'(x)\varepsilon + \frac12 f''(x)\varepsilon^2. $$
Since:
$$ f'(x)=3x^2, $$
and
$$ f''(x)=6x, $$
we get:
$$ \frac12 f''(x)=3x. $$
The coefficients match exactly.
Structure of the Algebra
The truncated polynomial algebra
$$ \mathbb{R}[\varepsilon]/(\varepsilon^{k+1}) $$
is:
- commutative
- finite-dimensional
- associative
- local
- nilpotent
Its basis is:
$$ 1,\varepsilon,\varepsilon^2,\ldots,\varepsilon^k. $$
Dimension is:
$$ k+1. $$
Every element is represented uniquely as:
$$ \sum_{i=0}^{k} a_i\varepsilon^i. $$
Arithmetic remains finite because powers terminate.
First-Order Versus Higher-Order AD
Dual numbers compute first derivatives:
$$ f(x+\varepsilon) = f(x)+f'(x)\varepsilon. $$
Truncated polynomial algebras compute finite Taylor expansions:
$$ f(x+\varepsilon) = \sum_{i=0}^{k} \frac{f^{(i)}(x)}{i!}\varepsilon^i. $$
Thus:
| Algebra | Information Stored |
|---|---|
| $\varepsilon^2=0$ | First derivative |
| $\varepsilon^3=0$ | Second derivative |
| $\varepsilon^4=0$ | Third derivative |
| $\varepsilon^{k+1}=0$ | $k$-th derivatives |
Forward-mode higher-order AD is therefore polynomial propagation over truncated algebras.
Polynomial Propagation Through Programs
Suppose a program computes:
$$ f(x)=\sin(x^2). $$
Represent input as:
$$ x = a+\varepsilon. $$
Each intermediate variable becomes a truncated polynomial.
First compute:
$$ x^2 = a^2 + 2a\varepsilon + \varepsilon^2. $$
Then evaluate sine:
$$ \sin(a^2 + 2a\varepsilon + \varepsilon^2). $$
Taylor expansion automatically propagates derivative information through every intermediate computation.
Programs become finite Taylor-series transformers.
Jet Spaces
In differential geometry, truncated polynomial algebras correspond to jet spaces.
A $k$-jet stores derivatives up to order $k$.
Two smooth functions are equivalent at a point if their derivatives agree through order $k$.
The truncated algebra captures exactly this finite local behavior.
For example:
- dual numbers correspond to first jets
- cubic truncations correspond to second jets
- higher truncations correspond to higher jets
Automatic differentiation over truncated algebras therefore computes finite jets of programs.
Multiple Variables
For multivariate functions, introduce several infinitesimal generators:
$$ \varepsilon_1,\varepsilon_2,\ldots,\varepsilon_n. $$
A second-order algebra may satisfy:
$$ \varepsilon_i^3=0 $$
or:
$$ \varepsilon_i^2=0 $$
while mixed products survive.
A general element becomes:
$$ a + \sum_i b_i\varepsilon_i + \sum_{i,j} c_{ij}\varepsilon_i\varepsilon_j. $$
The coefficients encode:
- first derivatives
- mixed partial derivatives
- Hessian structure
This forms the basis of multivariate higher-order AD.
Example: Mixed Partial Derivatives
Let:
$$ f(x,y)=xy+\sin(xy). $$
Represent:
$$ x = a+\varepsilon_1 $$
$$ y = b+\varepsilon_2. $$
Then:
$$ xy = ab + b\varepsilon_1 + a\varepsilon_2 + \varepsilon_1\varepsilon_2. $$
The mixed product term survives.
Expanding the full function reveals coefficients involving:
$$ \varepsilon_1\varepsilon_2. $$
Those coefficients encode:
$$ \frac{\partial^2 f}{\partial x \partial y}. $$
Higher-order structure emerges directly from algebraic multiplication.
Computational Complexity
Higher-order propagation becomes expensive because coefficient counts grow rapidly.
For order $k$ in $n$ variables, the number of monomials is:
$$ \binom{n+k}{k}. $$
Growth is combinatorial.
For example:
| Variables | Order | Number of Terms |
|---|---|---|
| 1 | 2 | 3 |
| 2 | 2 | 6 |
| 5 | 3 | 56 |
| 10 | 4 | 1001 |
This combinatorial explosion is one of the main challenges in higher-order automatic differentiation.
Sparse and Structured Representations
Practical systems rarely store full dense polynomial expansions.
Instead they exploit:
- sparsity
- symmetry
- directional derivatives
- compressed Hessians
- low-rank structure
Methods include:
- truncated Taylor mode
- hyper-dual numbers
- Hessian-vector products
- checkpointed higher-order reverse mode
The algebraic viewpoint remains the same even when representations become optimized.
Relation to Formal Power Series
Truncated polynomial algebras are finite approximations to formal power series.
A formal power series has infinite expansion:
$$ \sum_{i=0}^{\infty} a_i\varepsilon^i. $$
Truncated algebras cut this off at finite order:
$$ \sum_{i=0}^{k} a_i\varepsilon^i. $$
Automatic differentiation computes finite local expansions rather than full analytic series.
This finite truncation keeps computation practical.
Program Semantics View
Ordinary execution interprets variables as real numbers:
$$ x \in \mathbb{R}. $$
First-order forward AD interprets them as dual numbers:
$$ x \in \mathbb{R}[\varepsilon]/(\varepsilon^2). $$
Higher-order AD interprets them as truncated polynomials:
$$ x \in \mathbb{R}[\varepsilon]/(\varepsilon^{k+1}). $$
The program structure remains unchanged.
Only the underlying algebra changes.
This is one of the deepest ideas in automatic differentiation: differentiation can be implemented by changing the algebra of evaluation.
Algebraic Summary
Truncated polynomial algebras generalize dual numbers by preserving higher-order infinitesimal terms.
The algebra:
$$ \mathbb{R}[\varepsilon]/(\varepsilon^{k+1}) $$
stores finite Taylor expansions up to order $k$.
Evaluating a function in this algebra produces:
$$ f(x+\varepsilon) = \sum_{i=0}^{k} \frac{f^{(i)}(x)}{i!}\varepsilon^i. $$
Thus:
- coefficients become derivatives
- multiplication propagates chain rules
- higher-order interactions emerge naturally
- differentiation becomes algebraic evaluation
Higher-order automatic differentiation is therefore computation over truncated polynomial algebras.