Hyper-Dual Numbers

Dual numbers compute first derivatives exactly. Truncated polynomial algebras extend this to higher-order derivatives, but practical higher-order differentiation introduces an important problem: extracting second derivatives accurately without symbolic expansion or numerical cancellation.

Hyper-dual numbers solve this problem by introducing multiple nilpotent infinitesimal directions whose mixed products survive.

They provide an exact algebraic mechanism for computing:

  • second derivatives
  • mixed partial derivatives
  • Hessians

without finite differences and without truncation error.

Motivation

Ordinary dual numbers satisfy:

$$ \varepsilon^2 = 0. $$

Evaluating

$$ f(x+\varepsilon) $$

produces:

$$ f(x)+f'(x)\varepsilon. $$

Only first-order information survives.
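As a minimal sketch of this first-order behaviour, the following Go program evaluates $x^3$ on the input $x+\varepsilon$ at $x=2$. The `Dual` type and the `Mul` and `Cube` functions are illustrative names chosen here, not part of any library. The result carries $f(2)$ and $f'(2)$ but has no slot for second-derivative information.

```go
package main

import "fmt"

// Dual is a hypothetical representation of a + b·ε with ε² = 0.
type Dual struct {
	Val float64 // primal value a
	Eps float64 // coefficient b of ε
}

// Mul applies (a + bε)(p + qε) = ap + (aq + bp)ε; the ε² term vanishes.
func Mul(x, y Dual) Dual {
	return Dual{Val: x.Val * y.Val, Eps: x.Val*y.Eps + x.Eps*y.Val}
}

// Cube evaluates f(x) = x³ using only dual multiplication.
func Cube(x Dual) Dual {
	return Mul(Mul(x, x), x)
}

func main() {
	x := Dual{Val: 2, Eps: 1} // seed: x + ε at x = 2
	r := Cube(x)
	fmt.Println(r.Val, r.Eps) // value and first derivative only
}
```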

To recover second derivatives, one possibility is nested dual numbers or truncated polynomial algebras. However, those approaches may:

  • increase implementation complexity
  • require managing higher polynomial coefficients
  • introduce perturbation confusion in nested systems

Hyper-dual numbers provide a cleaner construction for exact second-order differentiation.

The Hyper-Dual Algebra

Introduce two independent infinitesimal generators:

$$ \varepsilon_1,\varepsilon_2. $$

Require:

$$ \varepsilon_1^2 = 0 $$

$$ \varepsilon_2^2 = 0. $$

But preserve the mixed product:

$$ \varepsilon_1\varepsilon_2 \neq 0. $$

Also:

$$ (\varepsilon_1\varepsilon_2)^2 = 0. $$

A hyper-dual number has the form:

$$ a + b\varepsilon_1 + c\varepsilon_2 + d\varepsilon_1\varepsilon_2. $$

This algebra stores:

| Component | Meaning |
|---|---|
| $a$ | primal value |
| $b$ | first derivative in direction 1 |
| $c$ | first derivative in direction 2 |
| $d$ | mixed second derivative |

Why Mixed Products Matter

The key idea is that:

$$ (\varepsilon_1+\varepsilon_2)^2 = 2\varepsilon_1\varepsilon_2. $$

The square does not vanish completely because cross terms survive.

This allows second-order information to appear algebraically.
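Written out term by term, and using the fact that the generators commute, the expansion is:

$$ (\varepsilon_1+\varepsilon_2)^2 = \varepsilon_1^2 + 2\varepsilon_1\varepsilon_2 + \varepsilon_2^2 = 0 + 2\varepsilon_1\varepsilon_2 + 0 = 2\varepsilon_1\varepsilon_2. $$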

Taylor Expansion

For a smooth scalar function:

$$ f(x+h), $$

the Taylor expansion to second order is:

$$ f(x+h) = f(x) + f'(x)h + \frac12 f''(x)h^2 + O(h^3). $$

Now substitute a nilpotent increment with real weights $a$ and $b$:

$$ h = a\varepsilon_1 + b\varepsilon_2. $$

Since:

$$ \varepsilon_1^2 = \varepsilon_2^2 = 0, $$

the square becomes:

$$ h^2 = 2ab\varepsilon_1\varepsilon_2, $$

and every higher power vanishes ($h^3 = 0$), so the expansion terminates exactly:

$$ f(x+h) = f(x) + f'(x)(a\varepsilon_1+b\varepsilon_2) + f''(x)ab\varepsilon_1\varepsilon_2. $$

The coefficient of:

$$ \varepsilon_1\varepsilon_2 $$

is $ab\,f''(x)$; with unit seeds $a=b=1$ it is exactly the second derivative.

Example

Let:

$$ f(x)=x^3. $$

Use the hyper-dual input:

$$ x+\varepsilon_1+\varepsilon_2. $$

Expand:

$$ (x+\varepsilon_1+\varepsilon_2)^3. $$

First compute:

$$ (x+h)^3 = x^3 + 3x^2h + 3xh^2 + h^3. $$

Since:

$$ h=\varepsilon_1+\varepsilon_2, $$

and:

$$ h^2 = 2\varepsilon_1\varepsilon_2, $$

while:

$$ h^3=0, $$

we obtain:

$$ x^3 + 3x^2(\varepsilon_1+\varepsilon_2) + 6x\varepsilon_1\varepsilon_2. $$

Thus:

| Coefficient | Value |
|---|---|
| $1$ | $x^3$ |
| $\varepsilon_1$ | $3x^2$ |
| $\varepsilon_2$ | $3x^2$ |
| $\varepsilon_1\varepsilon_2$ | $6x$ |

Since:

$$ f''(x)=6x, $$

the mixed coefficient gives the exact second derivative.
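The same expansion can be checked numerically. The sketch below reuses the struct layout and multiplication rule given later in this chapter; the `Cube` helper and the evaluation point $x=2$ are assumptions made for illustration. At $x=2$ the four coefficients should be $x^3=8$, $3x^2=12$, $3x^2=12$ and $6x=12$.

```go
package main

import "fmt"

// HyperDual holds the coefficients of a + b·ε1 + c·ε2 + d·ε1ε2.
type HyperDual struct {
	Val, D1, D2, D12 float64
}

// Mul multiplies two hyper-dual numbers; terms with ε1² or ε2² vanish.
func Mul(x, y HyperDual) HyperDual {
	return HyperDual{
		Val: x.Val * y.Val,
		D1:  x.D1*y.Val + x.Val*y.D1,
		D2:  x.D2*y.Val + x.Val*y.D2,
		D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
	}
}

// Cube evaluates f(x) = x³ by repeated multiplication.
func Cube(x HyperDual) HyperDual {
	return Mul(Mul(x, x), x)
}

func main() {
	x := HyperDual{Val: 2, D1: 1, D2: 1} // x + ε1 + ε2 at x = 2
	r := Cube(x)
	fmt.Println(r.Val, r.D1, r.D2, r.D12) // x³, 3x², 3x², 6x evaluated at x = 2
}
```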

Multivariable Functions

Hyper-dual numbers naturally extend to multivariate functions.

Suppose:

$$ f : \mathbb{R}^n \to \mathbb{R}. $$

Choose two perturbation directions:

$$ u,v \in \mathbb{R}^n. $$

Evaluate:

$$ x + u\varepsilon_1 + v\varepsilon_2. $$

Then:

$$ f(x+u\varepsilon_1+v\varepsilon_2) $$

expands to:

$$ f(x) + Df_x(u)\varepsilon_1 + Df_x(v)\varepsilon_2 + u^T H_x v \, \varepsilon_1\varepsilon_2. $$

The mixed coefficient gives the Hessian bilinear form:

$$ u^T H_x v. $$

This computes exact second-order directional derivatives.

Hessian Extraction

To compute a Hessian entry:

$$ \frac{\partial^2 f}{\partial x_i \partial x_j}, $$

seed:

$$ u=e_i, \quad v=e_j. $$

Then the coefficient of:

$$ \varepsilon_1\varepsilon_2 $$

is exactly:

$$ H_{ij}. $$

Repeated evaluation recovers the full Hessian matrix.
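The seeding loop can be sketched as follows. The `Hessian` helper, the `Add` function and the test function $f(x,y)=x^2y+y^3$ are assumptions introduced for this example; only the struct layout and multiplication rule come from this chapter. For this $f$ the Hessian is $\begin{pmatrix} 2y & 2x \\ 2x & 6y \end{pmatrix}$.

```go
package main

import "fmt"

// HyperDual holds the coefficients of a + b·ε1 + c·ε2 + d·ε1ε2.
type HyperDual struct{ Val, D1, D2, D12 float64 }

// Add sums two hyper-dual numbers component by component.
func Add(x, y HyperDual) HyperDual {
	return HyperDual{x.Val + y.Val, x.D1 + y.D1, x.D2 + y.D2, x.D12 + y.D12}
}

// Mul multiplies two hyper-dual numbers; terms with ε1² or ε2² vanish.
func Mul(x, y HyperDual) HyperDual {
	return HyperDual{
		Val: x.Val * y.Val,
		D1:  x.D1*y.Val + x.Val*y.D1,
		D2:  x.D2*y.Val + x.Val*y.D2,
		D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
	}
}

// Hessian fills H[i][j] by seeding ε1 along e_i and ε2 along e_j,
// using one evaluation of f per entry.
func Hessian(f func([]HyperDual) HyperDual, x []float64) [][]float64 {
	n := len(x)
	h := make([][]float64, n)
	for i := range h {
		h[i] = make([]float64, n)
		for j := 0; j < n; j++ {
			in := make([]HyperDual, n)
			for k := range in {
				in[k].Val = x[k]
			}
			in[i].D1 = 1 // u = e_i
			in[j].D2 = 1 // v = e_j
			h[i][j] = f(in).D12
		}
	}
	return h
}

// example is the hypothetical test function f(x, y) = x²y + y³.
func example(v []HyperDual) HyperDual {
	x, y := v[0], v[1]
	return Add(Mul(Mul(x, x), y), Mul(Mul(y, y), y))
}

func main() {
	fmt.Println(Hessian(example, []float64{1, 2}))
}
```

This version evaluates all $n^2$ entries for clarity; a symmetric Hessian only needs the entries with $j \ge i$.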

Example: Two Variables

Let:

$$ f(x,y)=x^2y+\sin(xy). $$

Choose perturbations:

$$ x \mapsto x+\varepsilon_1 $$

$$ y \mapsto y+\varepsilon_2. $$

Then the product $xy$ becomes:

$$ (x+\varepsilon_1)(y+\varepsilon_2) = xy + y\varepsilon_1 + x\varepsilon_2 + \varepsilon_1\varepsilon_2. $$

Mixed terms appear automatically.

Expanding the entire function and collecting terms leaves a single coefficient of:

$$ \varepsilon_1\varepsilon_2, $$

which equals the mixed partial derivative:

$$ \frac{\partial^2 f}{\partial x\partial y}. $$

No symbolic differentiation is needed.
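A sketch of this evaluation in Go follows. The `Sin` lifting rule is not stated in this chapter; it is derived here from the Taylor expansion above, which for $h = b\varepsilon_1 + c\varepsilon_2 + d\varepsilon_1\varepsilon_2$ gives $g(a+h) = g(a) + g'(a)h + g''(a)\,bc\,\varepsilon_1\varepsilon_2$. The function names `Add`, `Sin`, `F` and `MixedPartial` are illustrative. The hand-derived value $2x + \cos(xy) - xy\sin(xy)$ is printed alongside only as a reference for comparison.

```go
package main

import (
	"fmt"
	"math"
)

// HyperDual holds the coefficients of a + b·ε1 + c·ε2 + d·ε1ε2.
type HyperDual struct{ Val, D1, D2, D12 float64 }

// Add sums two hyper-dual numbers component by component.
func Add(x, y HyperDual) HyperDual {
	return HyperDual{x.Val + y.Val, x.D1 + y.D1, x.D2 + y.D2, x.D12 + y.D12}
}

// Mul multiplies two hyper-dual numbers; terms with ε1² or ε2² vanish.
func Mul(x, y HyperDual) HyperDual {
	return HyperDual{
		Val: x.Val * y.Val,
		D1:  x.D1*y.Val + x.Val*y.D1,
		D2:  x.D2*y.Val + x.Val*y.D2,
		D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
	}
}

// Sin lifts sin to hyper-dual numbers using sin' = cos and sin'' = -sin.
func Sin(x HyperDual) HyperDual {
	s, c := math.Sin(x.Val), math.Cos(x.Val)
	return HyperDual{
		Val: s,
		D1:  c * x.D1,
		D2:  c * x.D2,
		D12: c*x.D12 - s*x.D1*x.D2,
	}
}

// F evaluates f(x, y) = x²y + sin(xy).
func F(x, y HyperDual) HyperDual {
	return Add(Mul(Mul(x, x), y), Sin(Mul(x, y)))
}

// MixedPartial returns ∂²f/∂x∂y by seeding ε1 on x and ε2 on y.
func MixedPartial(x, y float64) float64 {
	return F(HyperDual{Val: x, D1: 1}, HyperDual{Val: y, D2: 1}).D12
}

func main() {
	x, y := 0.5, 1.5
	got := MixedPartial(x, y)
	want := 2*x + math.Cos(x*y) - x*y*math.Sin(x*y) // hand-derived reference
	fmt.Println(got, want)
}
```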

Exactness

Hyper-dual differentiation is exact up to floating-point arithmetic.

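Compared with the common alternatives:

| Method | Main error source or cost |
|---|---|
| Finite differences | truncation + cancellation |
| Symbolic differentiation | expression explosion |
| Hyper-dual numbers | floating-point rounding only |

No step size is required.

No difference quotient is formed, so the subtractive cancellation that limits finite differences does not occur.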

The derivative structure emerges algebraically.
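To make the contrast concrete, the sketch below compares a hyper-dual evaluation of $f''(x)$ for $f(x)=x^3$ with a central second difference at several step sizes. The function names and the chosen step sizes are assumptions for illustration. For a cubic the central difference has no truncation error in exact arithmetic, so whatever error it shows comes from rounding and cancellation, which typically worsens as the step shrinks.

```go
package main

import (
	"fmt"
	"math"
)

// HyperDual holds the coefficients of a + b·ε1 + c·ε2 + d·ε1ε2.
type HyperDual struct{ Val, D1, D2, D12 float64 }

// Mul multiplies two hyper-dual numbers; terms with ε1² or ε2² vanish.
func Mul(x, y HyperDual) HyperDual {
	return HyperDual{
		Val: x.Val * y.Val,
		D1:  x.D1*y.Val + x.Val*y.D1,
		D2:  x.D2*y.Val + x.Val*y.D2,
		D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
	}
}

// SecondDerivHD returns f''(x) for f(x) = x³ from one hyper-dual evaluation.
func SecondDerivHD(x float64) float64 {
	h := HyperDual{Val: x, D1: 1, D2: 1}
	return Mul(Mul(h, h), h).D12
}

// SecondDerivFD approximates f''(x) for f(x) = x³ with a central difference.
func SecondDerivFD(x, step float64) float64 {
	f := func(t float64) float64 { return t * t * t }
	return (f(x+step) - 2*f(x) + f(x-step)) / (step * step)
}

func main() {
	x, exact := 2.0, 12.0
	fmt.Println("hyper-dual error:", math.Abs(SecondDerivHD(x)-exact))
	for _, step := range []float64{1e-2, 1e-4, 1e-6, 1e-8} {
		err := math.Abs(SecondDerivFD(x, step) - exact)
		fmt.Println("step", step, "finite-difference error:", err)
	}
}
```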

Algebraic Structure

The hyper-dual algebra can be written:

$$ \mathbb{R}[\varepsilon_1,\varepsilon_2] / (\varepsilon_1^2,\varepsilon_2^2). $$

Basis elements are:

$$ 1, \varepsilon_1, \varepsilon_2, \varepsilon_1\varepsilon_2. $$

Dimension is four.

Multiplication rules:

| Product | Result |
|---|---|
| $\varepsilon_1^2$ | $0$ |
| $\varepsilon_2^2$ | $0$ |
| $\varepsilon_1\varepsilon_2$ | survives |
| $(\varepsilon_1\varepsilon_2)^2$ | $0$ |

This carefully chosen nilpotent structure isolates second-order interactions.

Computational Interpretation

A hyper-dual number may be represented as:

type HyperDual struct {
    Val  float64
    D1   float64
    D2   float64
    D12  float64
}

Components represent:

| Field | Meaning |
|---|---|
| Val | primal value |
| D1 | first derivative along direction 1 |
| D2 | first derivative along direction 2 |
| D12 | mixed second derivative |

Multiplication Rule

Suppose:

$$ x=(a,b,c,d) $$

and

$$ y=(p,q,r,s). $$

Then multiplication becomes:

$$ xy= ( ap, aq+bp, ar+cp, as+br+cq+dp ). $$

The mixed term obeys the second-order product rule automatically.
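Expanding the product term by term and discarding every term that contains $\varepsilon_1^2$ or $\varepsilon_2^2$ shows where each component comes from:

$$ (a + b\varepsilon_1 + c\varepsilon_2 + d\varepsilon_1\varepsilon_2)(p + q\varepsilon_1 + r\varepsilon_2 + s\varepsilon_1\varepsilon_2) = ap + (aq+bp)\varepsilon_1 + (ar+cp)\varepsilon_2 + (as+br+cq+dp)\varepsilon_1\varepsilon_2. $$

The four contributions to the last coefficient are the two cross terms $br$ and $cq$ plus the products $as$ and $dp$ of a primal value with a mixed component.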

Example Implementation

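// Mul returns the product x*y in the hyper-dual algebra.
func Mul(x, y HyperDual) HyperDual {
    return HyperDual{
        Val: x.Val * y.Val,

        // First-order product rule in each direction.
        D1: x.D1*y.Val + x.Val*y.D1,
        D2: x.D2*y.Val + x.Val*y.D2,

        // Second-order product rule: both cross terms contribute.
        D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
    }
}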

The D12 component contains all mixed second-order interactions.

Relation to Hessian-Vector Products

Hyper-dual numbers compute second-order directional derivatives naturally.

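To obtain the bilinear form:

$$ u^T H v, $$

evaluate $f$ at:

$$ x + u\varepsilon_1 + v\varepsilon_2. $$

The coefficient of:

$$ \varepsilon_1\varepsilon_2 $$

is the result.

Holding $v$ fixed and taking $u=e_i$ for each $i$ yields the components of the Hessian-vector product $Hv$, which avoids explicit Hessian construction.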

For large systems, Hessian-vector products are often preferable to dense Hessians.
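A sketch of both operations follows. It relies on the identity $(Hv)_i = e_i^T H v$; the helper names `Bilinear` and `HessVec` and the test function $f(x,y)=x^2y$ are assumptions made for this example. Each call to `Bilinear` costs one hyper-dual evaluation of $f$, so `HessVec` costs $n$ of them.

```go
package main

import "fmt"

// HyperDual holds the coefficients of a + b·ε1 + c·ε2 + d·ε1ε2.
type HyperDual struct{ Val, D1, D2, D12 float64 }

// Mul multiplies two hyper-dual numbers; terms with ε1² or ε2² vanish.
func Mul(x, y HyperDual) HyperDual {
	return HyperDual{
		Val: x.Val * y.Val,
		D1:  x.D1*y.Val + x.Val*y.D1,
		D2:  x.D2*y.Val + x.Val*y.D2,
		D12: x.D12*y.Val + x.D1*y.D2 + x.D2*y.D1 + x.Val*y.D12,
	}
}

// Bilinear returns uᵀHv for f at x using one hyper-dual evaluation.
func Bilinear(f func([]HyperDual) HyperDual, x, u, v []float64) float64 {
	in := make([]HyperDual, len(x))
	for k := range in {
		in[k] = HyperDual{Val: x[k], D1: u[k], D2: v[k]}
	}
	return f(in).D12
}

// HessVec returns Hv by taking u = e_i for each coordinate i.
func HessVec(f func([]HyperDual) HyperDual, x, v []float64) []float64 {
	out := make([]float64, len(x))
	for i := range x {
		e := make([]float64, len(x))
		e[i] = 1
		out[i] = Bilinear(f, x, e, v)
	}
	return out
}

// example is the hypothetical test function f(x, y) = x²y.
func example(p []HyperDual) HyperDual {
	return Mul(Mul(p[0], p[0]), p[1])
}

func main() {
	x := []float64{1, 2}
	u := []float64{1, 2}
	v := []float64{3, -1}
	fmt.Println(Bilinear(example, x, u, v))
	fmt.Println(HessVec(example, x, v))
}
```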

Perturbation Confusion

Nested dual-number systems may accidentally mix perturbation symbols.

Hyper-dual numbers avoid this by explicitly separating infinitesimal generators:

$$ \varepsilon_1, \varepsilon_2. $$

Each perturbation direction remains algebraically distinct.

This improves correctness in higher-order implementations.

Relation to Truncated Polynomial Algebras

Hyper-dual numbers differ from ordinary truncated polynomial algebras.

Truncated polynomial algebra:

$$ \mathbb{R}[\varepsilon]/(\varepsilon^3) $$

keeps powers:

$$ 1,\varepsilon,\varepsilon^2. $$

Hyper-dual algebra instead keeps:

$$ 1, \varepsilon_1, \varepsilon_2, \varepsilon_1\varepsilon_2. $$

This distinction matters:

| Structure | Stores |
|---|---|
| Truncated polynomial | repeated derivatives |
| Hyper-dual | mixed derivatives |

Hyper-dual systems are particularly effective for Hessian computation.
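For contrast, here is a minimal sketch of arithmetic in $\mathbb{R}[\varepsilon]/(\varepsilon^3)$. The `Trunc2` type and its functions are illustrative names, not an existing library. The $\varepsilon^2$ coefficient holds the Taylor coefficient $f''(x)/2$, so the repeated derivative is recovered by multiplying by two, and there is no separate slot for a mixed derivative in two directions.

```go
package main

import "fmt"

// Trunc2 is a hypothetical element c0 + c1·ε + c2·ε² of R[ε]/(ε³).
type Trunc2 struct{ C0, C1, C2 float64 }

// Mul multiplies two elements and drops every term of degree 3 or higher.
func Mul(x, y Trunc2) Trunc2 {
	return Trunc2{
		C0: x.C0 * y.C0,
		C1: x.C0*y.C1 + x.C1*y.C0,
		C2: x.C0*y.C2 + x.C1*y.C1 + x.C2*y.C0,
	}
}

// SecondDeriv returns f''(x) for f(x) = x³.
// The ε² coefficient stores f''(x)/2, hence the factor of 2.
func SecondDeriv(x float64) float64 {
	t := Trunc2{C0: x, C1: 1} // seed x + ε
	r := Mul(Mul(t, t), t)
	return 2 * r.C2
}

func main() {
	fmt.Println(SecondDeriv(2))
}
```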

Complexity

For $n$ variables:

  • one forward dual pass computes one directional derivative
  • one hyper-dual pass computes one second-order directional interaction

Dense Hessian construction still requires $n(n+1)/2$ hyper-dual evaluations, one per entry of the upper triangle, since the Hessian is symmetric.

However, the method remains exact and compositional.

Geometric Interpretation

Dual numbers represent tangent vectors.

Hyper-dual numbers represent interacting tangent directions.

The mixed product:

$$ \varepsilon_1\varepsilon_2 $$

captures curvature.

First-order infinitesimals describe local linear geometry.

Second-order mixed infinitesimals describe local quadratic geometry.

Hyper-dual numbers therefore encode second-order local structure.

Summary

Hyper-dual numbers extend dual numbers by introducing multiple independent nilpotent directions whose mixed products survive.

The algebra:

$$ \mathbb{R}[\varepsilon_1,\varepsilon_2] / (\varepsilon_1^2,\varepsilon_2^2) $$

produces exact second derivatives through ordinary program evaluation.

Key properties:

| Feature | Result |
|---|---|
| Independent infinitesimals | separate derivative directions |
| Mixed products survive | second-order information |
| No finite differences | exact differentiation |
| Local algebraic propagation | automatic Hessian computation |
| Structured nilpotency | stable higher-order AD |

Hyper-dual numbers provide one of the cleanest exact formulations of second-order automatic differentiation.