Robotics and Control

Robotics and control systems interact with the physical world through sensing, estimation, planning, and actuation. Automatic differentiation (AD) matters here because modern control pipelines increasingly depend on optimization, simulation, system identification, and differentiable models, all of which need derivatives.

A robotic system is often represented as a dynamical system:

$$ x_{t+1}=f(x_t,u_t,\theta), $$

where:

Symbol	Meaning
$x_t$	system state
$u_t$	control input
$\theta$	physical or model parameters

A controller chooses actions

$$ u_t = \pi(x_t), $$

to optimize some objective.

The objective may involve trajectory tracking, energy consumption, stability, collision avoidance, or task completion. AD computes derivatives needed for optimization, planning, estimation, and learning.

Dynamics as Differentiable Programs

A robot simulator can be viewed as a computational graph:

state
    -> dynamics
    -> integration
    -> contact resolution
    -> sensor model
    -> cost function

Differentiating this graph gives sensitivities of trajectories and objectives with respect to controls, parameters, or initial conditions.
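As a minimal sketch of what differentiating such a graph means, the following pushes a derivative through a short rollout with a hand-rolled forward-mode dual number. The scalar dynamics $x_{t+1}=x_t+\Delta t(-\theta x_t+u_t)$, the quadratic cost, and the names `Dual`, `lift`, and `rollout_cost` are illustrative assumptions, not part of any particular simulator.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """A value together with its derivative w.r.t. one chosen input."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        # Product rule: d(ab) = da * b + a * db
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def rollout_cost(x0, controls, theta, dt=0.1):
    """Hypothetical toy simulator: step the dynamics and accumulate sum of x_t^2."""
    x, cost = x0, 0.0
    for u in controls:
        x = x + dt * (-1.0 * theta * x + u)   # dynamics with an explicit Euler step
        cost = cost + x * x                   # running cost
    return cost


controls = [1.0, 0.5, 0.0]
# Seeding dot=1 on theta yields d(cost)/d(theta) alongside the cost itself.
result = rollout_cost(1.0, controls, Dual(2.0, 1.0))
print(result.val, result.dot)
```

Forward mode carries one input direction per pass, so it fits a few parameters; for many controls and one scalar cost, reverse mode is usually the better match.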

This enables:

trajectory optimization,
model predictive control,
differentiable simulation,
policy learning,
system identification,
calibration,
inverse dynamics.

Equations of Motion

Rigid-body dynamics are commonly written as

$$ M(q)\ddot q + C(q,\dot q)\dot q + g(q)=\tau, $$

where:

Symbol	Meaning
$q$	generalized coordinates
$M(q)$	mass matrix
$C(q,\dot q)$	Coriolis and centrifugal terms
$g(q)$	gravity
$\tau$	generalized forces

The state is typically

$$ x=(q,\dot q). $$

A simulator numerically integrates the resulting ODE or DAE.
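A minimal sketch of such an integrator, assuming a single-joint point-mass pendulum so that $M(q)=ml^2$, the Coriolis term vanishes, and $g(q)=mgl\sin q$. The function names, parameter defaults, and the choice of semi-implicit Euler are illustrative assumptions.

```python
import math


def pendulum_step(q, qd, tau, dt=0.001, m=1.0, length=1.0, g=9.81):
    """One semi-implicit Euler step of M(q) qdd + g(q) = tau."""
    M = m * length * length                  # 1x1 mass matrix
    gravity = m * g * length * math.sin(q)   # gravity term g(q)
    qdd = (tau - gravity) / M                # solve the equation of motion for qdd
    qd_next = qd + dt * qdd                  # update velocity first
    q_next = q + dt * qd_next                # then position, using the new velocity
    return q_next, qd_next


def simulate(q0, qd0, tau, steps):
    """Apply a constant torque for a fixed number of steps."""
    q, qd = q0, qd0
    for _ in range(steps):
        q, qd = pendulum_step(q, qd, tau)
    return q, qd


# A holding torque that balances gravity at q = 0.5 rad keeps the pendulum at rest.
hold = 1.0 * 9.81 * 1.0 * math.sin(0.5)
print(simulate(0.5, 0.0, hold, 1000))
```

Every operation in `pendulum_step` is smooth in `q`, `qd`, `tau`, and the physical parameters, which is what lets an AD tool differentiate the whole rollout.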

AD computes derivatives of trajectories or costs with respect to:

torques,
masses,
inertias,
geometry,
friction coefficients,
controller parameters.

Trajectory Optimization

Trajectory optimization searches for a control sequence minimizing a cost:

$$ \min_{u_0,\ldots,u_{T-1}} \sum_{t=0}^{T-1} \ell(x_t,u_t) + \ell_T(x_T), $$

subject to dynamics constraints

$$ x_{t+1}=f(x_t,u_t). $$

The gradient of the total objective depends on derivatives propagated through the dynamics.

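Reverse-mode AD suits these gradients: the objective is a single scalar, the controls are many, and the trajectory is a sequential computation graph, so one backward sweep yields the gradient with respect to every $u_t$.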
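The backward sweep can be written by hand for a small case. The sketch below assumes scalar linear dynamics $x_{t+1}=ax_t+bu_t$, stage cost $\tfrac12(x_t^2+ru_t^2)$, and terminal cost $\tfrac12x_T^2$; these choices and all function names are illustrative. It mirrors what a reverse-mode tool does automatically.

```python
def rollout(x0, us, a=0.9, b=0.5):
    """Forward pass: store every state, since the reverse pass needs them."""
    xs = [x0]
    for u in us:
        xs.append(a * xs[-1] + b * u)
    return xs


def cost(x0, us, a=0.9, b=0.5, r=0.1):
    """Sum of stage costs plus the terminal cost."""
    xs = rollout(x0, us, a, b)
    running = sum(0.5 * (x * x + r * u * u) for x, u in zip(xs[:-1], us))
    return running + 0.5 * xs[-1] ** 2


def cost_gradient(x0, us, a=0.9, b=0.5, r=0.1):
    """Reverse pass: lam holds dJ/dx_{t+1} while step t is visited."""
    xs = rollout(x0, us, a, b)
    lam = xs[-1]                       # derivative of the terminal cost
    grad = [0.0] * len(us)
    for t in reversed(range(len(us))):
        grad[t] = r * us[t] + b * lam  # dJ/du_t
        lam = xs[t] + a * lam          # dJ/dx_t
    return grad


us = [0.3, -0.2, 0.1, 0.0]
print(cost_gradient(1.0, us))
```

One forward and one backward pass give the full gradient, whatever the horizon length; the price is storing the states `xs`.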

Optimal Control and Adjoint Equations

Continuous-time optimal control uses dynamics

$$ \dot x=f(x,u), $$

and cost

$$ J = \int_0^T \ell(x(t),u(t)) dt + \phi(x(T)). $$

Pontryagin's principle introduces the adjoint state

$$ \lambda(t). $$

The Hamiltonian is

$$ H(x,u,\lambda) = \ell(x,u) + \lambda^\top f(x,u). $$

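The adjoint equation is

$$ \dot\lambda = - \frac{\partial H}{\partial x}, \qquad \lambda(T) = \frac{\partial \phi}{\partial x}\bigg|_{x(T)}, $$

integrated backward in time from the terminal condition.

This is the continuous-time counterpart of reverse-mode differentiation through the trajectory; the two agree up to discretization error.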

The connection between control theory and reverse-mode AD is deep:

Control theory	AD interpretation
Adjoint state	Reverse gradient
Costate equation	Backward sensitivity propagation
Hamiltonian derivatives	Local Jacobian actions
Shooting method	Gradient-based trajectory optimization
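The correspondence is easiest to see in discrete time. For the trajectory optimization problem stated earlier, with stage cost $\ell$, terminal cost $\ell_T$, and dynamics $f$, write $J$ for the total cost and $\lambda_t=\partial J/\partial x_t$. A sketch of the backward recursion, assuming no constraints beyond the dynamics:

$$ \lambda_T = \frac{\partial \ell_T}{\partial x_T}, \qquad \lambda_t = \frac{\partial \ell}{\partial x_t} + \left(\frac{\partial f}{\partial x_t}\right)^{\top} \lambda_{t+1}, \qquad \frac{\partial J}{\partial u_t} = \frac{\partial \ell}{\partial u_t} + \left(\frac{\partial f}{\partial u_t}\right)^{\top} \lambda_{t+1}. $$

Each transposed-Jacobian product is a vector-Jacobian product, which is the operation a reverse-mode sweep evaluates at every step.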

Model Predictive Control

Model predictive control repeatedly solves a finite-horizon optimization problem:

estimate current state,
optimize future controls,
apply first action,
repeat.

Each solve needs derivatives of the dynamics, cost, and constraints with respect to states and controls.

AD is useful because modern MPC systems may contain:

nonlinear dynamics,
learned components,
differentiable constraints,
neural cost terms,
differentiable collision models.

Instead of deriving gradients manually, the controller differentiates the simulation and objective directly.
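The receding-horizon loop can be sketched as follows, assuming a scalar, open-loop-unstable model $x_{t+1}=ax_t+bu_t$ with $a=1.1$, a quadratic cost, and plain gradient descent as the inner optimizer. The model, the cost, the step size, and the names `plan` and `run_mpc` are illustrative assumptions; the gradient is written out by hand in place of an AD call.

```python
def plan(x0, horizon=5, iters=100, lr=0.05, a=1.1, b=0.5, r=0.1):
    """Optimize a control sequence by gradient descent on the rollout cost."""
    us = [0.0] * horizon
    for _ in range(iters):
        # Forward pass: roll out the model.
        xs = [x0]
        for u in us:
            xs.append(a * xs[-1] + b * u)
        # Reverse pass: lam holds dJ/dx_{t+1} while step t is visited.
        lam = xs[-1]
        grad = [0.0] * horizon
        for t in reversed(range(horizon)):
            grad[t] = r * us[t] + b * lam
            lam = xs[t] + a * lam
        us = [u - lr * g for u, g in zip(us, grad)]
    return us


def run_mpc(x0, steps=20, a=1.1, b=0.5):
    """Closed loop: replan at every step and apply only the first control."""
    x = x0
    history = [x]
    for _ in range(steps):
        us = plan(x, a=a, b=b)   # optimize future controls from the current state
        x = a * x + b * us[0]    # apply the first action to the (here identical) plant
        history.append(x)        # repeat from the new state
    return history


print(run_mpc(1.0)[-1])
```

Here the plant and the model coincide and the state is measured exactly; a real controller would insert a state estimator and would typically warm-start each solve from the previous plan.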

Differentiable Simulation

A differentiable simulator exposes derivatives of simulated outcomes with respect to simulation inputs.

Suppose the simulation evolves as

$$ x_{t+1}=\Phi(x_t,u_t,\theta). $$

A differentiable simulator computes quantities such as

$$ \frac{\partial x_T}{\partial u_t}, \qquad \frac{\partial x_T}{\partial \theta}, \qquad \frac{\partial L}{\partial x_t}, $$

where $L$ is a scalar loss defined on the trajectory.

Applications include:

Application	Purpose
Robot design	Optimize morphology
Policy learning	Differentiate reward
System identification	Fit physical parameters
Sim-to-real transfer	Adapt simulator parameters
Grasp optimization	Optimize contact behavior
Motion planning	Optimize trajectories

Contact Dynamics

Contact is one of the hardest parts of differentiable robotics.

Rigid contact often introduces discontinuities:

$$ v^+ = R(v^-), $$

where collisions instantaneously change velocity.

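Non-penetration constraints introduce complementarity conditions:

$$ 0 \le \lambda_n \perp \phi(q)\ge 0, $$

where $\lambda_n$ is the normal contact force and $\phi(q)$ is the signed gap distance: the force can be nonzero only when the gap is closed. Coulomb friction adds further constraints that couple the tangential force to $\lambda_n$.

These systems are piecewise smooth or non-smooth.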

Naive AD through contact solvers may produce:

undefined gradients,
unstable sensitivities,
zero gradients,
discontinuous optimization behavior.

Common strategies include:

Strategy	Idea
Soft contact	Replace hard contact with smooth penalty
Implicit differentiation	Differentiate converged contact solve
Relaxed complementarity	Smooth inequality conditions
Hybrid methods	Analytical contact derivatives

Contact differentiation remains an active research area.

Inverse Kinematics

Inverse kinematics solves for joint angles that produce a desired end-effector pose. If forward kinematics is

$$ x = f(q), $$

then inverse kinematics solves

$$ f(q)=x^*. $$

Optimization form:

$$ L(q)=\frac{1}{2}\lVert f(q)-x^*\rVert^2. $$

The Jacobian

$$ J(q)=\frac{\partial f}{\partial q} $$

maps joint velocities to task-space velocities:

$$ \dot x = J(q)\dot q. $$

AD computes these Jacobians automatically, especially for complex articulated systems.
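A minimal sketch of Jacobian-based inverse kinematics, assuming a planar two-link arm with unit link lengths and a damped least-squares update. The Jacobian is derived by hand because the arm is small; for a complex mechanism an AD tool would generate it from `forward_kinematics`. All names and parameter values are illustrative.

```python
import math


def forward_kinematics(q, l1=1.0, l2=1.0):
    """End-effector position of a planar two-link arm."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return x, y


def jacobian(q, l1=1.0, l2=1.0):
    """Hand-derived df/dq (2x2)."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def solve_ik(target, q, iters=20, damping=1e-3):
    """Damped least squares: q <- q - (J^T J + mu I)^{-1} J^T (f(q) - x*)."""
    q = list(q)
    for _ in range(iters):
        fx, fy = forward_kinematics(q)
        rx, ry = fx - target[0], fy - target[1]
        J = jacobian(q)
        # Normal-equation matrix A = J^T J + mu I and right-hand side g = J^T r.
        a11 = J[0][0] ** 2 + J[1][0] ** 2 + damping
        a12 = J[0][0] * J[0][1] + J[1][0] * J[1][1]
        a22 = J[0][1] ** 2 + J[1][1] ** 2 + damping
        g1 = J[0][0] * rx + J[1][0] * ry
        g2 = J[0][1] * rx + J[1][1] * ry
        det = a11 * a22 - a12 * a12
        # Solve the 2x2 system A dq = g by Cramer's rule.
        q[0] -= (a22 * g1 - a12 * g2) / det
        q[1] -= (a11 * g2 - a12 * g1) / det
    return q


target = forward_kinematics([0.3, 0.8])   # a reachable target by construction
q_sol = solve_ik(target, [0.2, 0.6])
print(q_sol, forward_kinematics(q_sol))
```

The damping term keeps the update bounded near singular configurations such as a fully stretched arm, where $J(q)$ loses rank.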

State Estimation

Robotic systems estimate hidden states from sensor measurements.

A state estimator may combine:

inertial sensors,
cameras,
lidar,
wheel encoders,
GPS,
force sensors.

An optimization-based estimator minimizes residuals:

$$ L(x) = \sum_i \lVert r_i(x)\rVert^2. $$

AD computes Jacobians needed for Gauss-Newton or Levenberg-Marquardt optimization.

This is especially useful in SLAM and visual-inertial odometry, where residual structures are large and sparse.
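To make the role of the Jacobian concrete, stack the residuals into a vector $r(x)$ and write $J_r=\partial r/\partial x$ (notation introduced here for illustration). One Gauss-Newton iteration solves the linearized least-squares problem

$$ (J_r^\top J_r)\,\Delta x = -J_r^\top r(x), \qquad x \leftarrow x + \Delta x, $$

and Levenberg-Marquardt adds a damping term $\mu \ge 0$:

$$ (J_r^\top J_r + \mu I)\,\Delta x = -J_r^\top r(x). $$

Because each residual typically touches only a few states, $J_r$ and $J_r^\top J_r$ are sparse, and the linear solve can exploit that.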

Differentiable Perception

Modern robotic pipelines often integrate learned perception systems.

Example:

camera image
    -> neural perception model
    -> object pose estimate
    -> planner
    -> controller
    -> robot action

If the pipeline is differentiable end-to-end, gradients can flow from task objectives back into perception modules.

This enables:

task-aware perception,
differentiable sensor calibration,
policy gradients through perception,
learned observation models.

System Identification

System identification estimates physical parameters from observed trajectories.

Suppose a simulator predicts

$$ x_t(\theta). $$

Observed trajectories are

$$ \hat x_t. $$

The objective is

$$ L(\theta) = \sum_t \lVert x_t(\theta)-\hat x_t\rVert^2. $$

AD computes

$$ \nabla_\theta L. $$

This allows fitting:

masses,
friction coefficients,
motor constants,
damping,
actuator delays,
aerodynamic parameters.

Differentiable simulation has become a major tool for simulator calibration.
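A minimal sketch of gradient-based identification, assuming a scalar damped system $x_{t+1}=x_t-\Delta t\,\theta x_t$ with an unknown damping rate $\theta$ and synthetic, noise-free observations. The sensitivity $s_t=\partial x_t/\partial\theta$ is propagated by hand alongside the state, which is what forward-mode AD would do; the model, names, and step size are illustrative.

```python
def simulate(theta, x0=1.0, dt=0.1, steps=20):
    """Roll out the damped system and return x_1 .. x_steps."""
    xs, x = [], x0
    for _ in range(steps):
        x = x + dt * (-theta * x)
        xs.append(x)
    return xs


def loss_and_grad(theta, observed, x0=1.0, dt=0.1):
    """Squared trajectory error and its derivative with respect to theta."""
    x, s = x0, 0.0                      # s tracks d x_t / d theta
    loss, grad = 0.0, 0.0
    for x_hat in observed:
        s = s + dt * (-x - theta * s)   # differentiate the step (uses the old x)
        x = x + dt * (-theta * x)
        loss += (x - x_hat) ** 2
        grad += 2.0 * (x - x_hat) * s
    return loss, grad


observed = simulate(2.0)   # synthetic data generated with theta = 2.0
theta = 0.5                # initial guess
for _ in range(2000):
    _, grad = loss_and_grad(theta, observed)
    theta -= 0.02 * grad   # plain gradient descent
print(theta)
```

With measured data the loss would not reach zero, and a Gauss-Newton or quasi-Newton method would usually replace plain gradient descent.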

Reinforcement Learning and Control

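Many reinforcement learning problems can be posed as control problems, and when both the policy and the dynamics are differentiable, the whole rollout can be differentiated.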

A policy

$$ u_t=\pi_\theta(x_t) $$

interacts with dynamics:

$$ x_{t+1}=f(x_t,u_t). $$

The objective is expected return:

$$ J(\theta) = \mathbb{E} \left[ \sum_t r(x_t,u_t) \right]. $$

If the environment is differentiable, gradients can propagate directly through the dynamics. This often gives lower-variance updates than score-function estimators.

However, real environments contain discontinuities, stochasticity, and unmodeled effects. Pure differentiable control is therefore usually combined with robust or stochastic methods.

Sparse Structure

Robotic systems have structured Jacobians.

Examples:

Structure	Consequence
Kinematic trees	Block-sparse derivatives
Local contacts	Sparse coupling
Sequential dynamics	Banded time structure
Factor graphs	Sparse estimation systems

Efficient robotics AD systems exploit this sparsity rather than forming dense Jacobians.

Real-Time Constraints

Control systems often run under strict timing constraints.

Application	Typical timing
Motor control	microseconds to milliseconds
MPC	milliseconds
Flight control	sub-millisecond stability loops
SLAM updates	real-time sensor rates

Generic AD frameworks may be too slow or memory-heavy for loops at these rates.

Production systems often use:

custom derivative kernels,
ahead-of-time code generation,
symbolic simplification,
sparse linear algebra,
static computational graphs.

The derivative system must fit real-time constraints.

Numerical Stability

Robotics gradients can become unstable because of:

Cause	Effect
Chaotic contact sequences	Sensitive trajectories
Long horizons	Exploding or vanishing gradients
Poor scaling	Ill-conditioned optimization
Hard constraints	Non-smooth derivatives
Integrator error	Gradient mismatch
Solver tolerances	Noisy adjoints

Good differentiable robotics systems carefully define solver semantics and smoothing behavior.
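The long-horizon row can be checked with a one-line worked case. Assuming scalar linear dynamics $x_{t+1}=ax_t$ (an illustrative simplification), the sensitivity of the final state to the initial state is a product of per-step derivatives:

$$ \frac{\partial x_T}{\partial x_0} = \prod_{t=0}^{T-1} \frac{\partial x_{t+1}}{\partial x_t} = a^T, \qquad 1.1^{100} \approx 1.4\times 10^{4}, \qquad 0.9^{100} \approx 2.7\times 10^{-5}. $$

A 10% per-step growth or decay therefore changes the gradient by four to five orders of magnitude over 100 steps. For vector states the same product involves Jacobian matrices, and its growth is governed by their singular values.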

Differentiable Robot Design

Robot morphology itself can become an optimization variable.

Parameters may include:

link lengths,
masses,
actuator placement,
joint limits,
sensor locations.

A differentiable simulator computes how morphology affects task performance.

Optimization becomes:

$$ \min_\theta L(\theta), $$

where $\theta$ now defines robot structure rather than controller parameters.

This creates co-design systems where body and controller are optimized jointly.

Practical Architecture

A robust differentiable robotics stack typically separates:

Layer	Responsibility
Geometry	Kinematics and transforms
Dynamics	Equations of motion
Contact	Collision and friction
Integration	Time stepping
Estimation	Sensor fusion and optimization
Planning	Trajectory optimization
Control	Policy or feedback law
Learning	Gradient-based adaptation

Each layer should expose well-defined derivative rules.

Failure Modes

Differentiable robotics systems fail in characteristic ways.

Failure mode	Cause
Exploding trajectory gradients	Long unstable horizons
Zero contact gradients	Hard collision thresholds
Simulator mismatch	Real-world physics differs
Memory explosion	Reverse mode through long trajectories
Unstable optimization	Ill-conditioned dynamics
Nonphysical learned behavior	Weak constraints
Timing failure	AD overhead violates real-time limits

Many practical systems intentionally smooth dynamics or truncate gradients to maintain optimization stability.

Summary

Robotics and control systems lend themselves to differentiation because they evolve through structured dynamical equations that are smooth away from events such as contact. Automatic differentiation provides gradients for trajectory optimization, control, estimation, simulation, and learning.

The main challenges are contact discontinuities, long-horizon stability, sparse structure, real-time execution, and differentiating through numerical solvers. Effective systems combine AD with optimal control theory, sparse numerical methods, custom solver derivatives, and carefully designed simulation semantics.