Robotics and Control

Robotics and control systems interact with the physical world through sensing, estimation, planning, and actuation. Automatic differentiation is important because modern control pipelines increasingly depend on optimization, simulation, system identification, and differentiable models.

A robotic system is often represented as a dynamical system:

$$ x_{t+1}=f(x_t,u_t,\theta), $$

where:

| Symbol | Meaning |
| --- | --- |
| $x_t$ | system state |
| $u_t$ | control input |
| $\theta$ | physical or model parameters |

A controller chooses actions

$$ u_t = \pi(x_t), $$

to optimize some objective.

The objective may involve trajectory tracking, energy consumption, stability, collision avoidance, or task completion. AD computes derivatives needed for optimization, planning, estimation, and learning.

Dynamics as Differentiable Programs

A robot simulator can be viewed as a computational graph:

state
    -> dynamics
    -> integration
    -> contact resolution
    -> sensor model
    -> cost function

Differentiating this graph gives sensitivities of trajectories and objectives with respect to controls, parameters, or initial conditions.

This enables:

  • trajectory optimization,
  • model predictive control,
  • differentiable simulation,
  • policy learning,
  • system identification,
  • calibration,
  • inverse dynamics.
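As a rough illustration of differentiating a simulation loop, the sketch below pushes a dual number (value plus derivative) through a toy rollout. The `Dual` class, the `rollout` function, the unit point mass, and all numeric values are assumptions made for this example; a real system would rely on an AD library rather than a hand-written class.

```python
from dataclasses import dataclass


@dataclass
class Dual:
    """Value plus derivative with respect to one chosen input (forward mode)."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (ab)' = a'b + ab'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Wrap plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def rollout(x0, v0, u, damping, dt=0.01, steps=200):
    """Semi-implicit Euler for a unit point mass: dv/dt = u - damping * v."""
    x, v = x0, v0
    for _ in range(steps):
        v = v + dt * (u - damping * v)
        x = x + dt * v
    return x


# Seed the damping parameter with derivative 1 to obtain d x_T / d damping.
out = rollout(0.0, 1.0, 0.5, Dual(0.3, 1.0))
print("final position:", out.val)
print("sensitivity to damping:", out.dot)
```

The same `rollout` code runs on plain floats and on `Dual` values, which is the sense in which the simulator itself becomes the differentiable program.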

Equations of Motion

Rigid-body dynamics are commonly written as

$$ M(q)\ddot q + C(q,\dot q)\dot q + g(q)=\tau, $$

where:

| Symbol | Meaning |
| --- | --- |
| $q$ | generalized coordinates |
| $M(q)$ | mass matrix |
| $C(q,\dot q)$ | Coriolis and centrifugal terms |
| $g(q)$ | gravity |
| $\tau$ | generalized forces |

The state is typically

$$ x=(q,\dot q). $$

A simulator numerically integrates the resulting ODE or DAE.

AD computes derivatives of trajectories or costs with respect to:

  • torques,
  • masses,
  • inertias,
  • geometry,
  • friction coefficients,
  • controller parameters.
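As a small worked case (an illustration, not taken from a specific robot), consider a single pendulum with point mass $m$ on a massless rod of length $l$, with $q$ measured from the downward vertical and $g_0$ denoting gravitational acceleration. Then $M(q)=ml^2$, $C=0$, and $g(q)=m g_0 l\sin q$, so the equation of motion and one of its parameter sensitivities are:

$$ m l^2\,\ddot q + m g_0 l \sin q = \tau, \qquad \ddot q = \frac{\tau}{m l^2} - \frac{g_0}{l}\sin q, \qquad \frac{\partial \ddot q}{\partial m} = -\frac{\tau}{m^2 l^2}. $$

An AD tool applied to the forward-dynamics code produces derivatives of this kind without the expression being derived by hand, and composes them across integration steps.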

Trajectory Optimization

Trajectory optimization searches for a control sequence minimizing a cost:

$$ \min_{u_0,\ldots,u_{T-1}} \sum_{t=0}^{T-1} \ell(x_t,u_t) + \ell_T(x_T), $$

subject to dynamics constraints

$$ x_{t+1}=f(x_t,u_t). $$

The gradient of the total objective depends on derivatives propagated through the dynamics.

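Reverse-mode AD suits this problem because the objective is a single scalar that depends on many controls: one backward pass through the sequential computation graph of the trajectory yields the gradient with respect to every $u_t$ at a cost comparable to a few forward rollouts.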
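The backward pass can be written out by hand for a small case. The sketch below assumes a scalar linear system $x_{t+1}=a x_t + b u_t$ with quadratic stage and terminal costs; the function names and all constants are illustrative choices, and the hand-written recursion stands in for what a reverse-mode AD tool would generate.

```python
def rollout(x0, controls, a=1.0, b=0.1):
    """Simulate x_{t+1} = a * x_t + b * u_t and return all states."""
    xs = [x0]
    for u in controls:
        xs.append(a * xs[-1] + b * u)
    return xs


def cost(x0, controls, a=1.0, b=0.1, q=1.0, r=0.1, q_T=10.0):
    """Sum of stage costs 0.5*(q x^2 + r u^2) plus terminal cost 0.5*q_T x_T^2."""
    xs = rollout(x0, controls, a, b)
    stage = sum(0.5 * (q * x * x + r * u * u) for x, u in zip(xs, controls))
    return stage + 0.5 * q_T * xs[-1] ** 2


def cost_gradient(x0, controls, a=1.0, b=0.1, q=1.0, r=0.1, q_T=10.0):
    """Gradient of the cost w.r.t. every control via one backward sweep."""
    xs = rollout(x0, controls, a, b)
    lam = q_T * xs[-1]                      # dJ/dx_T
    grad = [0.0] * len(controls)
    for t in reversed(range(len(controls))):
        grad[t] = r * controls[t] + b * lam  # dJ/du_t uses dJ/dx_{t+1}
        lam = q * xs[t] + a * lam            # dJ/dx_t
    return grad


controls = [0.5, -0.2, 0.1, 0.0, 0.3]
print(cost_gradient(1.0, controls))
```

The backward loop visits each time step once, so the full gradient costs about as much as one extra rollout regardless of the number of controls.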

Optimal Control and Adjoint Equations

Continuous-time optimal control uses dynamics

$$ \dot x=f(x,u), $$

and cost

$$ J = \int_0^T \ell(x(t),u(t)) dt + \phi(x(T)). $$

Pontryagin's principle introduces the adjoint state

$$ \lambda(t). $$

The Hamiltonian is

$$ H(x,u,\lambda) = \ell(x,u) + \lambda^\top f(x,u). $$

The adjoint equation is

$$ \dot\lambda = - \frac{\partial H}{\partial x}. $$

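This is the continuous-time counterpart of reverse-mode differentiation through the trajectory: the adjoint is integrated backward in time from the terminal condition $\lambda(T)=\partial\phi/\partial x(T)$, just as a reverse pass propagates gradients from the output back toward the inputs. The two agree only up to discretization, since differentiating a discretized simulation and discretizing the adjoint ODE generally give slightly different gradients.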

The connection between control theory and reverse-mode AD is deep:

| Control theory | AD interpretation |
| --- | --- |
| Adjoint state | Reverse gradient |
| Costate equation | Backward sensitivity propagation |
| Hamiltonian derivatives | Local Jacobian actions |
| Shooting method | Gradient-based trajectory optimization |
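To make the correspondence concrete, here is the discrete-time analogue, written for the trajectory optimization problem with dynamics $x_{t+1}=f(x_t,u_t)$, stage cost $\ell$, and terminal cost $\ell_T$. The discrete costate $\lambda_t$ (an assumed notation, defined here as the gradient of the total cost with respect to $x_t$) satisfies a backward recursion:

$$ \lambda_T = \nabla_x \ell_T(x_T), \qquad \lambda_t = \nabla_x \ell(x_t,u_t) + \left(\frac{\partial f}{\partial x}(x_t,u_t)\right)^{\top}\lambda_{t+1}, $$

and the gradient with respect to each control follows from it:

$$ \nabla_{u_t} J = \nabla_u \ell(x_t,u_t) + \left(\frac{\partial f}{\partial u}(x_t,u_t)\right)^{\top}\lambda_{t+1}. $$

This recursion is what a reverse-mode tool executes when it differentiates a rollout: each step applies a transposed Jacobian to the incoming gradient.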

Model Predictive Control

Model predictive control repeatedly solves a finite-horizon optimization problem:

  1. estimate current state,
  2. optimize future controls,
  3. apply first action,
  4. repeat.

Each optimization uses system derivatives.

AD is useful because modern MPC systems may contain:

  • nonlinear dynamics,
  • learned components,
  • differentiable constraints,
  • neural cost terms,
  • differentiable collision models.

Instead of deriving gradients manually, the controller differentiates the simulation and objective directly.
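A minimal receding-horizon loop might look like the following sketch. It assumes a scalar, mildly unstable linear plant and quadratic costs, uses plain gradient descent as the inner optimizer, and writes the backward pass by hand in place of an AD call; the names `horizon_gradient`, `mpc_step`, `run_mpc` and all constants are hypothetical.

```python
def horizon_gradient(x0, us, a, b, q, r):
    """Gradient of sum 0.5*(q x_t^2 + r u_t^2) + 0.5*q x_T^2 w.r.t. the controls."""
    xs = [x0]
    for u in us:
        xs.append(a * xs[-1] + b * u)
    lam = q * xs[-1]                   # gradient w.r.t. the final state
    grad = [0.0] * len(us)
    for t in reversed(range(len(us))):
        grad[t] = r * us[t] + b * lam
        lam = q * xs[t] + a * lam
    return grad


def mpc_step(x, us, a, b, q=1.0, r=0.1, iters=200, lr=0.2):
    """Step 2 of the loop: optimize the future controls from the current state."""
    for _ in range(iters):
        g = horizon_gradient(x, us, a, b, q, r)
        us = [u - lr * gi for u, gi in zip(us, g)]
    return us


def run_mpc(x0=2.0, horizon=5, steps=30, a=1.05, b=0.5):
    """Estimate (here: read) the state, optimize, apply the first action, repeat."""
    x, us = x0, [0.0] * horizon
    history = [x]
    for _ in range(steps):
        us = mpc_step(x, us, a, b)
        x = a * x + b * us[0]          # apply only the first action to the plant
        us = us[1:] + [0.0]            # warm-start the next solve by shifting
        history.append(x)
    return history


history = run_mpc()
print("first states:", [round(v, 4) for v in history[:5]])
```

Real-time MPC solvers typically replace the gradient-descent inner loop with structured second-order methods, but the derivative information they consume is the same.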

Differentiable Simulation

A differentiable simulator exposes derivatives of simulated outcomes with respect to simulation inputs.

Suppose simulation evolves:

$$ x_{t+1}=\Phi(x_t,u_t,\theta). $$

For a scalar loss $L$ defined on the trajectory, a differentiable simulator computes quantities such as:

$$ \frac{\partial x_T}{\partial u_t}, \qquad \frac{\partial x_T}{\partial \theta}, \qquad \frac{\partial L}{\partial x_t}. $$

Applications include:

| Application | Purpose |
| --- | --- |
| Robot design | Optimize morphology |
| Policy learning | Differentiate reward |
| System identification | Fit physical parameters |
| Sim-to-real transfer | Adapt simulator parameters |
| Grasp optimization | Optimize contact behavior |
| Motion planning | Optimize trajectories |

Contact Dynamics

Contact is one of the hardest parts of differentiable robotics.

Rigid contact often introduces discontinuities:

$$ v^+ = R(v^-), $$

where collisions instantaneously change velocity.

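Non-penetration constraints introduce complementarity conditions:

$$ 0 \le \lambda_n \perp \phi(q)\ge 0, $$

where $\phi(q)$ is the gap (signed distance) between bodies and $\lambda_n$ is the normal contact force, which can be nonzero only when the gap is zero. Coulomb friction adds further complementarity conditions that couple the tangential force to the sliding velocity.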

These systems are piecewise smooth or non-smooth.

Naive AD through contact solvers may produce:

  • undefined gradients,
  • unstable sensitivities,
  • zero gradients,
  • discontinuous optimization behavior.

Common strategies include:

| Strategy | Idea |
| --- | --- |
| Soft contact | Replace hard contact with smooth penalty |
| Implicit differentiation | Differentiate converged contact solve |
| Relaxed complementarity | Smooth inequality conditions |
| Hybrid methods | Analytical contact derivatives |

Contact differentiation remains an active research area.
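The zero-gradient problem and the soft-contact remedy can be seen in a one-dimensional sketch. It assumes a penalty-style normal force as a function of the gap `phi`, with a softplus smoothing chosen purely for illustration; the stiffness `k` and `sharpness` values are arbitrary, and the derivatives are written by hand.

```python
import math


def hard_contact_force(phi, k=1000.0):
    """Penalty force that is exactly zero whenever the gap phi is positive."""
    return k * max(0.0, -phi)


def hard_contact_grad(phi, k=1000.0):
    """d force / d phi: zero out of contact, so no learning signal there."""
    return -k if phi < 0.0 else 0.0


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_contact_force(phi, k=1000.0, sharpness=50.0):
    """Softplus-smoothed penalty: k * log(1 + exp(-sharpness * phi)) / sharpness."""
    z = -sharpness * phi
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return k * softplus / sharpness


def soft_contact_grad(phi, k=1000.0, sharpness=50.0):
    """d force / d phi = -k * sigmoid(-sharpness * phi), nonzero for every gap."""
    return -k * _sigmoid(-sharpness * phi)


gap = 0.02  # bodies are 2 cm apart, not yet touching
print("hard gradient:", hard_contact_grad(gap))
print("soft gradient:", soft_contact_grad(gap))
```

The smoothed force acts at a distance, which is nonphysical; the `sharpness` parameter trades gradient quality against that modeling error.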

Inverse Kinematics

Inverse kinematics solves for joint angles producing a desired end-effector pose.

If

$$ x = f(q), $$

then inverse kinematics solves

$$ f(q)=x^*. $$

Optimization form:

$$ L(q)=\frac{1}{2}\|f(q)-x^*\|^2. $$

The Jacobian

$$ J(q)=\frac{\partial f}{\partial q} $$

maps joint velocities to task-space velocities:

$$ \dot x = J(q)\dot q. $$

AD computes these Jacobians automatically, especially for complex articulated systems.
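The sketch below illustrates the idea on a planar two-link arm. It assumes a tiny hand-rolled forward-mode `Dual` class in place of a real AD library, link lengths of 1.0 and 0.8, and a damped least-squares update with a half step for robustness; all of these are illustrative choices.

```python
import math


class Dual:
    """Minimal forward-mode number: value and one directional derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)

    __rmul__ = __mul__


def dsin(a):
    return Dual(math.sin(a.val), math.cos(a.val) * a.dot)


def dcos(a):
    return Dual(math.cos(a.val), -math.sin(a.val) * a.dot)


def forward_kinematics(q1, q2, l1=1.0, l2=0.8):
    """End-effector position x = f(q) of a planar two-link arm (Dual inputs)."""
    x = l1 * dcos(q1) + l2 * dcos(q1 + q2)
    y = l1 * dsin(q1) + l2 * dsin(q1 + q2)
    return x, y


def fk_and_jacobian(q):
    """Position and 2x2 Jacobian, using one forward-mode pass per joint."""
    cols = []
    for i in range(2):
        seeds = [Dual(q[j], 1.0 if i == j else 0.0) for j in range(2)]
        x, y = forward_kinematics(*seeds)
        cols.append((x.dot, y.dot))
    J = [[cols[0][0], cols[1][0]], [cols[0][1], cols[1][1]]]
    return (x.val, y.val), J


def solve_ik(target, q=(0.5, 1.0), damping=1e-3, step=0.5, iters=100):
    """Damped least squares: solve (J^T J + damping I) dq = J^T e each iteration."""
    q = list(q)
    for _ in range(iters):
        (x, y), J = fk_and_jacobian(q)
        ex, ey = target[0] - x, target[1] - y
        a = J[0][0] ** 2 + J[1][0] ** 2 + damping
        b = J[0][0] * J[0][1] + J[1][0] * J[1][1]
        d = J[0][1] ** 2 + J[1][1] ** 2 + damping
        g0 = J[0][0] * ex + J[1][0] * ey
        g1 = J[0][1] * ex + J[1][1] * ey
        det = a * d - b * b
        q[0] += step * (d * g0 - b * g1) / det
        q[1] += step * (a * g1 - b * g0) / det
    return q


target, _ = fk_and_jacobian([0.8, 1.2])   # a pose known to be reachable
print("joint angles:", solve_ik(target))
```

For an arm with many joints and a low-dimensional task space, reverse mode (one pass per task coordinate) would usually be the cheaper way to build the same Jacobian.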

State Estimation

Robotic systems estimate hidden states from sensor measurements.

A state estimator may combine:

  • inertial sensors,
  • cameras,
  • lidar,
  • wheel encoders,
  • GPS,
  • force sensors.

An optimization-based estimator minimizes residuals:

$$ L(x) = \sum_i \|r_i(x)\|^2. $$

AD computes Jacobians needed for Gauss-Newton or Levenberg-Marquardt optimization.

This is especially useful in SLAM and visual-inertial odometry, where residual structures are large and sparse.
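A Gauss-Newton iteration on such residuals can be sketched for a toy problem: estimating a 2D position from range measurements to known beacons. The beacon layout and function names are assumptions for this example, and the residual Jacobian is written by hand here, where an AD tool would normally supply it.

```python
import math


def residuals_and_jacobian(p, beacons, ranges):
    """Residual r_i = ||p - b_i|| - d_i and its Jacobian row (unit direction)."""
    rs, J = [], []
    for (bx, by), d in zip(beacons, ranges):
        dx, dy = p[0] - bx, p[1] - by
        dist = math.hypot(dx, dy)
        rs.append(dist - d)
        J.append((dx / dist, dy / dist))
    return rs, J


def gauss_newton(p0, beacons, ranges, iters=20):
    """Repeatedly solve the 2x2 normal equations (J^T J) dp = -J^T r."""
    p = list(p0)
    for _ in range(iters):
        rs, J = residuals_and_jacobian(p, beacons, ranges)
        a = sum(jx * jx for jx, _ in J)
        b = sum(jx * jy for jx, jy in J)
        d = sum(jy * jy for _, jy in J)
        g0 = sum(jx * r for (jx, _), r in zip(J, rs))
        g1 = sum(jy * r for (_, jy), r in zip(J, rs))
        det = a * d - b * b
        p[0] -= (d * g0 - b * g1) / det
        p[1] -= (a * g1 - b * g0) / det
    return p


beacons = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
true_position = (1.0, 1.0)
ranges = [math.hypot(true_position[0] - bx, true_position[1] - by)
          for bx, by in beacons]
print("estimate:", gauss_newton((2.0, 2.0), beacons, ranges))
```

In SLAM-scale problems the normal equations are large and sparse, so the dense 2x2 solve above would be replaced by a sparse factorization.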

Differentiable Perception

Modern robotic pipelines often integrate learned perception systems.

Example:

camera image
    -> neural perception model
    -> object pose estimate
    -> planner
    -> controller
    -> robot action

If the pipeline is differentiable end-to-end, gradients can flow from task objectives back into perception modules.

This enables:

  • task-aware perception,
  • differentiable sensor calibration,
  • policy gradients through perception,
  • learned observation models.

System Identification

System identification estimates physical parameters from observed trajectories.

Suppose a simulator predicts

$$ x_t(\theta). $$

Observed trajectories are

$$ \hat x_t. $$

The objective is

$$ L(\theta) = \sum_t \|x_t(\theta)-\hat x_t\|^2. $$

AD computes

$$ \nabla_\theta L. $$

This allows fitting:

  • masses,
  • friction coefficients,
  • motor constants,
  • damping,
  • actuator delays,
  • aerodynamic parameters.

Differentiable simulation has become a major tool for simulator calibration.
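As a small sketch of this fitting loop, the example below recovers a damping coefficient from a synthetic velocity trace. It assumes the toy model $v_{t+1}=(1-c\,\Delta t)\,v_t$, noise-free observations, and a hand-written forward sensitivity $s_t=\partial v_t/\partial c$ in place of an AD call; the names and step sizes are illustrative.

```python
def simulate(c, v0=1.0, dt=0.05, steps=40):
    """Velocity decay under linear damping: v_{t+1} = (1 - c*dt) * v_t."""
    vs = [v0]
    for _ in range(steps):
        vs.append(vs[-1] * (1.0 - c * dt))
    return vs


def loss_and_grad(c, observed, v0=1.0, dt=0.05):
    """Squared-error loss and its derivative w.r.t. c via forward sensitivities."""
    v, s = v0, 0.0                      # s tracks dv/dc along the rollout
    loss, grad = 0.0, 0.0
    for target in observed[1:]:
        s = s * (1.0 - c * dt) - dt * v  # differentiate the update (uses old v)
        v = v * (1.0 - c * dt)
        err = v - target
        loss += err * err
        grad += 2.0 * err * s
    return loss, grad


def fit_damping(observed, c0=0.2, lr=0.01, iters=500):
    """Plain gradient descent on the damping coefficient."""
    c = c0
    for _ in range(iters):
        _, grad = loss_and_grad(c, observed)
        c -= lr * grad
    return c


observed = simulate(0.8)                # synthetic "measurements", true c = 0.8
print("estimated damping:", fit_damping(observed))
```

With several parameters and a scalar loss, reverse mode would typically replace the forward sensitivity, since one backward pass then covers all parameters.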

Reinforcement Learning and Control

Many reinforcement learning systems are differentiable control systems.

A policy

$$ u_t=\pi_\theta(x_t) $$

interacts with dynamics:

$$ x_{t+1}=f(x_t,u_t). $$

The objective is expected return:

$$ J(\theta) = \mathbb{E} \left[ \sum_t r(x_t,u_t) \right]. $$

If the environment is differentiable, gradients can propagate directly through the dynamics. This often gives lower-variance updates than score-function estimators.

However, real environments contain discontinuities, stochasticity, and unmodeled effects. Pure differentiable control is therefore usually combined with robust or stochastic methods.
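For the deterministic case, the pathwise gradient can be written out explicitly. Assuming differentiable $f$, $\pi_\theta$, and $r$, and writing $s_t = \mathrm{d}x_t/\mathrm{d}\theta$ (a notation introduced here for illustration) with $s_0=0$, the chain rule gives

$$ s_{t+1} = \left(\frac{\partial f}{\partial x} + \frac{\partial f}{\partial u}\frac{\partial \pi_\theta}{\partial x}\right)s_t + \frac{\partial f}{\partial u}\frac{\partial \pi_\theta}{\partial \theta}, $$

$$ \frac{\mathrm{d}J}{\mathrm{d}\theta} = \sum_t \left[\frac{\partial r}{\partial x}\,s_t + \frac{\partial r}{\partial u}\left(\frac{\partial \pi_\theta}{\partial x}\,s_t + \frac{\partial \pi_\theta}{\partial \theta}\right)\right], $$

with all partial derivatives evaluated at $(x_t,u_t)$. The repeated product of closed-loop Jacobians in the first recursion is also the source of exploding or vanishing gradients over long horizons.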

Sparse Structure

Robotic systems have structured Jacobians.

Examples:

| Structure | Consequence |
| --- | --- |
| Kinematic trees | Block-sparse derivatives |
| Local contacts | Sparse coupling |
| Sequential dynamics | Banded time structure |
| Factor graphs | Sparse estimation systems |

Efficient robotics AD systems exploit this sparsity rather than forming dense Jacobians.

Real-Time Constraints

Control systems often run under strict timing constraints.

| Application | Typical timing |
| --- | --- |
| Motor control | microseconds to milliseconds |
| MPC | milliseconds |
| Flight control | sub-millisecond stability loops |
| SLAM updates | real-time sensor rates |

Generic AD frameworks may be too slow or memory-heavy.

Production systems often use:

  • custom derivative kernels,
  • ahead-of-time code generation,
  • symbolic simplification,
  • sparse linear algebra,
  • static computational graphs.

The derivative system must fit real-time constraints.

Numerical Stability

Robotics gradients can become unstable because of:

| Cause | Effect |
| --- | --- |
| Chaotic contact sequences | Sensitive trajectories |
| Long horizons | Exploding or vanishing gradients |
| Poor scaling | Ill-conditioned optimization |
| Hard constraints | Non-smooth derivatives |
| Integrator error | Gradient mismatch |
| Solver tolerances | Noisy adjoints |

Good differentiable robotics systems carefully define solver semantics and smoothing behavior.

Differentiable Robot Design

Robot morphology itself can become an optimization variable.

Parameters may include:

  • link lengths,
  • masses,
  • actuator placement,
  • joint limits,
  • sensor locations.

A differentiable simulator computes how morphology affects task performance.

Optimization becomes:

$$ \min_\theta L(\theta), $$

where $\theta$ now defines robot structure rather than controller parameters.

This creates co-design systems where body and controller are optimized jointly.

Practical Architecture

A robust differentiable robotics stack typically separates:

| Layer | Responsibility |
| --- | --- |
| Geometry | Kinematics and transforms |
| Dynamics | Equations of motion |
| Contact | Collision and friction |
| Integration | Time stepping |
| Estimation | Sensor fusion and optimization |
| Planning | Trajectory optimization |
| Control | Policy or feedback law |
| Learning | Gradient-based adaptation |

Each layer should expose well-defined derivative rules.

Failure Modes

Differentiable robotics systems fail in characteristic ways.

| Failure mode | Cause |
| --- | --- |
| Exploding trajectory gradients | Long unstable horizons |
| Zero contact gradients | Hard collision thresholds |
| Simulator mismatch | Real-world physics differs |
| Memory explosion | Reverse mode through long trajectories |
| Unstable optimization | Ill-conditioned dynamics |
| Nonphysical learned behavior | Weak constraints |
| Timing failure | AD overhead violates real-time limits |

Many practical systems intentionally smooth dynamics or truncate gradients to maintain optimization stability.

Summary

Robotics and control systems lend themselves to differentiation because they evolve through structured dynamical equations, although contact and other discrete events break smoothness. Automatic differentiation provides gradients for trajectory optimization, control, estimation, simulation, and learning.

The main challenges are contact discontinuities, long-horizon stability, sparse structure, real-time execution, and differentiating through numerical solvers. Effective systems combine AD with optimal control theory, sparse numerical methods, custom solver derivatives, and carefully designed simulation semantics.