Chapter 57. Singular Value Decomposition
Chapter 57. Singular Value Decomposition
The singular value decomposition, usually called the SVD, is one of the central factorizations in linear algebra. It applies to every matrix, including rectangular and rank-deficient matrices. Unlike eigenvalue decomposition, it does not require the matrix to be square or diagonalizable.
The SVD expresses a matrix as
$$ A=U\Sigma V^*, $$
where:
| Factor | Property |
|---|---|
| (U) | Unitary or orthogonal |
| (\Sigma) | Diagonal with nonnegative entries |
| (V) | Unitary or orthogonal |
The diagonal entries of (\Sigma) are called the singular values of (A).
The SVD explains the geometry of a matrix completely. It shows that every linear transformation can be decomposed into:
- a rotation or unitary transformation,
- a coordinate scaling,
- another rotation or unitary transformation.
This factorization is fundamental in numerical linear algebra, data analysis, signal processing, statistics, optimization, machine learning, and inverse problems. Singular value decomposition factorizes any matrix into orthogonal or unitary factors together with a diagonal scaling matrix of singular values.
57.1 Definition of the SVD
Let
$$ A\in\mathbb{C}^{m\times n}. $$
A singular value decomposition of (A) is a factorization
$$ A=U\Sigma V^*, $$
where:
- (U\in\mathbb{C}^{m\times m}) is unitary,
- (V\in\mathbb{C}^{n\times n}) is unitary,
- (\Sigma\in\mathbb{R}^{m\times n}) is diagonal in the rectangular sense.
The diagonal entries satisfy
$$ \sigma_1\ge\sigma_2\ge\cdots\ge\sigma_r>0, $$
and all remaining diagonal entries are zero.
The numbers
$$ \sigma_1,\ldots,\sigma_r $$
are the singular values of (A).
The number (r) equals the rank of (A).
In the real case, the matrices (U) and (V) are orthogonal, and the conjugate transpose becomes an ordinary transpose:
$$ A=U\Sigma V^T. $$
57.2 Rectangular Diagonal Matrices
The matrix (\Sigma) has the form
$$ \Sigma= \begin{bmatrix} \sigma_1 & 0 & \cdots & 0\ 0 & \sigma_2 & \cdots & 0\ \vdots & \vdots & \ddots & \vdots\ 0 & 0 & \cdots & \sigma_r\ 0 & 0 & \cdots & 0 \end{bmatrix}. $$
If (m>n), then (\Sigma) has extra zero rows. If (n>m), then it has extra zero columns.
For example,
$$ \Sigma= \begin{bmatrix} 5&0\ 0&2\ 0&0 \end{bmatrix} $$
is a (3\times 2) diagonal matrix in the rectangular sense.
The singular values are always nonnegative.
57.3 Singular Values from Eigenvalues
The singular values of (A) are the square roots of the eigenvalues of
$$ A^*A. $$
Since
$$ A^*A $$
is Hermitian positive semidefinite, all its eigenvalues are real and nonnegative.
Suppose
$$ A^*Av=\lambda v. $$
Then
$$ v^*A^*Av = \lambda v^*v. $$
But
$$ v^*A^Av = (Av)^(Av) = |Av|_2^2\ge 0. $$
Thus
$$ \lambda\ge 0. $$
The singular values are defined by
$$ \sigma_i=\sqrt{\lambda_i}. $$
Therefore:
| Matrix | Spectrum |
|---|---|
| (A^*A) | Eigenvalues (\lambda_i\ge 0) |
| (A) | Singular values (\sigma_i=\sqrt{\lambda_i}) |
57.4 Right Singular Vectors
The eigenvectors of
$$ A^*A $$
are called the right singular vectors of (A).
Suppose
$$ A^*Av_i=\sigma_i^2v_i. $$
Choose orthonormal eigenvectors
$$ v_1,\ldots,v_n. $$
Collect them into the matrix
$$ V= \begin{bmatrix} |&|&&|\ v_1&v_2&\cdots&v_n\ |&|&&| \end{bmatrix}. $$
Then
$$ V^*V=I. $$
Thus (V) is unitary.
The matrix (V) gives the input coordinate directions of the transformation.
57.5 Left Singular Vectors
For every nonzero singular value (\sigma_i), define
$$ u_i=\frac{Av_i}{\sigma_i}. $$
Then
$$ Au_i? $$
No. The correct relation is
$$ Av_i=\sigma_i u_i. $$
The vectors (u_i) are called left singular vectors.
They satisfy
$$ A^*u_i=\sigma_i v_i. $$
Indeed,
$$ A^u_i = A^\frac{Av_i}{\sigma_i} = \frac{A^*Av_i}{\sigma_i} = \frac{\sigma_i^2v_i}{\sigma_i} = \sigma_i v_i. $$
The vectors (u_i) are orthonormal:
$$ u_i^u_j = \frac{1}{\sigma_i\sigma_j} (Av_i)^(Av_j). $$
Since
$$ (Av_i)^*(Av_j) = v_i^*A^*Av_j = \sigma_j^2v_i^*v_j, $$
we obtain
$$ u_i^*u_j=\delta_{ij}. $$
57.6 Constructing the SVD
Let
$$ A^A=V\Lambda V^ $$
be a unitary diagonalization.
The diagonal matrix (\Lambda) contains the eigenvalues
$$ \lambda_i=\sigma_i^2. $$
Define
$$ \Sigma= \operatorname{diag}(\sigma_1,\ldots,\sigma_r). $$
For each nonzero singular value, define
$$ u_i=\frac{Av_i}{\sigma_i}. $$
Extend the orthonormal set
$$ u_1,\ldots,u_r $$
to an orthonormal basis of (\mathbb{C}^m), and let
$$ U= \begin{bmatrix} |&|&&|\ u_1&u_2&\cdots&u_m\ |&|&&| \end{bmatrix}. $$
Then
$$ A=U\Sigma V^*. $$
Thus every matrix has an SVD.
57.7 Geometric Interpretation
The SVD describes the action of a matrix geometrically.
The transformation
$$ x\mapsto Ax $$
can be decomposed into three steps:
- apply (V^*),
- scale coordinates by (\Sigma),
- apply (U).
The matrices (U) and (V^*) are unitary or orthogonal, so they preserve length and angle.
The matrix (\Sigma) stretches space independently along orthogonal coordinate directions.
Thus the singular values measure how strongly the matrix stretches different directions.
57.8 Unit Sphere Interpretation
Consider the unit sphere
$$ {x:|x|_2=1}. $$
Under the transformation (x\mapsto Ax), the sphere becomes an ellipsoid.
The principal axes of this ellipsoid are:
| Quantity | Meaning |
|---|---|
| (v_i) | Input direction |
| (u_i) | Output direction |
| (\sigma_i) | Stretch factor |
Specifically,
$$ Av_i=\sigma_i u_i. $$
Thus the matrix stretches the direction (v_i) by the factor (\sigma_i), then rotates it into the direction (u_i).
This geometric interpretation is one of the most important meanings of the SVD.
57.9 Rank and Singular Values
The rank of (A) equals the number of nonzero singular values.
Indeed,
$$ Av_i=\sigma_i u_i. $$
If
$$ \sigma_i=0, $$
then
$$ Av_i=0, $$
so (v_i) lies in the null space of (A).
Thus:
| Quantity | Equals |
|---|---|
| Rank of (A) | Number of positive singular values |
| Nullity of (A) | Number of zero singular values |
The SVD therefore gives complete rank information.
57.10 Compact SVD
If (A) has rank (r), one may keep only the nonzero singular values.
Define:
$$ U_r= \begin{bmatrix} u_1&\cdots&u_r \end{bmatrix}, $$
$$ V_r= \begin{bmatrix} v_1&\cdots&v_r \end{bmatrix}, $$
and
$$ \Sigma_r= \operatorname{diag}(\sigma_1,\ldots,\sigma_r). $$
Then
$$ A=U_r\Sigma_rV_r^*. $$
This is the compact SVD or thin SVD.
It removes directions associated with zero singular values.
57.11 Outer Product Expansion
The SVD can be written as
$$ A = \sum_{i=1}^r \sigma_i u_i v_i^*. $$
Each term
$$ u_i v_i^* $$
is a rank-one matrix.
Thus every matrix is a sum of rank-one matrices weighted by singular values.
This decomposition is fundamental in low-rank approximation and data compression.
57.12 Best Rank-(k) Approximation
One of the most important theorems about the SVD is the Eckart-Young theorem.
Suppose
$$ A = \sum_{i=1}^r \sigma_i u_i v_i^*. $$
Define
$$ A_k = \sum_{i=1}^k \sigma_i u_i v_i^*. $$
Then (A_k) is the best rank-(k) approximation to (A) in both the spectral norm and Frobenius norm.
The approximation error is
$$ |A-A_k|2=\sigma{k+1}. $$
Thus truncating the SVD gives the optimal low-rank approximation.
This result is central in compression, PCA, latent semantic analysis, and recommender systems.
57.13 Example
Consider
$$ A= \begin{bmatrix} 3&0\ 0&1 \end{bmatrix}. $$
Then
$$ A^TA= \begin{bmatrix} 9&0\ 0&1 \end{bmatrix}. $$
The eigenvalues are
$$ 9 \quad \text{and} \quad 1. $$
Thus the singular values are
$$ \sigma_1=3, \qquad \sigma_2=1. $$
The standard basis vectors are eigenvectors, so
$$ V=I, \qquad U=I. $$
Hence
$$ A=I \begin{bmatrix} 3&0\ 0&1 \end{bmatrix} I. $$
This matrix stretches the (x)-direction by (3) and the (y)-direction by (1).
57.14 Pseudoinverse
The SVD gives the Moore-Penrose pseudoinverse.
Suppose
$$ A=U\Sigma V^*. $$
Define
$$ \Sigma^+ $$
by replacing each nonzero singular value (\sigma_i) by
$$ \frac{1}{\sigma_i} $$
and transposing the rectangular structure.
Then
$$ A^+=V\Sigma^+U^*. $$
The pseudoinverse gives:
| Problem | Solution |
|---|---|
| Least squares | (x=A^+b) |
| Minimum norm solution | (x=A^+b) |
| Projection | (AA^+) |
The pseudoinverse exists for every matrix.
57.15 Least Squares and the SVD
Suppose
$$ A=U\Sigma V^*. $$
The least squares problem
$$ \min_x|Ax-b|_2 $$
becomes
$$ \min_x|\Sigma V^*x-U^*b|_2. $$
Since (U) and (V) are unitary, they preserve norms.
The problem reduces to a diagonal system involving singular values.
If all singular values are positive, the least squares solution is
$$ x = V\Sigma^{-1}U^*b. $$
This equals
$$ A^+b. $$
The SVD therefore solves least squares problems robustly, even for ill-conditioned or rank-deficient matrices.
57.16 Condition Number
The condition number of a full-rank matrix in the Euclidean norm is
$$ \kappa_2(A) = \frac{\sigma_1}{\sigma_r}, $$
where:
- (\sigma_1) is the largest singular value,
- (\sigma_r) is the smallest positive singular value.
A large condition number indicates near singularity and numerical instability.
The singular values therefore measure how close a matrix is to losing rank.
57.17 Spectral Norm
The operator (2)-norm of a matrix is
$$ |A|2 = \max{|x|_2=1}|Ax|_2. $$
This equals the largest singular value:
$$ |A|_2=\sigma_1. $$
Thus the largest singular value measures the maximum stretching factor of the transformation.
57.18 Frobenius Norm
The Frobenius norm is
$$ |A|F = \sqrt{ \sum{i,j}|a_{ij}|^2 }. $$
Using the SVD,
$$ |A|F^2 = \sum{i=1}^r \sigma_i^2. $$
Thus the singular values completely determine the Frobenius norm.
57.19 Principal Component Analysis
Principal component analysis, usually called PCA, is closely related to the SVD.
Suppose a data matrix (X) contains centered observations.
The covariance matrix is
$$ X^TX. $$
The eigenvectors of (X^TX) are principal directions.
If
$$ X=U\Sigma V^T, $$
then:
| Quantity | PCA interpretation |
|---|---|
| (v_i) | Principal direction |
| (\sigma_i^2) | Variance magnitude |
| (u_i\sigma_i) | Principal component coordinates |
Thus PCA is essentially the SVD of the centered data matrix.
57.20 Numerical Importance
The SVD is one of the most stable and informative matrix factorizations.
It is used for:
| Application | Purpose |
|---|---|
| Least squares | Stable solution |
| Rank estimation | Detect numerical rank |
| Compression | Low-rank approximation |
| PCA | Dimension reduction |
| Signal processing | Noise filtering |
| Machine learning | Feature extraction |
| Inverse problems | Regularization |
| Numerical analysis | Conditioning analysis |
Unlike many decompositions, the SVD exists for every matrix and reveals complete geometric information about the transformation.
57.21 Summary
Every matrix admits a singular value decomposition:
$$ A=U\Sigma V^*. $$
The matrices (U) and (V) are unitary or orthogonal. The matrix (\Sigma) contains the nonnegative singular values.
The singular values are the square roots of the eigenvalues of
$$ A^*A. $$
The SVD describes the action of a matrix geometrically as:
- a unitary transformation,
- a coordinate scaling,
- another unitary transformation.
It reveals rank, conditioning, principal directions, and optimal low-rank approximations. It is one of the most powerful tools in linear algebra and numerical computation.