Chapter 38. Projection Operators

A projection operator is a linear operator that leaves its own output unchanged. If (V) is a vector space, a projection on (V) is a linear map

$$ P:V\to V $$

such that

$$ P^2=P. $$

Equivalently,

$$ P(P(v))=P(v) $$

for every (v\in V). Applying the projection once moves a vector into the target subspace. Applying it again has no further effect. This property is called idempotence. Projection operators are therefore exactly the idempotent linear operators.

38.1 First Example

Define

$$ P:\mathbb{R}^3\to\mathbb{R}^3 $$

by

$$ P \begin{bmatrix} x\\ y\\ z \end{bmatrix} = \begin{bmatrix} x\\ y\\ 0 \end{bmatrix}. $$

This map sends every vector to its shadow on the (xy)-plane.

Apply (P) twice:

$$ P^2 \begin{bmatrix} x\\ y\\ z \end{bmatrix} = P \begin{bmatrix} x\\ y\\ 0 \end{bmatrix} = \begin{bmatrix} x\\ y\\ 0 \end{bmatrix}. $$

Thus

$$ P^2=P. $$

So (P) is a projection operator.

The image is the (xy)-plane:

$$ \operatorname{im}(P)= \left\{ \begin{bmatrix} x\\ y\\ 0 \end{bmatrix} :x,y\in\mathbb{R} \right\}. $$

The kernel is the (z)-axis:

$$ \ker(P)= \left\{ \begin{bmatrix} 0\\ 0\\ z \end{bmatrix} :z\in\mathbb{R} \right\}. $$

The projection keeps the component of a vector that lies in the image and removes the component that lies in the kernel.
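The example can be checked numerically. The following is a minimal sketch in plain Python; the function name `project_xy` and the sample vectors are illustrative assumptions, not part of the text.

```python
# Projection of R^3 onto the xy-plane, written as a function on 3-tuples.
# The name project_xy is a hypothetical helper chosen for this sketch.
def project_xy(v):
    x, y, z = v
    return (x, y, 0)

v = (3, -2, 7)
once = project_xy(v)       # first application lands in the xy-plane
twice = project_xy(once)   # second application changes nothing

print(once)   # (3, -2, 0)
print(twice)  # (3, -2, 0)

# A vector on the z-axis lies in the kernel and is sent to zero.
kernel_image = project_xy((0, 0, 5))
print(kernel_image)  # (0, 0, 0)
```

Comparing `once` and `twice` is a direct test of the identity P(P(v)) = P(v) for one sample vector; it illustrates idempotence but does not prove it.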

38.2 Idempotence

The defining equation

$$ P^2=P $$

means

$$ P(P(v))=P(v) $$

for every vector (v\in V).

This has a direct interpretation. The first application of (P) sends (v) into (\operatorname{im}(P)). Once a vector is in (\operatorname{im}(P)), the projection leaves it fixed.

Indeed, if (w\in\operatorname{im}(P)), then there is some (v\in V) such that

$$ w=P(v). $$

Then

$$ P(w)=P(P(v))=P^2(v)=P(v)=w. $$

Thus every vector in the image is fixed by (P).

Conversely, if

$$ P(w)=w, $$

then (w\in\operatorname{im}(P)), since (w) is the image of itself under (P). Hence

$$ \operatorname{im}(P)=\{w\in V:P(w)=w\}. $$

The image of a projection is exactly its fixed subspace.

38.3 Kernel and Image

For every projection (P:V\to V), the kernel and image describe the whole operator.

The kernel is

$$ \ker(P)=\{v\in V:P(v)=0\}. $$

The image is

$$ \operatorname{im}(P)=\{P(v):v\in V\}. $$

Every vector (v\in V) can be decomposed as

$$ v=P(v)+(v-P(v)). $$

The first term satisfies

$$ P(v)\in\operatorname{im}(P). $$

The second term lies in the kernel, because

$$ P(v-P(v))=P(v)-P^2(v)=P(v)-P(v)=0. $$

Therefore

$$ v=P(v)+(v-P(v)) $$

is a decomposition of (v) into a part in the image and a part in the kernel.

Moreover, this decomposition is unique, because the image and the kernel meet only in the zero vector. Indeed, if

$$ u\in\operatorname{im}(P)\cap\ker(P), $$

then (u\in\operatorname{im}(P)) implies

$$ P(u)=u. $$

But (u\in\ker(P)) implies

$$ P(u)=0. $$

Thus

$$ u=0, $$

and therefore

$$ \operatorname{im}(P)\cap\ker(P)=\{0\}. $$

Hence

$$ V=\operatorname{im}(P)\oplus\ker(P). $$

Every projection gives a direct sum decomposition of the vector space.
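The decomposition v = P(v) + (v - P(v)) can be traced on a concrete matrix. This is a small sketch in plain Python; the matrix `P` below is a sample idempotent matrix chosen for illustration (an assumption, it does not appear in the text), and `mat_vec` is a hypothetical helper.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Sample idempotent matrix, assumed for illustration only.
# Its image is spanned by (2, 1) and its kernel by (1, 1).
P = [[2, -2],
     [1, -1]]

v = [5, 3]
image_part = mat_vec(P, v)                             # P(v)
kernel_part = [a - b for a, b in zip(v, image_part)]   # v - P(v)

print(image_part)                # [4, 2]
print(kernel_part)               # [1, 1]
print(mat_vec(P, image_part))    # [4, 2]: P fixes the image part
print(mat_vec(P, kernel_part))   # [0, 0]: P sends the kernel part to zero
```

The two parts add back to `v`, the first is fixed by `P`, and the second is annihilated by `P`, which is the direct sum decomposition for this one vector.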

38.4 Projection Onto a Subspace Along a Complement

Let (V) be a vector space, and suppose

$$ V=U\oplus W. $$

This means every vector (v\in V) has a unique decomposition

$$ v=u+w, $$

where

$$ u\in U, \qquad w\in W. $$

Define

$$ P:V\to V $$

by

$$ P(v)=u. $$

That is, (P) keeps the (U)-component and discards the (W)-component.

Then (P) is linear. If

$$ v_1=u_1+w_1, \qquad v_2=u_2+w_2, $$

then

$$ v_1+v_2=(u_1+u_2)+(w_1+w_2), $$

with (u_1+u_2\in U) and (w_1+w_2\in W), so

$$ P(v_1+v_2)=u_1+u_2=P(v_1)+P(v_2). $$

For a scalar (c),

$$ cv=cu+cw, $$

so

$$ P(cv)=cu=cP(v). $$

Also,

$$ P^2(v)=P(u)=u=P(v). $$

Thus (P) is a projection.

Its image is (U), and its kernel is (W):

$$ \operatorname{im}(P)=U, \qquad \ker(P)=W. $$

So a projection is the same as a choice of direct sum decomposition.
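For two lines in the plane, the construction can be written out directly. The sketch below assumes (V=\mathbb{R}^2), (U=\operatorname{span}\{u\}), and (W=\operatorname{span}\{w\}) with (u) and (w) linearly independent. The function name `project_along` and the sample vectors are hypothetical, and exact arithmetic comes from the standard `fractions` module.

```python
from fractions import Fraction

def project_along(u, w, v):
    """Return the U-component of v, where U = span(u) and W = span(w) in R^2.

    Solves v = a*u + b*w by Cramer's rule and keeps only a*u.
    """
    det = u[0] * w[1] - u[1] * w[0]
    if det == 0:
        raise ValueError("u and w must be linearly independent")
    a = Fraction(v[0] * w[1] - v[1] * w[0], det)
    return [a * u[0], a * u[1]]

# Sample data, assumed for illustration: v = 1*u + 2*w.
u, w = [2, 1], [1, 3]
v = [4, 7]

p = project_along(u, w, v)
print(p == [2, 1])                         # True: the U-component is u itself
print(project_along(u, w, p) == p)         # True: projecting again changes nothing
print(project_along(u, w, w) == [0, 0])    # True: W is sent to zero
```

Only the coefficient of (u) is computed, because the (W)-component is exactly the part the projection discards.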

38.5 The Complementary Projection

If (P) is a projection on (V), then

$$ I-P $$

is also a projection.

Compute:

$$ (I-P)^2=I-2P+P^2. $$

Since

$$ P^2=P, $$

we get

$$ (I-P)^2=I-2P+P=I-P. $$

Thus (I-P) is idempotent.

The projection (I-P) keeps the part that (P) removes. For any vector (v),

$$ v=P(v)+(I-P)(v). $$

The image of (I-P) is the kernel of (P):

$$ \operatorname{im}(I-P)=\ker(P). $$

The kernel of (I-P) is the image of (P):

$$ \ker(I-P)=\operatorname{im}(P). $$

Thus (P) and (I-P) are complementary projections.
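A short numerical check of these identities, as a sketch in plain Python. The matrix `P` is a sample idempotent matrix assumed for illustration, and `mat_mul` is a hypothetical helper.

```python
def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Sample idempotent matrix, assumed for illustration only.
P = [[2, -2],
     [1, -1]]
I = [[1, 0],
     [0, 1]]

# Complementary projection C = I - P.
C = [[I[i][j] - P[i][j] for j in range(2)] for i in range(2)]

print(C)              # [[-1, 2], [-1, 2]]
print(mat_mul(C, C))  # [[-1, 2], [-1, 2]]: C is idempotent
print(mat_mul(P, C))  # [[0, 0], [0, 0]]: P kills everything C keeps
print(mat_mul(C, P))  # [[0, 0], [0, 0]]: C kills everything P keeps
```

The two zero products reflect the statements above: the image of (I-P) lies in the kernel of (P), and the image of (P) lies in the kernel of (I-P).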

38.6 Matrix Projections

A square matrix (P) is called a projection matrix if

$$ P^2=P. $$

Such a matrix defines a projection operator

$$ x\mapsto Px. $$

For example,

$$ P= \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} $$

satisfies

$$ P^2= \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0\\ 0 & 0 \end{bmatrix} =P. $$

It projects (\mathbb{R}^2) onto the (x)-axis:

$$ P \begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} x\\ 0 \end{bmatrix}. $$

A projection matrix must be square because it represents a linear operator from a vector space to itself.

38.7 Orthogonal Projections

In an inner product space, an important special case is the orthogonal projection.

An orthogonal projection onto a subspace (U) writes each vector (v) as

$$ v=u+w, $$

where

$$ u\in U $$

and

$$ w\in U^\perp. $$

Then

$$ P(v)=u. $$

The vector (u) is the closest vector in (U) to (v). The difference

$$ v-P(v) $$

is perpendicular to (U).

For example, the projection from (\mathbb{R}^3) onto the (xy)-plane is orthogonal because the removed part lies on the (z)-axis, which is perpendicular to the plane.

In real coordinate spaces, an orthogonal projection matrix satisfies

$$ P^2=P $$

and

$$ P^T=P. $$

Thus it is both idempotent and symmetric. For complex spaces, symmetry is replaced by self-adjointness:

$$ P^*=P. $$

Orthogonal projection matrices are therefore characterized by

$$ P^2=P=P^T $$

in the real case, and

$$ P^2=P=P^* $$

in the complex case.

38.8 Projection Onto a Line

Let (u\in\mathbb{R}^n) be a nonzero vector. The orthogonal projection of (x\in\mathbb{R}^n) onto the line spanned by (u) is

$$ \operatorname{proj}_u(x)= \frac{x\cdot u}{u\cdot u}u. $$

The scalar

$$ \frac{x\cdot u}{u\cdot u} $$

is the coordinate of the projection along (u).

The corresponding matrix is

$$ P=\frac{uu^T}{u^Tu}. $$

Indeed,

$$ Px= \frac{uu^T}{u^Tu}x = u\frac{u^Tx}{u^Tu} = \frac{x\cdot u}{u\cdot u}u. $$

This matrix is symmetric and idempotent, so it is an orthogonal projection matrix.

38.9 Example: Projection Onto a Line in (\mathbb{R}^2)

Let

$$ u= \begin{bmatrix} 1\\ 2 \end{bmatrix}. $$

Then

$$ u^Tu=1^2+2^2=5. $$

Also,

$$ uu^T= \begin{bmatrix} 1\\ 2 \end{bmatrix} \begin{bmatrix} 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 2\\ 2 & 4 \end{bmatrix}. $$

Therefore the projection matrix onto the line spanned by (u) is

$$ P= \frac{1}{5} \begin{bmatrix} 1 & 2\\ 2 & 4 \end{bmatrix}. $$

For

$$ x= \begin{bmatrix} 3\\ 1 \end{bmatrix}, $$

we get

$$ Px= \frac{1}{5} \begin{bmatrix} 1 & 2\\ 2 & 4 \end{bmatrix} \begin{bmatrix} 3\\ 1 \end{bmatrix} = \frac{1}{5} \begin{bmatrix} 5\\ 10 \end{bmatrix} = \begin{bmatrix} 1\\ 2 \end{bmatrix}. $$

In this example the projection of (x) happens to be (u) itself. The error vector is

$$ x-Px= \begin{bmatrix} 3\\ 1 \end{bmatrix} - \begin{bmatrix} 1\\ 2 \end{bmatrix} = \begin{bmatrix} 2\\ -1 \end{bmatrix}. $$

Check orthogonality:

$$ \begin{bmatrix} 2\\ -1 \end{bmatrix} \cdot \begin{bmatrix} 1\\ 2 \end{bmatrix} = 2-2=0. $$

The error is perpendicular to the line.
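The same computation can be reproduced in a few lines. This is a sketch in plain Python using exact fractions; the helper names `line_projection_matrix` and `mat_vec` are illustrative assumptions.

```python
from fractions import Fraction

def line_projection_matrix(u):
    """Return the matrix u u^T / (u^T u) for a nonzero integer vector u."""
    uu = sum(a * a for a in u)
    if uu == 0:
        raise ValueError("u must be nonzero")
    return [[Fraction(a * b, uu) for b in u] for a in u]

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

u = [1, 2]
x = [3, 1]

P = line_projection_matrix(u)   # entries 1/5, 2/5, 2/5, 4/5
Px = mat_vec(P, x)
error = [a - b for a, b in zip(x, Px)]
dot = sum(e * a for e, a in zip(error, u))

print(Px == [1, 2])       # True
print(error == [2, -1])   # True
print(dot == 0)           # True: the error is perpendicular to u
```

Using `Fraction` keeps the entries of (P) exact, so the equality tests are not affected by floating-point rounding.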

38.10 Projection Onto a Subspace with an Orthonormal Basis

Let (U) be a subspace of (\mathbb{R}^n), and let

$$ q_1,\ldots,q_k $$

be an orthonormal basis of (U). Then the orthogonal projection of (x) onto (U) is

$$ P_Ux=(x\cdot q_1)q_1+\cdots+(x\cdot q_k)q_k. $$

If (Q) is the matrix with columns

$$ q_1,\ldots,q_k, $$

then

$$ Q^TQ=I_k. $$

The projection matrix is

$$ P=QQ^T. $$

Indeed,

$$ QQ^Tx = Q \begin{bmatrix} q_1^Tx\\ \vdots\\ q_k^Tx \end{bmatrix} = (q_1^Tx)q_1+\cdots+(q_k^Tx)q_k. $$

The matrix (QQ^T) is symmetric and idempotent:

$$ (QQ^T)^T=QQ^T, $$

and

$$ (QQ^T)^2=Q(Q^TQ)Q^T=QIQ^T=QQ^T. $$
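These identities can be checked on a small case. The sketch below assumes a two-dimensional subspace of (\mathbb{R}^3) with the orthonormal basis (q_1=(1,0,0)), (q_2=(0,3/5,4/5)), chosen only because its entries are rational. The helper names are hypothetical.

```python
from fractions import Fraction as F

def transpose(M):
    return [list(row) for row in zip(*M)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Orthonormal basis vectors, assumed for illustration.
q1 = [F(1), F(0), F(0)]
q2 = [F(0), F(3, 5), F(4, 5)]

Q = transpose([q1, q2])          # 3x2 matrix with columns q1, q2
QtQ = mat_mul(transpose(Q), Q)   # should be the 2x2 identity
P = mat_mul(Q, transpose(Q))     # 3x3 projection matrix Q Q^T

print(QtQ == [[1, 0], [0, 1]])   # True: the columns are orthonormal
print(P == transpose(P))         # True: P is symmetric
print(mat_mul(P, P) == P)        # True: P is idempotent
```

Note that (Q^TQ) is the small (k\times k) identity while (QQ^T) is an (n\times n) matrix that is generally not the identity.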

38.11 Projection Onto a Column Space

Let (A) be an (m\times k) matrix with linearly independent columns. The column space of (A) is a (k)-dimensional subspace of (\mathbb{R}^m).

The orthogonal projection onto (\operatorname{col}(A)) is

$$ P=A(A^TA)^{-1}A^T. $$

This formula generalizes the line projection formula. When (A) has one column (u), it becomes

$$ P=u(u^Tu)^{-1}u^T = \frac{uu^T}{u^Tu}. $$

The matrix (A^TA) is invertible because the columns of (A) are linearly independent.

The projection (Px) is the vector in (\operatorname{col}(A)) closest to (x), and the residual

$$ x-Px $$

is orthogonal to every column of (A). Projection formulas of this kind are central in least squares and regression.

38.12 Derivation of the Column Space Formula

We seek a vector in (\operatorname{col}(A)) closest to (x). Such a vector has the form

$$ A\hat c $$

for some coefficient vector (\hat c).

The residual is

$$ r=x-A\hat c. $$

For (A\hat c) to be the orthogonal projection, (r) must be orthogonal to every column of (A). This condition is

$$ A^T(x-A\hat c)=0. $$

Expanding gives

$$ A^Tx-A^TA\hat c=0, $$

so

$$ A^TA\hat c=A^Tx. $$

Since (A^TA) is invertible,

$$ \hat c=(A^TA)^{-1}A^Tx. $$

Therefore

$$ Px=A\hat c=A(A^TA)^{-1}A^Tx. $$

Thus

$$ P=A(A^TA)^{-1}A^T. $$
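The formula can be evaluated on a small matrix to confirm its properties. The following sketch assumes a (3\times 2) matrix (A) with columns ((1,1,1)) and ((0,1,2)) and a sample vector (x); both are illustrative choices, and the helpers (`transpose`, `mat_mul`, `inverse_2x2`) are hypothetical names.

```python
from fractions import Fraction

def transpose(M):
    return [list(row) for row in zip(*M)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse_2x2(M):
    """Exact inverse of a 2x2 integer matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

# Sample matrix with linearly independent columns (assumed for illustration).
A = [[1, 0],
     [1, 1],
     [1, 2]]
At = transpose(A)

# P = A (A^T A)^{-1} A^T
P = mat_mul(mat_mul(A, inverse_2x2(mat_mul(At, A))), At)

x = [[6], [0], [0]]   # column vector
Px = mat_mul(P, x)
residual = [[xi[0] - pi[0]] for xi, pi in zip(x, Px)]

print(mat_mul(P, P) == P)                      # True: idempotent
print(P == transpose(P))                       # True: symmetric
print(Px == [[5], [2], [-1]])                  # True
print(mat_mul(At, residual) == [[0], [0]])     # True: residual is orthogonal to col(A)
```

Forming (P) explicitly is convenient for checking identities on small examples. For larger problems, solving the normal equations for (\hat c) avoids building the full (m\times m) matrix.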

38.13 Oblique Projections

A projection need not be orthogonal.

Suppose

$$ V=U\oplus W. $$

The projection onto (U) along (W) sends

$$ u+w $$

to

$$ u. $$

If (W=U^\perp), the projection is orthogonal. If (W) is another complement, the projection is oblique.

An oblique projection still satisfies

$$ P^2=P. $$

It still has

$$ \operatorname{im}(P)=U, \qquad \ker(P)=W. $$

But the removed part (w) may not be perpendicular to (U). Oblique projections are therefore algebraically valid projections, but they do not describe nearest-point projection in the usual Euclidean metric.

38.14 Example of an Oblique Projection

In (\mathbb{R}^2), let

$$ U=\operatorname{span} \left\{ \begin{bmatrix} 1\\ 0 \end{bmatrix} \right\} $$

be the (x)-axis, and let

$$ W=\operatorname{span} \left\{ \begin{bmatrix} 1\\ 1 \end{bmatrix} \right\}. $$

Every vector

$$ \begin{bmatrix} x\\ y \end{bmatrix} $$

can be written uniquely as

$$ \begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} a\\ 0 \end{bmatrix} + t \begin{bmatrix} 1\\ 1 \end{bmatrix}. $$

From the second coordinate,

$$ t=y. $$

From the first coordinate,

$$ a+t=x, $$

so

$$ a=x-y. $$

Thus the projection onto (U) along (W) is

$$ P \begin{bmatrix} x\\ y \end{bmatrix} = \begin{bmatrix} x-y\\ 0 \end{bmatrix}. $$

Its matrix is

$$ P= \begin{bmatrix} 1 & -1\\ 0 & 0 \end{bmatrix}. $$

Check idempotence:

$$ P^2= \begin{bmatrix} 1 & -1\\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & -1\\ 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & -1\\ 0 & 0 \end{bmatrix} =P. $$

This projection is not orthogonal because

$$ P^T\neq P. $$

It projects onto the (x)-axis along diagonal lines parallel to ((1,1)), not along vertical lines.
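The failure of orthogonality can be seen on a single vector. This is a sketch in plain Python using the matrix from this example; the sample vector `v` and the helper `mat_vec` are assumptions for illustration.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Projection onto the x-axis along the line spanned by (1, 1).
P = [[1, -1],
     [0, 0]]

v = [3, 1]
Pv = mat_vec(P, v)                           # component on the x-axis
removed = [a - b for a, b in zip(v, Pv)]     # component along (1, 1)

print(Pv)                   # [2, 0]
print(removed)              # [1, 1]
print(mat_vec(P, Pv))       # [2, 0]: still idempotent

# The removed part is not perpendicular to the x-axis direction (1, 0).
dot_with_axis = removed[0] * 1 + removed[1] * 0
print(dot_with_axis)        # 1, not 0
```

For comparison, the orthogonal projection of ((3,1)) onto the (x)-axis is ((3,0)), so the oblique projection ((2,0)) is not the nearest point on the axis.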

38.15 Eigenvalues of a Projection

Let (P) be a projection. If (v) is an eigenvector with eigenvalue (\lambda), then

$$ P(v)=\lambda v. $$

Apply (P) again:

$$ P^2(v)=P(\lambda v)=\lambda P(v)=\lambda^2v. $$

But (P^2=P), so

$$ P^2(v)=P(v)=\lambda v. $$

Hence

$$ \lambda^2v=\lambda v. $$

Since (v\neq 0),

$$ \lambda^2=\lambda. $$

Thus

$$ \lambda(\lambda-1)=0. $$

Therefore every eigenvalue of a projection is either

$$ 0 $$

or

$$ 1. $$

Nonzero vectors in the kernel are eigenvectors with eigenvalue (0). Nonzero vectors in the image are eigenvectors with eigenvalue (1). Projection matrices have no other eigenvalues.
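For a (2\times 2) matrix, the eigenvalue claim can be checked through the characteristic polynomial (\lambda^2-\operatorname{tr}(P)\lambda+\det(P)). The sketch below uses a sample idempotent matrix assumed for illustration; the helper names are hypothetical.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Sample idempotent matrix, assumed for illustration only.
P = [[2, -2],
     [1, -1]]

trace = P[0][0] + P[1][1]
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]

def char_poly(lam):
    """Characteristic polynomial of a 2x2 matrix evaluated at lam."""
    return lam * lam - trace * lam + det

# Both candidate eigenvalues are roots, and a quadratic has at most two roots.
roots = [lam for lam in (0, 1) if char_poly(lam) == 0]
print(roots)  # [0, 1]

u = [2, 1]   # spans the image: P u = 1 * u
w = [1, 1]   # spans the kernel: P w = 0 * w
print(mat_vec(P, u))  # [2, 1]
print(mat_vec(P, w))  # [0, 0]
```

Here the characteristic polynomial is (\lambda^2-\lambda), which is exactly the polynomial (\lambda(\lambda-1)) from the argument above.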

38.16 Diagonal Form

Because the minimal polynomial of a projection divides

$$ x^2-x=x(x-1), $$

and this polynomial has distinct roots, every projection on a finite-dimensional vector space is diagonalizable.

More concretely, choose a basis

$$ (u_1,\ldots,u_r) $$

for (\operatorname{im}(P)), and choose a basis

$$ (w_1,\ldots,w_s) $$

for (\ker(P)).

Since

$$ V=\operatorname{im}(P)\oplus\ker(P), $$

the combined list

$$ (u_1,\ldots,u_r,w_1,\ldots,w_s) $$

is a basis of (V).

In this basis, (P) has matrix

$$ \begin{bmatrix} I_r & 0\\ 0 & 0 \end{bmatrix}. $$

Thus a projection is structurally simple: it is identity on one subspace and zero on a complementary subspace.
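The change of basis can be carried out explicitly in a small case. The sketch below assumes a sample idempotent matrix `P` whose image is spanned by ((2,1)) and whose kernel is spanned by ((1,1)); the matrix `B` holds these vectors as columns. All names and values are illustrative.

```python
def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Sample idempotent matrix, assumed for illustration only.
P = [[2, -2],
     [1, -1]]

# Columns: a basis vector of im(P) followed by a basis vector of ker(P).
B = [[2, 1],
     [1, 1]]

# det(B) = 1, so the inverse has integer entries.
B_inv = [[1, -1],
         [-1, 2]]

D = mat_mul(mat_mul(B_inv, P), B)   # matrix of P in the adapted basis

print(mat_mul(B_inv, B))  # [[1, 0], [0, 1]]
print(D)                  # [[1, 0], [0, 0]]
```

The result is the block form with (r=1): a single (1) for the image direction and a (0) for the kernel direction.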

38.17 Trace and Rank

For a finite-dimensional projection (P), the trace equals the rank.

In the basis adapted to

$$ V=\operatorname{im}(P)\oplus\ker(P), $$

the matrix of (P) is

$$ \begin{bmatrix} I_r & 0\\ 0 & 0 \end{bmatrix}, $$

where

$$ r=\dim(\operatorname{im}(P)). $$

The trace is the sum of diagonal entries:

$$ \operatorname{tr}(P)=r. $$

The rank is also

$$ \operatorname{rank}(P)=r. $$

Therefore

$$ \operatorname{tr}(P)=\operatorname{rank}(P). $$

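This fact is useful in matrix analysis, statistics, and numerical linear algebra. For example, the orthogonal projection onto a (k)-dimensional column space has trace (k), which is how the degrees of freedom of a linear regression fit are counted.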
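The identity can be tested by computing the rank independently of the trace. The sketch below uses Gaussian elimination over exact fractions; the function `rank` is a hypothetical helper, and the two sample matrices are idempotent matrices assumed for illustration.

```python
from fractions import Fraction as F

def rank(M):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    rows = [[F(x) for x in row] for row in M]
    r = 0  # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

# An orthogonal projection onto a plane in R^3 (sample values).
P1 = [[F(5, 6), F(1, 3), F(-1, 6)],
      [F(1, 3), F(1, 3), F(1, 3)],
      [F(-1, 6), F(1, 3), F(5, 6)]]

# An oblique projection onto a line in R^2 (sample values).
P2 = [[1, -1],
      [0, 0]]

print(trace(P1), rank(P1))  # 2 2
print(trace(P2), rank(P2))  # 1 1
```

The equality holds for oblique projections as well as orthogonal ones, since the argument uses only idempotence.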

38.18 Products of Projections

The product of two projections is not necessarily a projection.

Let (P) and (Q) be projections. Since (PQ) is always linear, it is a projection exactly when

$$ (PQ)^2=PQ. $$

Compute:

$$ (PQ)^2=PQPQ. $$

If (P) and (Q) commute, meaning

$$ PQ=QP, $$

then

$$ (PQ)^2=PQPQ=PPQQ=P^2Q^2=PQ. $$

Thus, if two projections commute, their product is also a projection.

Without commutativity, the product may fail to be idempotent. Products of projections therefore require care.
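A concrete pair shows both behaviours. The sketch below assumes two orthogonal projections in (\mathbb{R}^2) that do not commute (onto the (x)-axis and onto the line spanned by ((1,1))) and two diagonal projections in (\mathbb{R}^3) that do. The matrices and the helper `mat_mul` are illustrative choices.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Non-commuting pair.
P = [[F(1), F(0)],
     [F(0), F(0)]]            # onto the x-axis
Q = [[F(1, 2), F(1, 2)],
     [F(1, 2), F(1, 2)]]      # onto the line spanned by (1, 1)

PQ = mat_mul(P, Q)
QP = mat_mul(Q, P)
PQ_squared = mat_mul(PQ, PQ)

print(PQ == QP)           # False: P and Q do not commute
print(PQ_squared == PQ)   # False: PQ is not idempotent

# Commuting pair: diagonal projections onto coordinate planes.
D1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
D2 = [[0, 0, 0], [0, 1, 0], [0, 0, 1]]
D12 = mat_mul(D1, D2)

print(D12 == mat_mul(D2, D1))      # True: they commute
print(mat_mul(D12, D12) == D12)    # True: the product is a projection
```

In the commuting case the product projects onto the intersection of the two coordinate planes, which here is the (y)-axis.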

38.19 Projections and Least Squares

Projection operators are the algebraic core of least squares.

When a system

$$ Ax=b $$

is inconsistent, it has no exact solution. Instead, least squares seeks a vector (\hat x) such that

$$ A\hat x $$

is as close as possible to (b) inside the column space of (A).

This means

$$ A\hat x=P_{\operatorname{col}(A)}b, $$

where

$$ P_{\operatorname{col}(A)} $$

is the orthogonal projection onto the column space of (A).

The normal equations

$$ A^TA\hat x=A^Tb $$

come from the orthogonality condition

$$ b-A\hat x\perp\operatorname{col}(A). $$

Thus least squares is not merely an approximation trick. It is an orthogonal projection problem.
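The normal equations can be solved directly for a small system. The sketch below assumes a (3\times 2) matrix (A) with linearly independent columns and a right-hand side (b) that is not in the column space; all values are illustrative. The (2\times 2) system is solved by Cramer's rule with exact fractions.

```python
from fractions import Fraction

# Sample inconsistent system (assumed for illustration).
A = [[1, 0],
     [1, 1],
     [1, 2]]
b = [1, 2, 2]

m, n = len(A), len(A[0])

# Normal equations: (A^T A) x_hat = A^T b.
AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
       for i in range(n)]
Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]

# Solve the 2x2 system by Cramer's rule.
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
x_hat = [Fraction(Atb[0] * AtA[1][1] - AtA[0][1] * Atb[1], det),
         Fraction(AtA[0][0] * Atb[1] - AtA[1][0] * Atb[0], det)]

# A x_hat is the orthogonal projection of b onto col(A).
fitted = [sum(A[k][j] * x_hat[j] for j in range(n)) for k in range(m)]
residual = [bk - fk for bk, fk in zip(b, fitted)]

# Orthogonality check: A^T (b - A x_hat) should vanish.
At_residual = [sum(A[k][i] * residual[k] for k in range(m)) for i in range(n)]

print(x_hat == [Fraction(7, 6), Fraction(1, 2)])   # True
print(At_residual == [0, 0])                       # True
```

The vanishing of `At_residual` is the orthogonality condition written in coordinates: the residual is perpendicular to each column of (A).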

38.20 Summary

A projection operator is a linear operator

$$ P:V\to V $$

satisfying

$$ P^2=P. $$

This means applying (P) twice is the same as applying it once.

Every projection decomposes the vector space as

$$ V=\operatorname{im}(P)\oplus\ker(P). $$

It acts as the identity on its image and as zero on its kernel.

If (V=U\oplus W), then the projection onto (U) along (W) is the map

$$ u+w\mapsto u. $$

Orthogonal projections occur in inner product spaces when the complement is perpendicular to the target subspace. In real coordinates, an orthogonal projection matrix satisfies

$$ P^2=P=P^T. $$

Projection matrices have eigenvalues only (0) and (1), are diagonalizable, and satisfy

$$ \operatorname{tr}(P)=\operatorname{rank}(P). $$

Projection operators are central in geometry, least squares, regression, numerical linear algebra, and the study of decompositions of vector spaces.