Chapter 95. Sparse Matrices

Chapter 95. Sparse Matrices

A sparse matrix is a matrix in which most entries are zero. Sparse matrices occur when a system has many variables but only a small number of direct interactions between them.

For example, a finite difference model on a grid usually couples each grid point only to its nearest neighbors. A graph adjacency matrix usually stores only existing edges. A finite element stiffness matrix records local interactions between basis functions. In all of these cases, storing every zero entry wastes memory and computation.

Sparse linear algebra studies how to store, multiply, factor, reorder, and solve with such matrices efficiently. Specialized sparse representations store only nonzero entries, and sparse algorithms are designed to avoid unnecessary arithmetic with zeros.

95.1 Dense and Sparse Matrices

A dense matrix is stored as a full rectangular array. An (m\times n) dense matrix stores

$$ mn $$

entries.

A sparse matrix stores mainly its nonzero entries. If a matrix has

$$ \operatorname{nnz}(A) $$

nonzero entries, then sparse storage aims to use memory proportional to

$$ \operatorname{nnz}(A), $$

plus indexing overhead.

The symbol

$$ \operatorname{nnz}(A) $$

means the number of nonzero entries of (A).

A matrix is useful to treat as sparse when

$$ \operatorname{nnz}(A) \ll mn. $$

The exact boundary is practical, not purely mathematical. It depends on matrix size, storage format, hardware, and the operations to be performed.

95.2 Examples

A diagonal matrix is sparse:

$$ D= \begin{bmatrix} d_1 & 0 & 0 & \cdots & 0\ 0 & d_2 & 0 & \cdots & 0\ 0 & 0 & d_3 & \cdots & 0\ \vdots & \vdots & \vdots & \ddots & \vdots\ 0 & 0 & 0 & \cdots & d_n \end{bmatrix}. $$

It has only (n) nonzero entries out of (n^2).

A tridiagonal matrix is also sparse:

$$ T= \begin{bmatrix} d_1 & u_1 & 0 & \cdots & 0\ \ell_1 & d_2 & u_2 & \cdots & 0\ 0 & \ell_2 & d_3 & \cdots & 0\ \vdots & \vdots & \vdots & \ddots & u_{n-1}\ 0 & 0 & 0 & \ell_{n-1} & d_n \end{bmatrix}. $$

It has at most

$$ 3n-2 $$

nonzero entries.

For large (n), this is far smaller than (n^2).

95.3 Sparsity Pattern

The sparsity pattern of a matrix records which entries are structurally nonzero.

For a matrix (A), the sparsity pattern is the set

$$ \mathcal{S}(A) = {(i,j): a_{ij}\ne 0}. $$

The numerical values may change, but the pattern may remain fixed.

This distinction is important. Many sparse algorithms first analyze the pattern, then perform numerical computation. For example, sparse direct solvers often separate symbolic factorization from numerical factorization.

Phase Purpose
Symbolic phase analyze sparsity, ordering, fill-in, memory
Numerical phase compute actual numerical factors

The symbolic phase depends mostly on the zero-nonzero structure. The numerical phase depends on the values.

95.4 Density and Sparsity

The density of a matrix is

$$ \frac{\operatorname{nnz}(A)}{mn}. $$

The sparsity is often described as

$$ 1-\frac{\operatorname{nnz}(A)}{mn}. $$

For example, if a (100000\times 100000) matrix has (500000) nonzero entries, then its density is

$$ \frac{500000}{10^{10}}=5\times 10^{-5}. $$

This matrix is extremely sparse.

A dense storage format would store (10^{10}) entries. Sparse storage would store roughly (500000) values plus indices.

95.5 Why Sparse Matrices Matter

Sparse matrices matter because dense methods often become impossible for large problems.

A dense (n\times n) matrix needs memory proportional to

$$ n^2. $$

Dense factorization usually costs proportional to

$$ n^3. $$

For (n=10^6), dense storage is not practical.

Sparse methods exploit structure. If each row has only a fixed number of nonzero entries, then

$$ \operatorname{nnz}(A)=O(n). $$

A matrix-vector product may then cost

$$ O(n) $$

instead of

$$ O(n^2). $$

This difference is decisive in large-scale computation.

95.6 Common Sources of Sparse Matrices

Sparse matrices arise naturally in many areas.

Source Matrix
Graph theory adjacency matrix, Laplacian matrix
Finite differences discretized differential operators
Finite elements stiffness and mass matrices
Optimization sparse Hessians and Jacobians
Machine learning sparse feature matrices
Networks incidence and transition matrices
Recommender systems user-item interaction matrices
Text analysis document-term matrices

The common feature is locality or limited interaction.

Most variables interact with only a few other variables.

95.7 Graph Interpretation

A sparse matrix can be interpreted as a graph.

For a square matrix (A), define a directed graph with vertices

$$ 1,2,\ldots,n. $$

There is an edge

$$ i\to j $$

when

$$ a_{ij}\ne 0. $$

For a symmetric matrix, the graph is undirected because

$$ a_{ij}\ne 0 $$

implies

$$ a_{ji}\ne 0. $$

This graph viewpoint is useful for reordering, fill-in analysis, graph Laplacians, and sparse factorization.

95.8 Coordinate Format

The coordinate format, or COO format, stores a sparse matrix as triples:

$$ (i,j,a_{ij}). $$

For example,

$$ A= \begin{bmatrix} 10 & 0 & 0\ 0 & 0 & 5\ 2 & 0 & 3 \end{bmatrix} $$

can be stored as:

Row Column Value
1 1 10
2 3 5
3 1 2
3 3 3

COO is simple and convenient for construction.

It is less efficient for repeated arithmetic because entries must often be sorted or compressed before fast operations.

95.9 Compressed Sparse Row Format

Compressed sparse row format, or CSR, stores a sparse matrix row by row using three arrays:

Array Meaning
values nonzero values
col_index column index for each value
row_ptr starting position of each row

CSR is one of the most common sparse formats. It supports fast row access and efficient matrix-vector multiplication. Standard descriptions of CSR use three one-dimensional arrays: values, column indices, and row pointers.

For the matrix

$$ A= \begin{bmatrix} 10 & 0 & 20 & 0\ 0 & 30 & 0 & 40\ 0 & 0 & 50 & 0 \end{bmatrix}, $$

CSR storage is:

$$ \text{values}=[10,20,30,40,50], $$

$$ \text{col_index}=[1,3,2,4,3], $$

$$ \text{row_ptr}=[1,3,5,6]. $$

Using zero-based indexing, the same data would be

$$ \text{col_index}=[0,2,1,3,2], $$

$$ \text{row_ptr}=[0,2,4,5]. $$

The row pointer tells where each row begins and ends in the values array.

95.10 Reading a CSR Row

In zero-based indexing, row (i) occupies entries

$$ \text{row_ptr}[i], \text{row_ptr}[i]+1, \ldots, \text{row_ptr}[i+1]-1. $$

Thus the nonzeros in row (i) are stored in the slice

$$ \text{values}[\text{row_ptr}[i] : \text{row_ptr}[i+1]]. $$

The corresponding columns are stored in

$$ \text{col_index}[\text{row_ptr}[i] : \text{row_ptr}[i+1]]. $$

This compact row access is why CSR is well suited to matrix-vector multiplication.

95.11 Compressed Sparse Column Format

Compressed sparse column format, or CSC, is the column-oriented analogue of CSR.

It stores:

Array Meaning
values nonzero values
row_index row index for each value
col_ptr starting position of each column

CSC is efficient for column access.

It is commonly used in sparse direct solvers because column-oriented operations are natural in many factorizations. MATLAB's sparse format is traditionally column-oriented.

CSR and CSC store the same mathematical matrix, but they favor different operations.

Format Efficient access
CSR rows
CSC columns

95.12 Other Sparse Formats

Several formats are used for different purposes.

Format Use
COO easy construction from triples
CSR row operations and matrix-vector products
CSC column operations and direct solvers
DOK incremental insertion by key
LIL incremental row-wise construction
DIA diagonal or banded matrices
ELL regular row lengths, useful on some parallel hardware
BSR block sparse matrices

Formats such as DOK and LIL are convenient for building sparse matrices, while CSR and CSC are usually better for arithmetic.

95.13 Sparse Matrix-Vector Multiplication

The most important sparse operation is

$$ y=Ax. $$

In CSR format, this is computed row by row:

spmv_csr(values, col_index, row_ptr, x):
    m = length(row_ptr) - 1
    y = zero vector of length m

    for i = 0 to m - 1:
        s = 0

        for p = row_ptr[i] to row_ptr[i + 1] - 1:
            j = col_index[p]
            s = s + values[p] * x[j]

        y[i] = s

    return y

The cost is proportional to the number of stored nonzeros:

$$ O(\operatorname{nnz}(A)). $$

This is the core operation in many iterative methods, including conjugate gradient, GMRES, Lanczos iteration, and power iteration.

95.14 Sparse Matrix Addition

To compute

$$ C=A+B, $$

the nonzero patterns of (A) and (B) must be merged.

If both matrices are stored row-wise, then addition can be performed row by row.

For each row, merge the column indices from both matrices. When the same column occurs in both rows, add the values. When an entry appears in only one matrix, copy it.

The resulting pattern is contained in

$$ \mathcal{S}(A)\cup \mathcal{S}(B). $$

Some numerical cancellations may produce zero values, which should often be removed.

95.15 Sparse Matrix-Matrix Multiplication

Sparse matrix-matrix multiplication computes

$$ C=AB. $$

The sparsity pattern of (C) may be much larger than the patterns of (A) and (B).

Even if (A) and (B) are sparse, (C) may be dense.

This is one reason sparse matrix multiplication is more complicated than sparse matrix-vector multiplication.

The pattern is determined by paths of length two in the associated graph:

$$ c_{ij}\ne 0 $$

may occur when there exists (k) such that

$$ a_{ik}\ne 0 \qquad \text{and} \qquad b_{kj}\ne 0. $$

Efficient sparse multiplication must handle both symbolic pattern construction and numerical accumulation.

95.16 Fill-In

Fill-in refers to new nonzero entries that appear during factorization.

For example, a zero entry of (A) may become nonzero in an LU or Cholesky factor.

Sparse direct methods try to reduce fill-in because fill-in increases both memory and arithmetic cost. Reordering rows and columns is a common way to reduce fill-in.

Fill-in explains why a sparse matrix may still be difficult to factor.

The original matrix may be sparse, but its factors may be much denser.

95.17 Sparse LU Factorization

Sparse LU factorization writes

$$ PAQ=LU $$

or

$$ PA=LU, $$

depending on whether both row and column permutations are used.

Here:

Symbol Meaning
(P) row permutation
(Q) column permutation
(L) lower triangular factor
(U) upper triangular factor

Sparse LU must handle two competing goals:

Goal Reason
numerical stability requires pivoting
sparsity preservation requires good ordering

Pivoting may introduce fill-in. Avoiding fill-in too aggressively may harm stability.

95.18 Sparse Cholesky Factorization

If (A) is symmetric positive definite, sparse Cholesky factorization writes

$$ A=LL^T. $$

This is often more efficient and stable than sparse LU because no pivoting is usually required for positive definite matrices.

Sparse Cholesky is common in finite element methods, Gaussian Markov random fields, optimization, and graph-based computations.

Its efficiency depends strongly on ordering.

A poor ordering can produce substantial fill-in.

95.19 Ordering and Permutation

A permutation reorders rows and columns.

For symmetric matrices, one often forms

$$ PAP^T. $$

This changes the numbering of variables but not the underlying mathematical problem.

The purpose is to reduce fill-in and improve locality.

Common ordering strategies include:

Ordering Idea
Reverse Cuthill-McKee reduce bandwidth
Minimum degree eliminate low-degree vertices first
Approximate minimum degree efficient minimum-degree heuristic
Nested dissection split graph by separators

Ordering is a major part of sparse direct solver performance.

95.20 Bandwidth

The bandwidth of a matrix measures how far nonzero entries lie from the diagonal.

The upper bandwidth is the largest value of

$$ j-i $$

for a nonzero (a_{ij}).

The lower bandwidth is the largest value of

$$ i-j $$

for a nonzero (a_{ij}).

A tridiagonal matrix has lower and upper bandwidth (1).

Banded matrices are sparse matrices with nonzeros concentrated near the diagonal. Specialized band solvers can be much faster than general sparse solvers.

95.21 Sparse Triangular Solves

Sparse direct methods produce triangular systems such as

$$ Lx=b $$

or

$$ Ux=b. $$

Solving these systems requires following dependency structure.

If (L) is sparse lower triangular, then (x_i) depends only on previously computed variables (x_j) for which

$$ \ell_{ij}\ne 0 \qquad j<i. $$

Sparse triangular solves are less parallel than sparse matrix-vector multiplication because dependencies impose an order.

They are often performance bottlenecks in sparse direct methods and preconditioners.

95.22 Iterative Methods with Sparse Matrices

Sparse matrices are especially important for iterative methods.

A Krylov method typically needs:

Operation Sparse cost
Matrix-vector product (Ax) (O(\operatorname{nnz}(A)))
Vector updates (O(n))
Dot products (O(n))
Preconditioner application depends on preconditioner

If (\operatorname{nnz}(A)=O(n)), then each iteration may cost (O(n)).

This makes iterative methods suitable for very large sparse systems.

95.23 Sparse Preconditioners

Preconditioners are often sparse approximations to (A) or (A^{-1}).

Common sparse preconditioners include:

Preconditioner Description
Jacobi diagonal of (A)
block Jacobi independent diagonal blocks
incomplete LU approximate LU with controlled fill
incomplete Cholesky approximate Cholesky for SPD matrices
SSOR relaxation-based triangular approximation
algebraic multigrid hierarchy built from sparse structure

A good preconditioner reduces iteration count while remaining cheap to apply.

95.24 Incomplete Factorization

Incomplete factorization computes approximate factors while limiting fill-in.

For example, incomplete LU seeks

$$ A\approx LU, $$

but permits nonzeros only in selected positions.

The simplest form, ILU(0), keeps the original sparsity pattern.

More advanced variants allow controlled fill-in or drop small entries.

Incomplete Cholesky applies the same idea to symmetric positive definite matrices.

These methods are widely used as preconditioners rather than exact solvers.

95.25 Sparse Direct Versus Iterative Solvers

Sparse systems can be solved by direct or iterative methods.

Method type Strength Weakness
Sparse direct robust, accurate, good for multiple right-hand sides fill-in may be expensive
Iterative memory-light, matrix-vector based convergence depends on conditioning and preconditioning
Hybrid uses sparse factorization as preconditioner more complex

The best choice depends on problem size, matrix structure, accuracy requirements, and number of right-hand sides.

95.26 Matrix-Free Sparse Operators

Sometimes a sparse matrix is not stored explicitly.

Instead, the computation provides a function that applies the operator:

$$ x\mapsto Ax. $$

This matrix-free form is common when the matrix comes from a differential operator or a simulation.

Matrix-free methods save memory and may avoid assembly cost.

They are especially natural for Krylov methods, which only need matrix-vector products.

95.27 Block Sparse Matrices

Many sparse matrices have block structure.

For example, each node in a physical simulation may have several unknowns, such as velocity components or displacement coordinates.

A block sparse matrix stores small dense blocks instead of individual scalar entries.

Block sparse row format stores:

Object Meaning
dense blocks nonzero matrix blocks
block column indices column block positions
block row pointers start of each block row

Block storage reduces index overhead and can improve cache behavior.

95.28 Structural Symmetry

A matrix is structurally symmetric if

$$ a_{ij}\ne 0 $$

exactly when

$$ a_{ji}\ne 0. $$

This is weaker than numerical symmetry.

Numerical symmetry requires

$$ a_{ij}=a_{ji}. $$

Structural symmetry matters for graph algorithms and sparse storage.

A nonsymmetric matrix may still have a symmetric sparsity pattern.

95.29 Storage Tradeoffs

Sparse storage saves memory only when the savings from omitted zeros exceed the cost of indices.

For CSR with floating point values and integer column indices, each nonzero stores at least:

Field Typical size
value 8 bytes
column index 4 or 8 bytes

The row pointer stores (m+1) additional integers.

Thus sparse storage has overhead.

For small or moderately dense matrices, dense storage may be faster.

Sparse storage is most useful when the nonzero count is much smaller than the total number of entries.

95.30 Performance Considerations

Sparse computation is often limited by memory bandwidth.

Sparse matrix-vector multiplication performs few arithmetic operations per memory access. It must load values, indices, and vector entries.

Performance depends on:

Factor Effect
row length distribution load balance
column index locality cache behavior
format choice access pattern
ordering locality and fill-in
parallel partitioning communication cost
block structure better arithmetic intensity

Sparse algorithms require attention to data layout as much as mathematical operation count.

95.31 Parallel Sparse Computation

Sparse matrix-vector multiplication is often parallelized by assigning rows to processors or threads.

However, parallel sparse computation faces several difficulties:

Issue Explanation
irregular row lengths uneven work
indirect memory access poor cache locality
communication distributed vector entries needed
synchronization reductions and dependencies
triangular solves sequential dependencies

Graph partitioning is often used in distributed sparse computation to reduce communication.

95.32 Sparse Matrices in Graphs

For a graph with adjacency matrix (A),

$$ a_{ij}\ne 0 $$

means there is an edge from vertex (i) to vertex (j).

The graph Laplacian is

$$ L=D-A, $$

where (D) is the degree matrix.

Graph Laplacians are sparse when the graph has relatively few edges compared with (n^2).

They appear in spectral graph theory, network analysis, ranking, clustering, electrical networks, and numerical PDEs.

95.33 Sparse Matrices in PDEs

Discretized partial differential equations often produce sparse matrices.

For example, a one-dimensional second-difference operator gives a tridiagonal matrix.

A two-dimensional grid Laplacian gives a sparse matrix with a small stencil, often five nonzeros per row.

A three-dimensional grid Laplacian gives a seven-point stencil.

The number of nonzeros grows linearly with the number of grid points, while dense storage would grow quadratically.

95.34 Practical Checklist

When working with a sparse matrix, ask:

Question Why it matters
What is (\operatorname{nnz}(A))? determines storage and multiplication cost
Is the matrix symmetric? determines solver choices
Is it positive definite? enables Cholesky or CG
Is the pattern structured? enables special formats or orderings
Are many right-hand sides needed? may favor direct factorization
Is a good preconditioner available? controls iterative convergence
Is construction incremental? use COO, DOK, or LIL first
Is arithmetic repeated? convert to CSR or CSC

The format and algorithm should match the operation.

95.35 Summary

A sparse matrix stores mostly zero entries and is handled by algorithms that avoid unnecessary work with those zeros.

The central quantity is

$$ \operatorname{nnz}(A), $$

the number of nonzero entries.

Sparse storage formats include:

Format Main use
COO construction from triples
CSR row access and matrix-vector products
CSC column access and direct solvers
DOK, LIL incremental construction
DIA banded and diagonal structure
BSR block sparse structure

Sparse computation is governed by sparsity patterns, fill-in, ordering, and memory access. Sparse matrix-vector multiplication is the core operation in iterative methods. Sparse direct methods rely on factorization while controlling fill-in. Iterative methods rely on matrix-vector products and preconditioning.

Sparse matrices make large-scale linear algebra possible when dense storage and dense factorization are infeasible.