Array Memory Layout
Understand how arrays are stored in memory and how layout affects performance.
Array Memory Layout
Array memory layout describes how elements are arranged in physical memory. The layout determines how efficiently the CPU can access data due to cache behavior and alignment.
You use this knowledge to write code that is fast in practice, not only in asymptotic terms.
Problem
Given an array $A$ of length $n$, understand how its storage layout affects:
- access time
- cache utilization
- traversal performance
Structure
A standard array uses contiguous memory:
$$ A = [a_0, a_1, a_2, \dots, a_{n-1}] $$
Each element occupies a fixed number of bytes. The address of element $i$ is:
$$ address(A[i]) = base(A) + i \cdot size(element) $$
Layout Patterns
1. Contiguous layout
Elements are stored sequentially.
- excellent cache locality
- predictable access
2. Row-major layout (for 2D arrays)
Rows are stored consecutively:
$$ offset = i \cdot columns + j $$
3. Column-major layout
Columns are stored consecutively:
$$ offset = j \cdot rows + i $$
4. Strided access
Access pattern skips elements:
$$ A[0], A[2], A[4], \dots $$
This reduces cache efficiency.
Algorithm
Cache-friendly traversal:
scan(A):
for i from 0 to n - 1:
process A[i]
Cache-unfriendly traversal (example for 2D):
scan(matrix):
for j from 0 to columns - 1:
for i from 0 to rows - 1:
process matrix[i][j]
Example
Consider a 2D array:
$$
A =
\begin{bmatrix}
8 & 3 & 5
1 & 9 & 4
\end{bmatrix}
$$
Row-major storage:
$$ [8, 3, 5, 1, 9, 4] $$
Row-wise traversal:
| step | value |
|---|---|
| 1 | 8 |
| 2 | 3 |
| 3 | 5 |
| 4 | 1 |
| 5 | 9 |
| 6 | 4 |
Column-wise traversal jumps across memory, causing more cache misses.
Correctness
The mapping from index to memory location follows a deterministic formula. As long as indexing respects this formula, every element is accessed correctly.
Performance differences arise from how memory is accessed, not from correctness.
Complexity
| operation | time |
|---|---|
| indexed access | $O(1)$ |
| sequential scan | $O(n)$ |
However, constant factors vary due to cache behavior.
When to Use
Memory layout considerations are important when:
- working with large arrays
- performance is critical
- cache locality affects runtime
- processing matrices or tensors
Guidelines:
- iterate in memory order
- avoid large strides
- group related data together
- prefer contiguous storage
Implementation
def row_major_scan(matrix):
for row in matrix:
for value in row:
process(value)
func RowMajorScan(matrix [][]int) {
for i := 0; i < len(matrix); i++ {
for j := 0; j < len(matrix[i]); j++ {
process(matrix[i][j])
}
}
}
func process(x int) {
// placeholder
}