Least Significant Byte Sort

LSD radix sort variant that processes keys byte by byte from least significant to most significant.

Least Significant Byte Sort

Least Significant Byte sort is the byte oriented version of LSD radix sort. It processes fixed width keys one byte at a time, starting from the least significant byte and moving toward the most significant byte.

It relies on a stable inner sorting step, usually counting sort with 256 buckets.

Problem

Given an array $A$ of fixed width integer keys, sort the array in increasing order.

Each key can be written as:

$$ x = d_{m-1} d_{m-2} \cdots d_1 d_0 $$

where each $d_i$ is a byte:

$$ 0 \le d_i < 256 $$

Idea

Sort the array by byte positions in increasing order of significance:

$$ d_0, d_1, \ldots, d_{m-1} $$

After each pass, the array is sorted by the processed suffix of bytes.

Algorithm

lsb_sort(A, bytes):
    for p from 0 to bytes - 1:
        stable_counting_sort_by_byte(A, p)

    return A

Byte extraction:

$$ d_p = (x >> (8p)) \mathbin{&} 255 $$

Example

Sort:

$$ A = [513, 256, 258, 1] $$

Byte form:

value high byte low byte
513 2 1
256 1 0
258 1 2
1 0 1

After sorting by low byte:

$$ [256, 513, 1, 258] $$

After sorting by high byte:

$$ [1, 256, 258, 513] $$

Correctness

After processing byte $p$, the array is sorted by the suffix:

$$ d_p d_{p-1}\cdots d_0 $$

The base case holds after sorting by $d_0$. For the inductive step, stable sorting by $d_p$ orders elements by the new byte while preserving the ordering of lower bytes.

After all bytes have been processed, the array is sorted by the full key.

Complexity

Let:

  • $n$ be number of elements
  • $m$ be number of bytes

Each pass costs:

$$ O(n + 256) $$

Total time:

$$ O(m(n + 256)) $$

Space:

$$ O(n + 256) $$

For fixed width integers, $m$ is constant, so time is linear in $n$.

Properties

property value
stable yes
in place no
comparison based no
direction LSD
radix 256

When to Use

Use LSB sort when:

  • sorting large arrays of fixed width integers
  • stability is required
  • byte extraction is cheap
  • auxiliary memory is acceptable

Avoid when:

  • memory must be minimal
  • keys are variable length
  • recursive MSD methods perform better on the dataset

Implementation

def lsb_sort(a):
    if not a:
        return a

    max_bits = max(a).bit_length()
    bytes_count = (max_bits + 7) // 8

    for p in range(bytes_count):
        count = [0] * 256

        for x in a:
            b = (x >> (8 * p)) & 255
            count[b] += 1

        for i in range(1, 256):
            count[i] += count[i - 1]

        out = [0] * len(a)

        for i in range(len(a) - 1, -1, -1):
            x = a[i]
            b = (x >> (8 * p)) & 255
            count[b] -= 1
            out[count[b]] = x

        a = out

    return a