Coordinate Compression Sort

Sort values by mapping them to a dense rank space and ordering by compressed indices.

Coordinate Compression Sort

Coordinate compression sort maps arbitrary values to a dense integer range based on their sorted order, then sorts using these compact indices. It is useful when values are large, sparse, or not directly suitable for counting or radix methods.

Problem

Given an array $A$ of values, possibly large integers or non-integer comparable items, sort the array efficiently by reducing the value domain.

Idea

Replace each value with its rank in the sorted order of distinct values.

Let:

$$ V = \text{sorted unique values of } A $$

Define mapping:

$$ \text{rank}(x) = \text{index of } x \text{ in } V $$

Then:

  1. map each value to its rank
  2. sort by ranks
  3. optionally map back to original values

Algorithm

coordinate_compression_sort(A):
    V = sorted unique values of A

    create map from value to index

    B = array of size n

    for i from 0 to n-1:
        B[i] = map[A[i]]

    sort B

    for i from 0 to n-1:
        A[i] = V[B[i]]

    return A

Example

Let:

$$ A = [1000, 5000, 1000, 2000] $$

Unique sorted values:

$$ V = [1000, 2000, 5000] $$

Mapping:

value rank
1000 0
2000 1
5000 2

Mapped array:

$$ [0, 2, 0, 1] $$

Sorted:

$$ [0, 0, 1, 2] $$

Mapped back:

$$ [1000, 1000, 2000, 5000] $$

Correctness

The mapping preserves order:

$$ x_1 < x_2 \iff \text{rank}(x_1) < \text{rank}(x_2) $$

Thus sorting compressed values yields the correct order of original values.

Complexity

Let:

  • $n$ be number of elements
  • $u$ be number of distinct values

Time:

  • building $V$: $O(n \log n)$
  • mapping: $O(n)$
  • sorting compressed array: $O(n \log n)$

Total:

$$ O(n \log n) $$

If combined with counting or radix sort after compression:

$$ O(n + u) $$

Space:

$$ O(n + u) $$

Properties

property value
stable depends on underlying sort
in place no
comparison based partly
handles large values yes

When to Use

Use coordinate compression when:

  • values are large or sparse
  • you want to apply counting or radix sort afterward
  • relative ordering matters but exact values are not needed during sorting

Avoid when:

  • values are already small integers
  • direct sorting is simpler

Implementation

def coordinate_compression_sort(a):
    if not a:
        return a

    vals = sorted(set(a))
    rank = {v: i for i, v in enumerate(vals)}

    comp = [rank[x] for x in a]
    comp.sort()

    return [vals[i] for i in comp]