LSM Tree Compaction Sort

Sorting through hierarchical merging in Log Structured Merge Trees using compaction across levels.

LSM Tree Compaction Sort

LSM tree compaction sort organizes sorting as a sequence of merges across levels in a Log Structured Merge (LSM) tree. Data is inserted in batches, stored as sorted runs, and periodically merged into larger sorted structures.

Sorting emerges as a byproduct of compaction. Each level maintains sorted order, and merges ensure global ordering across levels.

Model

Assume:

  • data arrives in batches
  • memory buffer of size $M$
  • disk organized into levels $L_0, L_1, \ldots$
  • each level stores sorted runs

Each level is larger than the previous by a factor $T$ (size ratio).

Core Idea

  1. Accumulate records in memory
  2. Sort and flush as a run to disk
  3. Merge runs across levels when a level exceeds capacity
  4. Continue until data stabilizes

Each compaction step merges sorted runs into a larger sorted run.

Algorithm

lsm_sort(stream):
    mem = empty buffer
    levels = []

    for record in stream:
        insert into mem

        if mem is full:
            run = sort(mem)
            add_to_level(levels, 0, run)
            clear mem

    flush remaining mem
    compact all levels

    return merge_all_levels(levels)

Compaction:

add_to_level(levels, i, run):
    ensure level i exists

    append run to levels[i]

    if size(levels[i]) exceeds threshold:
        merged = merge(levels[i])
        clear levels[i]
        add_to_level(levels, i + 1, merged)

Example

Suppose memory holds 3 elements and input is:

$$ [8, 3, 5, 1, 9, 2, 7] $$

Step 1: Flush runs

  • [8, 3, 5] → [3, 5, 8] → level 0
  • [1, 9, 2] → [1, 2, 9] → level 0
  • [7] → [7] → level 0

Step 2: Compaction

Level 0 has multiple runs:

merge:

$$ [3,5,8], [1,2,9], [7] \rightarrow [1,2,3,5,7,8,9] $$

store at level 1

Correctness

Each flushed run is sorted. Compaction merges sorted runs using a standard merge procedure that preserves order. Each level stores data in sorted form.

When all levels are merged or queried with a merge procedure, the resulting sequence is globally sorted.

No elements are lost or duplicated because compaction only merges existing runs.

Complexity

Let:

  • $N$ be total elements
  • $B$ be block size
  • $T$ be size ratio between levels

I/O cost:

$$ O\left(\frac{N}{B} \log_T \frac{N}{M}\right) $$

Each element may be rewritten multiple times during compaction.

CPU cost:

$$ O(N \log k) $$

per merge, where $k$ is number of runs merged.

Tradeoffs

parameter effect
larger $T$ fewer levels, more expensive compactions
smaller $T$ more levels, cheaper compactions
aggressive compaction better read performance
lazy compaction better write throughput

When to Use

LSM compaction sort is appropriate when:

  • data arrives continuously
  • write throughput is more important than immediate sorting
  • storage is append-friendly
  • eventual sorted order is acceptable
  • system resembles a database or log store

It is widely used in key-value stores and storage engines.

Comparison

method behavior
external merge sort batch sorting
buffer tree sort batched insertion
LSM compaction continuous merge and sort

Implementation Sketch

import heapq

def merge_runs(runs):
    heap = []
    iters = [iter(r) for r in runs]

    for i, it in enumerate(iters):
        try:
            heapq.heappush(heap, (next(it), i))
        except StopIteration:
            pass

    result = []

    while heap:
        val, i = heapq.heappop(heap)
        result.append(val)
        try:
            heapq.heappush(heap, (next(iters[i]), i))
        except StopIteration:
            pass

    return result
def lsm_sort(data, mem_size):
    levels = []
    mem = []

    def add(level, run):
        while len(levels) <= level:
            levels.append([])
        levels[level].append(run)

        if len(levels[level]) > 2:
            merged = merge_runs(levels[level])
            levels[level] = []
            add(level + 1, merged)

    for x in data:
        mem.append(x)
        if len(mem) >= mem_size:
            add(0, sorted(mem))
            mem = []

    if mem:
        add(0, sorted(mem))

    return merge_runs([run for level in levels for run in level])

Notes

LSM tree compaction reframes sorting as a continuous process. Instead of sorting once, data is gradually organized through merges. This approach trades extra write cost for high ingestion throughput and scalability.