Natural Merge Sort

Natural merge sort improves merge sort by exploiting existing order in the input. Instead of splitting blindly, it scans the array to find already sorted runs, then merges those runs.

This approach adapts to partially sorted data and can reduce the number of passes.

Problem

Given an array $A$ of length $n$, reorder it such that:

$$ A[0] \le A[1] \le \dots \le A[n-1] $$

Idea

A run is a maximal nondecreasing subsequence:

$$ A[i] \le A[i+1] \le \dots \le A[j] $$

Natural merge sort first identifies such runs, then merges adjacent runs until only one remains.

Algorithm

Scan the array and partition it into runs
Merge adjacent runs pairwise
Repeat until one run remains

natural_merge_sort(A):
    runs = []

    i = 0
    while i < n:
        j = i
        while j + 1 < n and A[j] <= A[j + 1]:
            j += 1
        runs.append((i, j))
        i = j + 1

    while length(runs) > 1:
        new_runs = []
        for k from 0 to length(runs) step 2:
            if k + 1 < length(runs):
                merged = merge(A, runs[k], runs[k+1])
                new_runs.append(merged)
            else:
                new_runs.append(runs[k])
        runs = new_runs

Example

Let:

$$ A = [1, 3, 5, 2, 4, 6] $$

Detected runs:

[1, 3, 5]
[2, 4, 6]

Merge:

[1, 3, 5] + [2, 4, 6] → [1, 2, 3, 4, 5, 6]

Only one pass is needed.

Correctness

Each run is already sorted. Merging two sorted runs produces a sorted result. Repeated merging eventually produces a single sorted run covering the entire array.

Complexity

Worst case occurs when the input has no structure:

$$ T(n) = O(n \log n) $$

Best case occurs when the array is already sorted:

$$ T(n) = O(n) $$

case	time
best	$O(n)$
worst	$O(n \log n)$

Space:

$$ O(n) $$

Properties

property	value
stable	yes
adaptive	yes
in-place	no
practical use	foundation of Timsort

Notes

Natural merge sort benefits from real world data, which often contains sorted segments. It reduces unnecessary work compared to fixed splitting strategies.

Modern algorithms such as Timsort extend this idea by using more sophisticated run detection and merging strategies.

Implementation

def natural_merge_sort(a):
    n = len(a)
    temp = [0] * n

    def merge(l, mid, r):
        i, j, k = l, mid + 1, l
        while i <= mid and j <= r:
            if a[i] <= a[j]:
                temp[k] = a[i]
                i += 1
            else:
                temp[k] = a[j]
                j += 1
            k += 1
        while i <= mid:
            temp[k] = a[i]
            i += 1
            k += 1
        while j <= r:
            temp[k] = a[j]
            j += 1
            k += 1
        for i in range(l, r + 1):
            a[i] = temp[i]

    while True:
        runs = []
        i = 0
        while i < n:
            j = i
            while j + 1 < n and a[j] <= a[j + 1]:
                j += 1
            runs.append((i, j))
            i = j + 1

        if len(runs) == 1:
            break

        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                l1, r1 = runs[k]
                l2, r2 = runs[k + 1]
                merge(l1, r1, r2)

    return a

func NaturalMergeSort(a []int) {
	n := len(a)
	temp := make([]int, n)

	var merge func(int, int, int)
	merge = func(l, mid, r int) {
		i, j, k := l, mid+1, l

		for i <= mid && j <= r {
			if a[i] <= a[j] {
				temp[k] = a[i]
				i++
			} else {
				temp[k] = a[j]
				j++
			}
			k++
		}

		for i <= mid {
			temp[k] = a[i]
			i++
			k++
		}

		for j <= r {
			temp[k] = a[j]
			j++
			k++
		}

		for i := l; i <= r; i++ {
			a[i] = temp[i]
		}
	}

	for {
		runs := [][2]int{}
		i := 0

		for i < n {
			j := i
			for j+1 < n && a[j] <= a[j+1] {
				j++
			}
			runs = append(runs, [2]int{i, j})
			i = j + 1
		}

		if len(runs) == 1 {
			break
		}

		for k := 0; k < len(runs); k += 2 {
			if k+1 < len(runs) {
				l1, r1 := runs[k][0], runs[k][1]
				l2, r2 := runs[k+1][0], runs[k+1][1]
				merge(l1, r1, r2)
			}
		}
	}
}