Merge Sort

Merge sort is a divide and conquer sorting algorithm. It splits the input into smaller parts, sorts each part recursively, then merges the sorted parts into a single sorted sequence.

It guarantees stable ordering and predictable time complexity. It performs well on large datasets and external memory systems.

Problem

Given an array $A$ of length $n$, reorder it such that:

$$ A[0] \le A[1] \le \dots \le A[n-1] $$

Algorithm

Divide the array into two halves. Sort each half. Merge the two sorted halves.

merge_sort(A):
    if length(A) <= 1:
        return A

    mid = length(A) // 2
    left = merge_sort(A[0:mid])
    right = merge_sort(A[mid:n])

    return merge(left, right)

The merge step combines two sorted arrays into one sorted array.

merge(left, right):
    i = 0
    j = 0
    result = []

    while i < length(left) and j < length(right):
        if left[i] <= right[j]:
            append left[i] to result
            i += 1
        else:
            append right[j] to result
            j += 1

    append remaining elements of left
    append remaining elements of right

    return result

Example

Let:

$$ A = [5, 2, 4, 1] $$

Split:

left: [5, 2]
right: [4, 1]

Sort recursively:

[5, 2] → [2, 5]
[4, 1] → [1, 4]

Merge:

step	left	right	result
1	2	1	[1]
2	2	4	[1, 2]
3	5	4	[1, 2, 4]
4	5	-	[1, 2, 4, 5]

Final result:

$$ [1, 2, 4, 5] $$

Correctness

The recursion ensures each subarray is sorted. The merge step preserves sorted order because it always selects the smallest available element from the two inputs. By induction on subarray size, the final output is sorted.

Complexity

metric	value
time (best)	$O(n \log n)$
time (worst)	$O(n \log n)$
time (average)	$O(n \log n)$

Recurrence:

$$ T(n) = 2T(n/2) + O(n) $$

T(n)=2T(n/2)+n

Solution:

$$ T(n) = O(n \log n) $$

Space:

$$ O(n) $$

for the auxiliary merge array.

Properties

property	value
stable	yes
in-place	no
adaptive	no
parallelizable	yes

When to Use

Use merge sort when:

stable sorting is required
worst case guarantees matter
data is large or stored externally
parallel execution is available

Avoid it when memory is constrained, since it requires additional space proportional to input size.

Implementation

def merge_sort(a):
    if len(a) <= 1:
        return a

    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])

    return merge(left, right)

def merge(left, right):
    i = j = 0
    result = []

    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1

    result.extend(left[i:])
    result.extend(right[j:])
    return result

func MergeSort(a []int) []int {
	if len(a) <= 1 {
		return a
	}

	mid := len(a) / 2
	left := MergeSort(a[:mid])
	right := MergeSort(a[mid:])

	return merge(left, right)
}

func merge(left, right []int) []int {
	i, j := 0, 0
	result := make([]int, 0, len(left)+len(right))

	for i < len(left) && j < len(right) {
		if left[i] <= right[j] {
			result = append(result, left[i])
			i++
		} else {
			result = append(result, right[j])
			j++
		}
	}

	result = append(result, left[i:]...)
	result = append(result, right[j:]...)
	return result
}