Merge Sort

Divide and conquer sorting algorithm that splits the array, sorts recursively, and merges in linear time.

Merge Sort

Merge sort is a divide and conquer sorting algorithm. It splits the input into smaller parts, sorts each part recursively, then merges the sorted parts into a single sorted sequence.

It guarantees stable ordering and predictable time complexity. It performs well on large datasets and external memory systems.

Problem

Given an array $A$ of length $n$, reorder it such that:

$$ A[0] \le A[1] \le \dots \le A[n-1] $$

Algorithm

Divide the array into two halves. Sort each half. Merge the two sorted halves.

merge_sort(A):
    if length(A) <= 1:
        return A

    mid = length(A) // 2
    left = merge_sort(A[0:mid])
    right = merge_sort(A[mid:n])

    return merge(left, right)

The merge step combines two sorted arrays into one sorted array.

merge(left, right):
    i = 0
    j = 0
    result = []

    while i < length(left) and j < length(right):
        if left[i] <= right[j]:
            append left[i] to result
            i += 1
        else:
            append right[j] to result
            j += 1

    append remaining elements of left
    append remaining elements of right

    return result

Example

Let:

$$ A = [5, 2, 4, 1] $$

Split:

  • left: [5, 2]
  • right: [4, 1]

Sort recursively:

  • [5, 2] → [2, 5]
  • [4, 1] → [1, 4]

Merge:

step left right result
1 2 1 [1]
2 2 4 [1, 2]
3 5 4 [1, 2, 4]
4 5 - [1, 2, 4, 5]

Final result:

$$ [1, 2, 4, 5] $$

Correctness

The recursion ensures each subarray is sorted. The merge step preserves sorted order because it always selects the smallest available element from the two inputs. By induction on subarray size, the final output is sorted.

Complexity

metric value
time (best) $O(n \log n)$
time (worst) $O(n \log n)$
time (average) $O(n \log n)$

Recurrence:

$$ T(n) = 2T(n/2) + O(n) $$

T(n)=2T(n/2)+n

Solution:

$$ T(n) = O(n \log n) $$

Space:

$$ O(n) $$

for the auxiliary merge array.

Properties

property value
stable yes
in-place no
adaptive no
parallelizable yes

When to Use

Use merge sort when:

  • stable sorting is required
  • worst case guarantees matter
  • data is large or stored externally
  • parallel execution is available

Avoid it when memory is constrained, since it requires additional space proportional to input size.

Implementation

def merge_sort(a):
    if len(a) <= 1:
        return a

    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])

    return merge(left, right)

def merge(left, right):
    i = j = 0
    result = []

    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1

    result.extend(left[i:])
    result.extend(right[j:])
    return result
func MergeSort(a []int) []int {
	if len(a) <= 1 {
		return a
	}

	mid := len(a) / 2
	left := MergeSort(a[:mid])
	right := MergeSort(a[mid:])

	return merge(left, right)
}

func merge(left, right []int) []int {
	i, j := 0, 0
	result := make([]int, 0, len(left)+len(right))

	for i < len(left) && j < len(right) {
		if left[i] <= right[j] {
			result = append(result, left[i])
			i++
		} else {
			result = append(result, right[j])
			j++
		}
	}

	result = append(result, left[i:]...)
	result = append(result, right[j:]...)
	return result
}