Multiway Merge Sort

Merge sort variant that merges more than two sorted sequences at each step.

Multiway Merge Sort

Multiway merge sort generalizes merge sort by splitting the input into $k$ parts instead of two, and merging $k$ sorted sequences at each step.

It is especially useful when merging cost dominates, such as in external memory or database systems.

Problem

Given an array $A$ of length $n$, reorder it such that:

$$ A[0] \le A[1] \le \dots \le A[n-1] $$

Idea

Instead of binary splitting:

$$ n \to \frac{n}{2} + \frac{n}{2} $$

multiway merge sort splits into $k$ parts:

$$ n \to \frac{n}{k} + \dots + \frac{n}{k} $$

Each part is sorted recursively, then merged using a $k$-way merge.

The merge step uses a priority queue or tournament tree to efficiently select the smallest element among the $k$ sequences.

Algorithm

multiway_merge_sort(A, k):
    if length(A) <= 1:
        return A

    split A into k parts

    for each part:
        recursively sort part

    return k_way_merge(parts)

Merge step:

k_way_merge(parts):
    create min heap

    insert first element of each part into heap

    result = []

    while heap not empty:
        extract minimum element
        append to result

        if that part has more elements:
            insert next element into heap

    return result

Example

Let:

$$ A = [9, 2, 7, 4, 6, 1, 8, 3, 5] $$

Split into $k = 3$ parts:

  • [9, 2, 7]
  • [4, 6, 1]
  • [8, 3, 5]

Sort each:

  • [2, 7, 9]
  • [1, 4, 6]
  • [3, 5, 8]

Merge:

$$ [1,2,3,4,5,6,7,8,9] $$

Correctness

Each recursive call sorts its subarray. The $k$-way merge always extracts the smallest remaining element among all parts, ensuring that the result is sorted. By induction on the recursion depth, the final array is sorted.

Complexity

metric value
recursion depth $O(\log_k n)$
merge cost $O(n \log k)$
total time $O(n \log n)$

Space:

$$ O(n) $$

Properties

property value
stable yes
in-place no
merge structure k-way
parallelism moderate

Notes

Multiway merge sort reduces recursion depth compared to binary merge sort, but increases merge complexity.

It is particularly effective in:

  • external memory sorting
  • database systems
  • distributed systems

In these settings, reducing the number of merge passes can significantly improve performance.

Implementation

import heapq

def multiway_merge_sort(a, k=2):
    if len(a) <= 1:
        return a

    size = len(a)
    parts = []
    step = (size + k - 1) // k

    for i in range(0, size, step):
        parts.append(multiway_merge_sort(a[i:i + step], k))

    return k_way_merge(parts)

def k_way_merge(parts):
    heap = []
    result = []

    for i, part in enumerate(parts):
        if part:
            heap.append((part[0], i, 0))

    heapq.heapify(heap)

    while heap:
        val, part_idx, elem_idx = heapq.heappop(heap)
        result.append(val)

        if elem_idx + 1 < len(parts[part_idx]):
            next_val = parts[part_idx][elem_idx + 1]
            heapq.heappush(heap, (next_val, part_idx, elem_idx + 1))

    return result