Merge Path Search

Merge path search finds partition points in two sorted arrays so that a merge can be split into independent chunks. It is mainly used for parallel merge, GPU merge, and distributed sorted data processing.

The method views merging as a path through a grid. Each step consumes one element from either the first array or the second array. A diagonal in this grid represents a fixed number of merged output elements.

Problem

Given two sorted arrays $A$ and $B$, find indices $i$ and $j$ such that:

$$ i + j = k $$

and the first $k$ elements of the merged output come from:

$$ A[0:i] \quad \text{and} \quad B[0:j] $$

The pair $(i, j)$ is the merge partition for output position $k$.

Idea

For a fixed output rank $k$, we need a split:

$$ j = k - i $$

The split is valid when:

$$ A[i - 1] \le B[j] $$

and

$$ B[j - 1] \le A[i] $$

Boundary cases treat missing values as negative or positive infinity.

Because increasing $i$ decreases $j$, the validity condition is monotone. We can binary search for the correct $i$.

Algorithm

merge_path_search(A, B, k):
    low = max(0, k - length(B))
    high = min(k, length(A))

    while low <= high:
        i = (low + high) // 2
        j = k - i

        if i > 0 and j < length(B) and A[i - 1] > B[j]:
            high = i - 1
        elif j > 0 and i < length(A) and B[j - 1] > A[i]:
            low = i + 1
        else:
            return (i, j)

Example

Let:

$$ A = [2, 6, 9, 13] $$

$$ B = [1, 4, 8, 10, 15] $$

Find the partition for:

$$ k = 5 $$

The first five merged elements are:

$$ [1, 2, 4, 6, 8] $$

So the split is:

$$ i = 2,\quad j = 3 $$

because:

$$ A[0:2] = [2, 6] $$

and:

$$ B[0:3] = [1, 4, 8] $$

Check:

condition	value
$A[i - 1] \le B[j]$	$6 \le 10$
$B[j - 1] \le A[i]$	$8 \le 9$

Both hold, so $(2, 3)$ is valid.

Correctness

The algorithm searches over possible values of $i$. For each candidate, $j$ is fixed by $j = k - i$.

If $A[i - 1] > B[j]$, then too many elements were taken from $A$, so high moves left. If $B[j - 1] > A[i]$, then too few elements were taken from $A, so low` moves right.

When neither condition holds, all elements before the partition are less than or equal to all elements after the partition. Therefore the split gives exactly the first $k$ merged elements.

Complexity

operation	time
one partition search	$O(\log \min(	A	,	B	))$
$p$ partitions	$O(p \log \min(	A	,	B	))$

Space:

$$ O(1) $$

For parallel merge, each worker gets an output range $[k_1, k_2)$ and computes two merge path searches to find its input ranges.

When to Use

Use merge path search when:

two sorted arrays must be merged
merge work should be split evenly
parallel workers need independent ranges
GPU threads need balanced merge partitions
output positions matter directly

It is a core primitive for parallel merge sort and high throughput sorted joins.

Implementation

def merge_path_search(a, b, k):
    low = max(0, k - len(b))
    high = min(k, len(a))

    while low <= high:
        i = (low + high) // 2
        j = k - i

        if i > 0 and j < len(b) and a[i - 1] > b[j]:
            high = i - 1
        elif j > 0 and i < len(a) and b[j - 1] > a[i]:
            low = i + 1
        else:
            return i, j

    return low, k - low

func MergePathSearch(a []int, b []int, k int) (int, int) {
    low := 0
    if k-len(b) > low {
        low = k - len(b)
    }

    high := k
    if len(a) < high {
        high = len(a)
    }

    for low <= high {
        i := (low + high) / 2
        j := k - i

        if i > 0 && j < len(b) && a[i-1] > b[j] {
            high = i - 1
        } else if j > 0 && i < len(a) && b[j-1] > a[i] {
            low = i + 1
        } else {
            return i, j
        }
    }

    return low, k - low
}