Merge Path Search
Find diagonal partition points that split a sorted merge into independent balanced ranges.
Merge Path Search
Merge path search finds partition points in two sorted arrays so that a merge can be split into independent chunks. It is mainly used for parallel merge, GPU merge, and distributed sorted data processing.
The method views merging as a path through a grid. Each step consumes one element from either the first array or the second array. A diagonal in this grid represents a fixed number of merged output elements.
Problem
Given two sorted arrays $A$ and $B$, find indices $i$ and $j$ such that:
$$ i + j = k $$
and the first $k$ elements of the merged output come from:
$$ A[0:i] \quad \text{and} \quad B[0:j] $$
The pair $(i, j)$ is the merge partition for output position $k$.
Idea
For a fixed output rank $k$, we need a split:
$$ j = k - i $$
The split is valid when:
$$ A[i - 1] \le B[j] $$
and
$$ B[j - 1] \le A[i] $$
Boundary cases treat missing values as negative or positive infinity.
Because increasing $i$ decreases $j$, the validity condition is monotone. We can binary search for the correct $i$.
Algorithm
merge_path_search(A, B, k):
low = max(0, k - length(B))
high = min(k, length(A))
while low <= high:
i = (low + high) // 2
j = k - i
if i > 0 and j < length(B) and A[i - 1] > B[j]:
high = i - 1
elif j > 0 and i < length(A) and B[j - 1] > A[i]:
low = i + 1
else:
return (i, j)
Example
Let:
$$ A = [2, 6, 9, 13] $$
$$ B = [1, 4, 8, 10, 15] $$
Find the partition for:
$$ k = 5 $$
The first five merged elements are:
$$ [1, 2, 4, 6, 8] $$
So the split is:
$$ i = 2,\quad j = 3 $$
because:
$$ A[0:2] = [2, 6] $$
and:
$$ B[0:3] = [1, 4, 8] $$
Check:
| condition | value |
|---|---|
| $A[i - 1] \le B[j]$ | $6 \le 10$ |
| $B[j - 1] \le A[i]$ | $8 \le 9$ |
Both hold, so $(2, 3)$ is valid.
Correctness
The algorithm searches over possible values of $i$. For each candidate, $j$ is fixed by $j = k - i$.
If $A[i - 1] > B[j]$, then too many elements were taken from $A$, so high moves left. If $B[j - 1] > A[i]$, then too few elements were taken from $A, so low` moves right.
When neither condition holds, all elements before the partition are less than or equal to all elements after the partition. Therefore the split gives exactly the first $k$ merged elements.
Complexity
| operation | time | ||||
|---|---|---|---|---|---|
| one partition search | $O(\log \min( | A | , | B | ))$ |
| $p$ partitions | $O(p \log \min( | A | , | B | ))$ |
Space:
$$ O(1) $$
For parallel merge, each worker gets an output range $[k_1, k_2)$ and computes two merge path searches to find its input ranges.
When to Use
Use merge path search when:
- two sorted arrays must be merged
- merge work should be split evenly
- parallel workers need independent ranges
- GPU threads need balanced merge partitions
- output positions matter directly
It is a core primitive for parallel merge sort and high throughput sorted joins.
Implementation
def merge_path_search(a, b, k):
low = max(0, k - len(b))
high = min(k, len(a))
while low <= high:
i = (low + high) // 2
j = k - i
if i > 0 and j < len(b) and a[i - 1] > b[j]:
high = i - 1
elif j > 0 and i < len(a) and b[j - 1] > a[i]:
low = i + 1
else:
return i, j
return low, k - low
func MergePathSearch(a []int, b []int, k int) (int, int) {
low := 0
if k-len(b) > low {
low = k - len(b)
}
high := k
if len(a) < high {
high = len(a)
}
for low <= high {
i := (low + high) / 2
j := k - i
if i > 0 && j < len(b) && a[i-1] > b[j] {
high = i - 1
} else if j > 0 && i < len(a) && b[j-1] > a[i] {
low = i + 1
} else {
return i, j
}
}
return low, k - low
}