Parallel Binary Search with DSU
Answer offline connectivity queries by combining parallel binary search with a disjoint set union.
Parallel Binary Search with DSU
Parallel binary search with DSU answers many offline connectivity queries over a sequence of edge insertions. It finds the earliest time when each pair of vertices becomes connected.
The technique combines two ideas:
- parallel binary search over answer times
- disjoint set union for fast connectivity checks
It applies when edges are only added. Once two vertices become connected, they remain connected, so the predicate is monotone.
Problem
Given:
- vertices $0$ to $n - 1$
- edge insertions $E_0, E_1, \dots, E_{T-1}$
- queries $(u, v)$
For each query, find the smallest time $t$ such that $u$ and $v$ are connected after applying edges:
$$ E_0, E_1, \dots, E_t $$
If they never become connected, return $-1$.
Key Idea
Each query has an answer interval $[l, r]$ over time.
During each round:
- Put each active query into a bucket by midpoint
- Reset the DSU
- Replay edge insertions from left to right
- At each time $t$, answer all queries whose midpoint is $t$
- Shrink each query interval
The DSU is rebuilt once per round, not once per query.
Algorithm
parallel_binary_search_dsu(n, edges, queries):
T = length(edges)
Q = length(queries)
low[q] = 0 for every query q
high[q] = T for every query q
while some query has low[q] < high[q]:
buckets = array of empty lists of length T
for q from 0 to Q - 1:
if low[q] < high[q]:
mid = (low[q] + high[q]) // 2
if mid < T:
buckets[mid].append(q)
dsu.reset(n)
for t from 0 to T - 1:
dsu.union(edges[t].u, edges[t].v)
for q in buckets[t]:
u, v = queries[q]
if dsu.find(u) == dsu.find(v):
high[q] = t
else:
low[q] = t + 1
answer[q] = low[q] if low[q] < T else -1
return answer
Example
Edges are inserted over time:
| time | edge |
|---|---|
| 0 | $(0, 1)$ |
| 1 | $(2, 3)$ |
| 2 | $(1, 2)$ |
| 3 | $(4, 5)$ |
Query:
$$ (0, 3) $$
Connectivity changes:
| time | connected? |
|---|---|
| 0 | no |
| 1 | no |
| 2 | yes |
| 3 | yes |
The earliest connected time is $2$.
Correctness
For each query $(u, v)$, define:
$$ F(t) = \text{$u$ and $v$ are connected after edges } 0..t $$
Because edges are only added, connectivity can change only from false to true. It never changes from true back to false.
Therefore $F(t)$ is monotone.
Each query maintains an interval containing its earliest true time. If the DSU says the endpoints are connected at midpoint $m$, then the earliest true time is at most $m$, so the algorithm moves the right boundary to $m$.
If they are not connected at $m$, monotonicity implies they are not connected at any earlier time, so the algorithm moves the left boundary to $m + 1$.
When the interval collapses, the remaining time is the earliest connection time, or $T$ if no such time exists.
Complexity
Let:
- $n$ be the number of vertices
- $T$ be the number of edge insertions
- $Q$ be the number of queries
- $\alpha(n)$ be the inverse Ackermann factor from DSU
Each round replays all edges and checks all active queries.
| component | cost |
|---|---|
| rounds | $O(\log T)$ |
| DSU unions | $O(T \alpha(n) \log T)$ |
| query checks | $O(Q \alpha(n) \log T)$ |
| total | $O((T + Q)\alpha(n)\log T)$ |
| space | $O(n + T + Q)$ |
DSU Operations
make_set(v):
parent[v] = v
size[v] = 1
find(v):
while parent[v] != v:
parent[v] = parent[parent[v]]
v = parent[v]
return v
union(a, b):
ra = find(a)
rb = find(b)
if ra == rb:
return
if size[ra] < size[rb]:
swap(ra, rb)
parent[rb] = ra
size[ra] += size[rb]
When to Use
Use this method when:
- all queries are known beforehand
- updates are edge insertions only
- each query asks for earliest connectivity
- the graph grows monotonically
- rebuilding DSU per round is acceptable
Notes
- This method does not directly support edge deletions.
- For deletions, use rollback DSU with divide and conquer on time.
- If only final connectivity matters, ordinary DSU is enough.
- If queries arrive online, this offline method is not suitable.
Implementation
class DSU:
def __init__(self, n):
self.parent = list(range(n))
self.size = [1] * n
def find(self, x):
while self.parent[x] != x:
self.parent[x] = self.parent[self.parent[x]]
x = self.parent[x]
return x
def union(self, a, b):
ra = self.find(a)
rb = self.find(b)
if ra == rb:
return
if self.size[ra] < self.size[rb]:
ra, rb = rb, ra
self.parent[rb] = ra
self.size[ra] += self.size[rb]
def parallel_binary_search_dsu(n, edges, queries):
t = len(edges)
q = len(queries)
low = [0] * q
high = [t] * q
while True:
active = False
buckets = [[] for _ in range(t)]
for i in range(q):
if low[i] < high[i]:
active = True
mid = (low[i] + high[i]) // 2
if mid < t:
buckets[mid].append(i)
if not active:
break
dsu = DSU(n)
for time, (a, b) in enumerate(edges):
dsu.union(a, b)
for qi in buckets[time]:
u, v = queries[qi]
if dsu.find(u) == dsu.find(v):
high[qi] = time
else:
low[qi] = time + 1
return [low[i] if low[i] < t else -1 for i in range(q)]
type DSU struct {
parent []int
size []int
}
func NewDSU(n int) *DSU {
parent := make([]int, n)
size := make([]int, n)
for i := 0; i < n; i++ {
parent[i] = i
size[i] = 1
}
return &DSU{parent: parent, size: size}
}
func (d *DSU) Find(x int) int {
for d.parent[x] != x {
d.parent[x] = d.parent[d.parent[x]]
x = d.parent[x]
}
return x
}
func (d *DSU) Union(a, b int) {
ra := d.Find(a)
rb := d.Find(b)
if ra == rb {
return
}
if d.size[ra] < d.size[rb] {
ra, rb = rb, ra
}
d.parent[rb] = ra
d.size[ra] += d.size[rb]
}
type Edge struct {
U int
V int
}
type Query struct {
U int
V int
}
func ParallelBinarySearchDSU(n int, edges []Edge, queries []Query) []int {
t := len(edges)
q := len(queries)
low := make([]int, q)
high := make([]int, q)
for i := range high {
high[i] = t
}
for {
active := false
buckets := make([][]int, t)
for i := 0; i < q; i++ {
if low[i] < high[i] {
active = true
mid := (low[i] + high[i]) / 2
if mid < t {
buckets[mid] = append(buckets[mid], i)
}
}
}
if !active {
break
}
dsu := NewDSU(n)
for time, e := range edges {
dsu.Union(e.U, e.V)
for _, qi := range buckets[time] {
query := queries[qi]
if dsu.Find(query.U) == dsu.Find(query.V) {
high[qi] = time
} else {
low[qi] = time + 1
}
}
}
}
ans := make([]int, q)
for i := 0; i < q; i++ {
if low[i] < t {
ans[i] = low[i]
} else {
ans[i] = -1
}
}
return ans
}