Kth Smallest Pair Distance
Find the k-th smallest absolute difference among all pairs in an array.
Kth Smallest Pair Distance
Kth Smallest Pair Distance finds the k-th smallest value among all absolute pair differences.
For an array $A$, each pair $(i, j)$ with $i < j$ has distance:
$$ |A[i] - A[j]| $$
Instead of generating all $O(n^2)$ distances, sort the array and binary search over possible distance values.
Problem
Given an array $A$ of length $n$ and a 1-based rank $k$, find the k-th smallest pair distance.
There are:
$$ \frac{n(n - 1)}{2} $$
pairs.
Algorithm
Sort the array. Then binary search the answer distance $d$. For each candidate $d$, count how many pairs have distance at most $d$.
kth_smallest_pair_distance(A, k):
sort A
low = 0
high = A[n - 1] - A[0]
while low < high:
mid = floor((low + high) / 2)
if count_pairs_at_most(A, mid) < k:
low = mid + 1
else:
high = mid
return low
The counting step uses two pointers.
count_pairs_at_most(A, d):
count = 0
left = 0
for right from 0 to n - 1:
while A[right] - A[left] > d:
left = left + 1
count = count + (right - left)
return count
Example
Let:
$$ A = [1, 6, 1] $$
After sorting:
$$ [1, 1, 6] $$
Pair distances are:
$$ 0, 5, 5 $$
For $k = 1$, the answer is:
$$ 0 $$
Correctness
After sorting, the absolute distance for a pair $(i, j)$ with $i < j$ is $A[j] - A[i]$.
For a fixed right endpoint, the two pointer loop finds the smallest left index such that all indices from left to right - 1 form pairs with distance at most $d$. Therefore it adds exactly the number of valid pairs ending at right.
The predicate "number of pairs with distance at most $d$ is at least $k$" is monotone. If it holds for a distance $d$, it also holds for every larger distance. Binary search returns the smallest distance for which the predicate holds, which is exactly the k-th smallest pair distance.
Complexity
| step | cost |
|---|---|
| sorting | $O(n \log n)$ |
| one count pass | $O(n)$ |
| binary search over distance range | $O(\log R)$ |
| total | $O(n \log n + n \log R)$ |
| space | $O(1)$ extra, aside from sorting |
Here:
$$ R = \max(A) - \min(A) $$
When to Use
Use this algorithm when all pair distances are too many to materialize, but the values are numeric and sortable.
It is especially useful when $n$ is large enough that $O(n^2)$ distance generation is infeasible.
Implementation
def count_pairs_at_most(a, d):
count = 0
left = 0
for right in range(len(a)):
while a[right] - a[left] > d:
left += 1
count += right - left
return count
def kth_smallest_pair_distance(a, k):
a.sort()
low = 0
high = a[-1] - a[0]
while low < high:
mid = (low + high) // 2
if count_pairs_at_most(a, mid) < k:
low = mid + 1
else:
high = mid
return low
func countPairsAtMost(a []int, d int) int {
count := 0
left := 0
for right := 0; right < len(a); right++ {
for a[right]-a[left] > d {
left++
}
count += right - left
}
return count
}
func KthSmallestPairDistance(a []int, k int) int {
sortInts(a)
low := 0
high := a[len(a)-1] - a[0]
for low < high {
mid := low + (high-low)/2
if countPairsAtMost(a, mid) < k {
low = mid + 1
} else {
high = mid
}
}
return low
}