B Plus Tree Search

Search in a B+ tree where all records are stored in leaves and internal nodes guide traversal.

B Plus Tree Search

B+ tree search operates on a multiway balanced tree where internal nodes store only routing keys, and all actual records reside in leaf nodes.

Leaves are linked in sorted order. This design improves range queries and sequential access while keeping search depth low.

Problem

Given a B+ tree root root and a target key x, locate the record associated with x.

If the key does not exist, return null.

Structure

Internal nodes contain sorted separator keys:

$$ k_0 < k_1 < \cdots < k_{m-1} $$

and child pointers:

$$ c_0, c_1, \ldots, c_m $$

Leaf nodes contain all data entries and are linked left to right.

The key ranges follow:

child key range
c0 keys smaller than k0
ci keys between k{i-1} and ki
cm keys greater than k{m-1}

Algorithm

Traverse internal nodes to locate the correct leaf. Then search inside the leaf.

bplus_tree_search(node, x):
    while node.is_leaf == false:
        i = upper_bound(node.keys, x)
        node = node.children[i]

    i = lower_bound(node.keys, x)

    if i < length(node.keys) and node.keys[i] == x:
        return node.values[i]

    return null

Example

            [20 | 40]
          /     |     \
     [5 10]  [20 25 30]  [40 50 60]

Search for 25:

step node action
1 [20, 40] move to middle child
2 leaf [20, 25, 30] find 25

Correctness

Internal nodes guide the search by partitioning key ranges. At each level, exactly one child can contain the target key.

All actual keys are stored in leaves. Once the correct leaf is reached, searching that node determines existence.

If the key is found, the returned value matches the target. If it is not found in the leaf, no other node can contain it.

Complexity

Let $B$ be the branching factor.

measure cost
height $O(\log_B n)$
node accesses $O(\log_B n)$

Search inside a node:

method cost
binary search $O(\log B)$
linear scan $O(B)$

Total time:

$$ O(\log n) $$

Space complexity:

version space
iterative $O(1)$
recursive $O(\log_B n)$

When to Use

B+ tree search is widely used in database systems and storage engines.

It is appropriate when:

  • range queries and scans are frequent
  • data resides on disk or SSD
  • high fanout reduces tree height
  • sequential leaf traversal is needed

Compared to B trees, B+ trees store all data in leaves and link leaves together, which improves range performance and cache locality.

Implementation

from bisect import bisect_left, bisect_right

class BPlusNode:
    def __init__(self, keys=None, children=None, values=None, is_leaf=True):
        self.keys = keys or []
        self.children = children or []
        self.values = values or []
        self.is_leaf = is_leaf

def bplus_tree_search(root, x):
    node = root

    while not node.is_leaf:
        i = bisect_right(node.keys, x)
        node = node.children[i]

    i = bisect_left(node.keys, x)

    if i < len(node.keys) and node.keys[i] == x:
        return node.values[i]

    return None
type BPlusNode struct {
	Keys     []int
	Children []*BPlusNode
	Values   []int
	IsLeaf   bool
}

func lowerBound(keys []int, x int) int {
	lo, hi := 0, len(keys)
	for lo < hi {
		mid := lo + (hi-lo)/2
		if keys[mid] < x {
			lo = mid + 1
		} else {
			hi = mid
		}
	}
	return lo
}

func upperBound(keys []int, x int) int {
	lo, hi := 0, len(keys)
	for lo < hi {
		mid := lo + (hi-lo)/2
		if keys[mid] <= x {
			lo = mid + 1
		} else {
			hi = mid
		}
	}
	return lo
}

func BPlusTreeSearch(root *BPlusNode, x int) (int, bool) {
	node := root

	for !node.IsLeaf {
		i := upperBound(node.Keys, x)
		node = node.Children[i]
	}

	i := lowerBound(node.Keys, x)

	if i < len(node.Keys) && node.Keys[i] == x {
		return node.Values[i], true
	}

	return 0, false
}