#searching
Exponential Backoff Search
Exponential Backoff Search Exponential backoff search finds a target in a sorted or monotone structure by expanding the search interval geometrically. Instead of scanning linearly, it probes positions at increasing distances until it either finds the target or bounds it inside an interval. A secondary search then refines the result. This method is useful when the size of the data is unknown, unbounded, or expensive to traverse fully. Problem Given...
Van Emde Boas Layout Search
Van Emde Boas Layout Search Van Emde Boas layout search stores a binary search tree in recursive memory order. The layout is cache oblivious, which means it does not need to know the cache line size or memory block size in advance. The main idea is to split the tree into a top subtree and several bottom subtrees, then store each part recursively. Nodes that are close in the search...
Branchless Binary Search
Branchless Binary Search Branchless binary search is a binary search variant designed to reduce branch mispredictions. A normal binary search repeatedly branches on whether the middle value is less than the target. On modern CPUs, this branch can be hard to predict because the search direction depends on the data. A branchless version keeps the same comparison logic but updates the search position using conditional moves, masks, or arithmetic expressions....
Galloping Intersection Search
Galloping Intersection Search Galloping intersection search computes the intersection of two sorted sequences by combining linear scanning with exponential jumps. When one sequence advances much faster than the other, the algorithm uses exponential search to skip ahead, then refines with binary search. This reduces unnecessary comparisons when the sequences differ significantly in size or distribution. Problem Given two sorted arrays $A$ and $B$, compute all elements that appear in both:...
Eytzinger Layout Search
Eytzinger Layout Search Eytzinger layout search reorganizes a sorted array into a breadth first tree layout, also known as heap order. The goal is to improve cache locality and make memory access patterns more predictable. Instead of storing data in sorted order, the array is rearranged so that binary search follows a sequential memory pattern. This reduces cache misses and improves throughput on modern CPUs. Problem Given a sorted array...
SIMD Linear Search
SIMD Linear Search SIMD linear search is a linear search variant that compares several values at the same time using vector instructions. SIMD means single instruction, multiple data. Instead of comparing one array element per loop iteration, the algorithm loads a block of elements into a vector register and compares all lanes against the target in one operation. The logical algorithm remains linear search. The speedup comes from processing multiple...
Interpolation Sequential Search
Interpolation Sequential Search Interpolation sequential search combines estimation and local scanning. It first predicts where the target should lie using interpolation, then performs a short sequential search around that estimate. This method is effective when values are approximately uniformly distributed and when exact positioning may be noisy or approximate. Problem Given a sorted array $A$ and a target $x$, find an index $i$ such that $$ A[i] = x $$...
Cache Aware Binary Search
Cache Aware Binary Search Cache aware binary search modifies ordinary binary search to account for the memory hierarchy. The algorithm still uses ordering and comparisons, but the data layout or search strategy is chosen with explicit knowledge of cache line size, block size, or page size. The goal is to reduce cache misses. On modern hardware, a comparison is cheap, while an unpredictable memory load can be expensive. Problem Given...
Recursive Model Index Search
Recursive Model Index Search Recursive model index search uses several learned models arranged as a hierarchy. A top model selects a lower model, and the lower model predicts the likely position of a key in sorted data. The final prediction is then corrected by searching inside a bounded local range. This is the main search procedure behind a recursive model index, often shortened to RMI. It replaces part of a...
Branchless Lower Bound
Branchless Lower Bound Branchless lower bound computes the first position where a value is greater than or equal to a target, without using unpredictable branches. It follows the same logical structure as binary search but replaces control flow with arithmetic updates or conditional moves. The result is the same as standard lower bound. The benefit is improved performance on modern CPUs when branch misprediction is expensive. Problem Given a sorted...
Merge Path Search
Merge Path Search Merge path search finds partition points in two sorted arrays so that a merge can be split into independent chunks. It is mainly used for parallel merge, GPU merge, and distributed sorted data processing. The method views merging as a path through a grid. Each step consumes one element from either the first array or the second array. A diagonal in this grid represents a fixed number...
Interpolation Search with Fallback
Interpolation Search with Fallback Interpolation search with fallback combines the speed of interpolation search on well distributed numeric keys with the reliability of binary search. It estimates where the target should be by value, but falls back to ordinary binary search when the estimate becomes poor, unstable, or unsafe. This gives good performance on near uniform data while preserving the worst case behavior of binary search. Problem Given a sorted...
Learned Index Search
Learned Index Search Learned index search treats an ordered index as a prediction problem. Given a key, a model estimates the position where that key should appear in sorted data. The algorithm then corrects the estimate by searching a small range around the predicted position. The main idea is that a sorted array defines a cumulative distribution function from key to position. If this function is predictable, a model can...
Finger Search
Finger Search Finger search exploits locality. Instead of searching from the root or beginning, it starts from a known position called a finger and moves toward the target. The cost depends on the distance between the finger and the target, not the total size of the structure. This method appears in ordered data structures such as balanced trees, skip lists, and sorted arrays with auxiliary links. Problem Given an ordered...
SIMD Binary Search
SIMD Binary Search SIMD binary search accelerates binary search by evaluating multiple candidate comparisons in parallel. Instead of checking a single midpoint per step, the algorithm probes several positions using vector instructions and narrows the search range based on the combined results. The structure still follows binary search, but each iteration performs more comparisons per memory access. Problem Given a sorted array $A$ of length $n$ and a target value...