Minimal Perfect Hashing Lookup
Look up a key using a collision free hash function that maps a static key set to exactly n table positions.
Minimal Perfect Hashing Lookup
Minimal perfect hashing is a static dictionary technique. It builds a hash function for a fixed set of $n$ keys so that every stored key maps to a unique integer in the range $0$ to $n - 1$.
Unlike ordinary perfect hashing, a minimal perfect hash uses exactly one table slot per key. Lookup computes the hash position, checks the stored key, and returns the associated value if the key matches.
Problem
Given a fixed key set $S$ with $n$ keys and a query key $k$, return the associated value if $k \in S$.
Model
A minimal perfect hash function for $S$ is a function $h$ such that:
$$ h: S \to {0, 1, \dots, n - 1} $$
and for all distinct keys $a, b \in S$:
$$ h(a) \neq h(b) $$
The table stores each key $k \in S$ at:
$$ T[h(k)] $$
Algorithm
minimal_perfect_hash_lookup(T, k):
i = h(k)
if i < 0 or i >= length(T):
return NOT_FOUND
if T[i].key == k:
return T[i].value
return NOT_FOUND
The final key comparison is required because the hash function is collision free only for the stored set $S$. A query key outside $S$ may still map to some valid position.
Example
Let:
$$ S = {\text{"cat"}, \text{"dog"}, \text{"fox"}, \text{"owl"}} $$
A minimal perfect hash function might assign:
| key | index |
|---|---|
| cat | 2 |
| dog | 0 |
| fox | 3 |
| owl | 1 |
The table has exactly four positions:
| index | stored key |
|---|---|
| 0 | dog |
| 1 | owl |
| 2 | cat |
| 3 | fox |
Lookup for "cat":
- compute $h(\text{"cat"}) = 2$
- check table position 2
- key matches
- return the value
Lookup for "cow":
- compute $h(\text{"cow"})$
- check that position
- return not found if the stored key differs
Correctness
For every stored key $k \in S$, the minimal perfect hash function assigns a unique position $h(k)$ in the table. Therefore, lookup for a stored key computes the exact position where that key was placed.
If the slot contains a different key, then the query key is outside $S$, because no two stored keys share a position.
Complexity
| operation | time |
|---|---|
| lookup | $O(1)$ |
| worst lookup | $O(1)$ |
Lookup uses a constant number of hash operations and table reads. Construction may be more expensive, but construction is outside the lookup operation.
Space
$$ O(n) $$
The value table has exactly $n$ slots. The hash function itself also requires auxiliary metadata, whose size depends on the construction method.
When to Use
Minimal perfect hashing is appropriate when:
- the key set is static
- memory efficiency matters
- lookup speed matters
- duplicate keys are not allowed
- rebuilds are acceptable when the key set changes
It is common in compilers, keyword tables, static dictionaries, routing tables, and read only indexes.
Implementation
class MinimalPerfectHashTable:
def __init__(self, table, hash_func):
self.table = table
self.hash_func = hash_func
def get(self, key):
i = self.hash_func(key)
if i < 0 or i >= len(self.table):
return None
k, v = self.table[i]
if k == key:
return v
return None
type Entry[K comparable, V any] struct {
Key K
Value V
}
type MinimalPerfectHashTable[K comparable, V any] struct {
Table []Entry[K, V]
Hash func(K) int
}
func (h *MinimalPerfectHashTable[K, V]) Get(key K) (V, bool) {
i := h.Hash(key)
if i < 0 || i >= len(h.Table) {
var zero V
return zero, false
}
e := h.Table[i]
if e.Key == key {
return e.Value, true
}
var zero V
return zero, false
}