Minimal Perfect Hashing Lookup

Look up a key using a collision free hash function that maps a static key set to exactly n table positions.

Minimal Perfect Hashing Lookup

Minimal perfect hashing is a static dictionary technique. It builds a hash function for a fixed set of $n$ keys so that every stored key maps to a unique integer in the range $0$ to $n - 1$.

Unlike ordinary perfect hashing, a minimal perfect hash uses exactly one table slot per key. Lookup computes the hash position, checks the stored key, and returns the associated value if the key matches.

Problem

Given a fixed key set $S$ with $n$ keys and a query key $k$, return the associated value if $k \in S$.

Model

A minimal perfect hash function for $S$ is a function $h$ such that:

$$ h: S \to {0, 1, \dots, n - 1} $$

and for all distinct keys $a, b \in S$:

$$ h(a) \neq h(b) $$

The table stores each key $k \in S$ at:

$$ T[h(k)] $$

Algorithm

minimal_perfect_hash_lookup(T, k):
    i = h(k)

    if i < 0 or i >= length(T):
        return NOT_FOUND

    if T[i].key == k:
        return T[i].value

    return NOT_FOUND

The final key comparison is required because the hash function is collision free only for the stored set $S$. A query key outside $S$ may still map to some valid position.

Example

Let:

$$ S = {\text{"cat"}, \text{"dog"}, \text{"fox"}, \text{"owl"}} $$

A minimal perfect hash function might assign:

key index
cat 2
dog 0
fox 3
owl 1

The table has exactly four positions:

index stored key
0 dog
1 owl
2 cat
3 fox

Lookup for "cat":

  • compute $h(\text{"cat"}) = 2$
  • check table position 2
  • key matches
  • return the value

Lookup for "cow":

  • compute $h(\text{"cow"})$
  • check that position
  • return not found if the stored key differs

Correctness

For every stored key $k \in S$, the minimal perfect hash function assigns a unique position $h(k)$ in the table. Therefore, lookup for a stored key computes the exact position where that key was placed.

If the slot contains a different key, then the query key is outside $S$, because no two stored keys share a position.

Complexity

operation time
lookup $O(1)$
worst lookup $O(1)$

Lookup uses a constant number of hash operations and table reads. Construction may be more expensive, but construction is outside the lookup operation.

Space

$$ O(n) $$

The value table has exactly $n$ slots. The hash function itself also requires auxiliary metadata, whose size depends on the construction method.

When to Use

Minimal perfect hashing is appropriate when:

  • the key set is static
  • memory efficiency matters
  • lookup speed matters
  • duplicate keys are not allowed
  • rebuilds are acceptable when the key set changes

It is common in compilers, keyword tables, static dictionaries, routing tables, and read only indexes.

Implementation

class MinimalPerfectHashTable:
    def __init__(self, table, hash_func):
        self.table = table
        self.hash_func = hash_func

    def get(self, key):
        i = self.hash_func(key)

        if i < 0 or i >= len(self.table):
            return None

        k, v = self.table[i]
        if k == key:
            return v

        return None
type Entry[K comparable, V any] struct {
	Key   K
	Value V
}

type MinimalPerfectHashTable[K comparable, V any] struct {
	Table []Entry[K, V]
	Hash  func(K) int
}

func (h *MinimalPerfectHashTable[K, V]) Get(key K) (V, bool) {
	i := h.Hash(key)

	if i < 0 || i >= len(h.Table) {
		var zero V
		return zero, false
	}

	e := h.Table[i]
	if e.Key == key {
		return e.Value, true
	}

	var zero V
	return zero, false
}