Pattern Defeating Quicksort

Pattern Defeating Quicksort, often called pdqsort, is a modern quicksort variant designed to handle structured and adversarial inputs efficiently. It improves on classical quicksort by detecting patterns that lead to bad partitions and adapting its behavior.

It combines fast partitioning, smart pivot selection, and fallback strategies.

Problem

Given an array $A$ of length $n$, reorder it such that:

$$ A[0] \le A[1] \le \dots \le A[n-1] $$

Idea

Standard quicksort can degrade on inputs such as sorted arrays, reverse sorted arrays, or arrays with many equal elements. pdqsort addresses this with:

technique	purpose
adaptive pivot selection	avoid consistently bad pivots
pattern detection	detect already sorted or nearly sorted data
branchless partitioning	reduce branch misprediction
fallback strategy	switch to heapsort if needed

It also uses insertion sort for small partitions.

Algorithm

pdqsort(A, lo, hi, depth_limit):
    if hi - lo < threshold:
        insertion_sort(A, lo, hi)
        return

    if depth_limit == 0:
        heapsort_range(A, lo, hi)
        return

    pivot = choose_pivot(A, lo, hi)

    if is_bad_partition:
        shuffle_elements(A)

    p = partition(A, lo, hi, pivot)

    pdqsort(A, lo, p - 1, depth_limit - 1)
    pdqsort(A, p + 1, hi, depth_limit - 1)

Pivot selection often uses median of three or similar heuristics.

Example

Let:

$$ A = [1,2,3,4,5,6,7,8] $$

Standard quicksort may choose poor pivots and produce unbalanced recursion. pdqsort detects that the array is already ordered and avoids unnecessary work, often finishing in linear time.

Correctness

Partitioning ensures that elements are divided correctly around the pivot. Adaptive strategies change how pivots are selected or when fallback algorithms are used, but do not change the correctness of partitioning or recursion. Therefore the algorithm still produces a sorted array.

Complexity

case	time
best	$O(n)$
average	$O(n \log n)$
worst	$O(n \log n)$

The worst case is bounded by fallback to heapsort.

Space:

case	space
average	$O(\log n)$
worst	$O(n)$

Properties

property	value
stable	no
in-place	yes
adaptive	yes
worst case guarantee	yes
practical speed	very high

Notes

pdqsort is one of the fastest general purpose comparison sorts in practice. It is used in high performance libraries, including implementations in C++ ecosystems.

It is especially strong on partially sorted data and avoids the common pitfalls of naive quicksort.

Implementation

This simplified version shows the structure without low level optimizations such as branchless partitioning.

import math
import random

def pdqsort(a):
    n = len(a)
    if n <= 1:
        return a

    depth_limit = 2 * int(math.log2(n))

    def sort(lo, hi, depth):
        if hi - lo <= 16:
            insertion_sort(lo, hi)
            return

        if depth == 0:
            heap_sort_range(lo, hi)
            return

        pivot = median_of_three(lo, hi)
        p = partition(lo, hi, pivot)

        sort(lo, p - 1, depth - 1)
        sort(p + 1, hi, depth - 1)

    def median_of_three(lo, hi):
        mid = (lo + hi) // 2
        trio = [a[lo], a[mid], a[hi]]
        trio.sort()
        return trio[1]

    def partition(lo, hi, pivot):
        i, j = lo, hi

        while True:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i >= j:
                return j
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1

    def insertion_sort(lo, hi):
        for i in range(lo + 1, hi + 1):
            key = a[i]
            j = i - 1
            while j >= lo and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key

    def heap_sort_range(lo, hi):
        b = a[lo:hi + 1]
        b.sort()
        a[lo:hi + 1] = b

    sort(0, n - 1, depth_limit)
    return a