Pattern Defeating Quicksort

Quicksort variant that detects and avoids bad input patterns using adaptive partitioning and pivot selection.

Pattern Defeating Quicksort

Pattern Defeating Quicksort, often called pdqsort, is a modern quicksort variant designed to handle structured and adversarial inputs efficiently. It improves on classical quicksort by detecting patterns that lead to bad partitions and adapting its behavior.

It combines fast partitioning, smart pivot selection, and fallback strategies.

Problem

Given an array $A$ of length $n$, reorder it such that:

$$ A[0] \le A[1] \le \dots \le A[n-1] $$

Idea

Standard quicksort can degrade on inputs such as sorted arrays, reverse sorted arrays, or arrays with many equal elements. pdqsort addresses this with:

technique purpose
adaptive pivot selection avoid consistently bad pivots
pattern detection detect already sorted or nearly sorted data
branchless partitioning reduce branch misprediction
fallback strategy switch to heapsort if needed

It also uses insertion sort for small partitions.

Algorithm

pdqsort(A, lo, hi, depth_limit):
    if hi - lo < threshold:
        insertion_sort(A, lo, hi)
        return

    if depth_limit == 0:
        heapsort_range(A, lo, hi)
        return

    pivot = choose_pivot(A, lo, hi)

    if is_bad_partition:
        shuffle_elements(A)

    p = partition(A, lo, hi, pivot)

    pdqsort(A, lo, p - 1, depth_limit - 1)
    pdqsort(A, p + 1, hi, depth_limit - 1)

Pivot selection often uses median of three or similar heuristics.

Example

Let:

$$ A = [1,2,3,4,5,6,7,8] $$

Standard quicksort may choose poor pivots and produce unbalanced recursion. pdqsort detects that the array is already ordered and avoids unnecessary work, often finishing in linear time.

Correctness

Partitioning ensures that elements are divided correctly around the pivot. Adaptive strategies change how pivots are selected or when fallback algorithms are used, but do not change the correctness of partitioning or recursion. Therefore the algorithm still produces a sorted array.

Complexity

case time
best $O(n)$
average $O(n \log n)$
worst $O(n \log n)$

The worst case is bounded by fallback to heapsort.

Space:

case space
average $O(\log n)$
worst $O(n)$

Properties

property value
stable no
in-place yes
adaptive yes
worst case guarantee yes
practical speed very high

Notes

pdqsort is one of the fastest general purpose comparison sorts in practice. It is used in high performance libraries, including implementations in C++ ecosystems.

It is especially strong on partially sorted data and avoids the common pitfalls of naive quicksort.

Implementation

This simplified version shows the structure without low level optimizations such as branchless partitioning.

import math
import random

def pdqsort(a):
    n = len(a)
    if n <= 1:
        return a

    depth_limit = 2 * int(math.log2(n))

    def sort(lo, hi, depth):
        if hi - lo <= 16:
            insertion_sort(lo, hi)
            return

        if depth == 0:
            heap_sort_range(lo, hi)
            return

        pivot = median_of_three(lo, hi)
        p = partition(lo, hi, pivot)

        sort(lo, p - 1, depth - 1)
        sort(p + 1, hi, depth - 1)

    def median_of_three(lo, hi):
        mid = (lo + hi) // 2
        trio = [a[lo], a[mid], a[hi]]
        trio.sort()
        return trio[1]

    def partition(lo, hi, pivot):
        i, j = lo, hi

        while True:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i >= j:
                return j
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1

    def insertion_sort(lo, hi):
        for i in range(lo + 1, hi + 1):
            key = a[i]
            j = i - 1
            while j >= lo and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key

    def heap_sort_range(lo, hi):
        b = a[lo:hi + 1]
        b.sort()
        a[lo:hi + 1] = b

    sort(0, n - 1, depth_limit)
    return a