The Compile-Time vs Runtime Trade-Off

§1 Provenance

The trade-off is folklore across the JIT community and is quantified in many published papers and blog posts. Key data points:
- Sparkplug: ~10x faster compile than TurboFan, code ~5-10x slower at runtime. V8 blog https://v8.dev/blog/sparkplug.
- Liftoff: ~15-20x faster compile than TurboFan, code ~1.1-1.5x slower (Sightglass benchmarks). https://v8.dev/blog/liftoff.
- Copy-and-Patch: ~100x faster compile than LLVM, code ~2x slower at runtime. Xu/Kjolstad OOPSLA 2021, https://fredrikbk.com/publications/copy-and-patch.pdf.
- JSC tier ratios: DFG ~4x Baseline compile time; FTL ~6x DFG compile time. Pizlo, https://webkit.org/blog/10308/speculation-in-javascriptcore/.
- Cranelift vs LLVM: ~2-3x faster compile, ~10-30% slower code. Bytecode Alliance retrospective.
- QBE vs LLVM: ~5-10x faster compile, ~30% slower code (project's own stated aim: "70% of the performance of industrial optimizing compilers in 10% of the code"). https://c9x.me/compile/.
Mochi-specific benchmark numbers come from MEP-17 (vm2 micro-benchmarks) and MEP-23 (compiler1 codegen budgets), referenced from /Users/apple/notes/Spec/ (not enumerated here since they are internal).

§2 Technique / contribution

The conceptual model:

quality of generated code (runtime perf)
  ^
  |                                          * LLVM-O3, B3, TurboFan
  |                                    * Cranelift, FTL
  |                              * QBE
  |                        * Cranelift-fast, DFG
  |                  * Liftoff, Maglev
  |             * Sparkplug, JSC Baseline
  |       * Copy-and-Patch
  |  * Interpreter (no compile)
  +-------------------------------------------------------> compile time (cost)

The left-hand side is "compile instantly, run okay". The right-hand side is "compile slowly, run fast". MEP-42 is a phase-1 effort; the user explicitly said "naive". That puts us on the left-hand side.

Four concrete choices to evaluate:

Copy-and-patch (CPython 3.13 model): compile speed ~10 MB/s of source equivalent; code ~2x slower than LLVM -O2; ~2-3x faster than vm3 interpreter for hot loops.
Per-opcode template JIT (Sparkplug, Liftoff style): compile speed ~5 MB/s; code ~1.5-2x slower than optimized; ~3-5x faster than vm3 interpreter.
chibicc-style single-pass AOT to GAS asm + shell out to cc: compile speed limited by cc (~1-2 MB/s); code ~3-5x slower than optimized (but cc does some opt for us); ~3x faster than vm3 interpreter.
QBE backend (textual SSA + shell out): compile speed ~3-5 MB/s; code ~30% slower than LLVM -O2; ~5x faster than vm3 interpreter.
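
To make the first option concrete, here is a minimal sketch of the copy-and-patch idea: copy a pre-built machine-code template ("stencil") per operation, then patch operand values into its holes. It is an illustration under assumptions, not the CPython or Xu/Kjolstad implementation: the opcode names (LOAD_CONST, ADD_CONST, RETURN) are hypothetical, and the stencils are hand-written x86-64 encodings, whereas real systems generate them ahead of time by compiling templates with a C compiler. The sketch only builds the byte buffer and does not execute it.

```python
import struct

# Each stencil is (template bytes, offset of its 4-byte operand hole or None).
# Hand-written x86-64 encodings (an assumption made for this sketch):
#   B8 imm32 = mov eax, imm32    05 imm32 = add eax, imm32    C3 = ret
STENCILS = {
    "LOAD_CONST": (bytes([0xB8, 0, 0, 0, 0]), 1),
    "ADD_CONST": (bytes([0x05, 0, 0, 0, 0]), 1),
    "RETURN": (bytes([0xC3]), None),
}


def compile_ops(ops):
    """Copy each op's stencil into the buffer, then patch its operand hole."""
    code = bytearray()
    for name, operand in ops:
        template, hole = STENCILS[name]
        start = len(code)
        code += template  # copy step
        if hole is not None:
            # patch step: write the operand as a little-endian signed 32-bit value
            struct.pack_into("<i", code, start + hole, operand)
    return bytes(code)


# Hypothetical three-op program: load 40, add 2, return.
program = [("LOAD_CONST", 40), ("ADD_CONST", 2), ("RETURN", None)]
machine_code = compile_ops(program)
print(machine_code.hex())  # b8280000000502000000c3
```

Compilation here is a table lookup plus a memcpy-like append per op, which is why this family sits at the cheap end of the curve; no register allocation or instruction selection happens at compile time.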

§3 Where it shines, where it fails

The trade-off curve is not a single number; it depends on workload:

Short-lived programs (CLI tools, scripts, build helpers):

Compile time dominates total time.
Copy-and-patch and Sparkplug shine.
chibicc-style AOT loses because cc invocation is slow.

Long-running programs (servers, batch jobs, data pipelines):

Runtime dominates.
QBE, Cranelift, or LLVM shine.
Copy-and-patch and Sparkplug leave performance on the table.
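
The two regimes above can be illustrated with a toy cost model: total time = compile time + interpreter time / speedup. The figures below are midpoints of the §2 estimates taken at face value, and the 0.4 MB source size (10,000 lines at an assumed ~40 bytes per line) is a made-up input; none of this is measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Emitter:
    name: str
    compile_mb_per_s: float  # compile throughput (midpoint of the §2 range)
    speedup_vs_vm3: float    # runtime speedup over the interpreter (§2 midpoint)


# Illustrative values only; the interpreter has no compile step.
EMITTERS = [
    Emitter("vm3 interpreter", float("inf"), 1.0),
    Emitter("copy-and-patch", 10.0, 2.5),
    Emitter("template JIT", 5.0, 4.0),
    Emitter("QBE", 4.0, 5.0),
]


def total_seconds(e, source_mb, interp_seconds):
    """Compile cost plus the time the program would take once compiled."""
    compile_s = source_mb / e.compile_mb_per_s
    return compile_s + interp_seconds / e.speedup_vs_vm3


def best(source_mb, interp_seconds):
    """Emitter with the lowest total time for this workload."""
    return min(EMITTERS, key=lambda e: total_seconds(e, source_mb, interp_seconds))


# interp_seconds = how long the program would run under vm3 alone.
for interp_seconds in (0.01, 0.1, 10.0):
    print(interp_seconds, best(0.4, interp_seconds).name)
```

Under these assumed numbers the interpreter wins for a 10 ms run, copy-and-patch for a 100 ms run, and QBE for a 10 s run, which matches the qualitative split described above. The crossover points move with the inputs, so the model is only useful for reasoning about shape, not for picking thresholds.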

Mixed workloads (typical for Mochi: interactive REPL plus running production handlers):

A tiered strategy wins: ship a baseline JIT for cold code, an optimizing tier for hot code.
But phase 1 of MEP-42 cannot be tiered; we pick one point.
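
For reference, the tiering idea that phase 1 defers can be sketched as a call counter that swaps in an optimized version once a function becomes hot. This is a simplified illustration, not how any particular VM does it: the class name, the threshold, and the stand-in "optimizer" are all hypothetical, and production VMs also count loop back-edges and can switch tiers mid-function.

```python
class TieredFunction:
    """Runs a baseline version until a call counter crosses a threshold."""

    def __init__(self, baseline, optimize, threshold=1000):
        self.baseline = baseline    # cheap-to-produce implementation
        self.optimize = optimize    # callable that returns the optimized version
        self.threshold = threshold  # hypothetical hotness threshold
        self.calls = 0
        self.optimized = None

    def __call__(self, *args):
        if self.optimized is not None:
            return self.optimized(*args)
        self.calls += 1
        if self.calls >= self.threshold:
            # Pay the expensive compile cost once, only for hot code.
            self.optimized = self.optimize()
        return self.baseline(*args)


def slow_square(x):
    # Stand-in for baseline code: correct but deliberately inefficient.
    return sum(x for _ in range(x))


f = TieredFunction(slow_square, lambda: (lambda x: x * x), threshold=3)
results = [f(4) for _ in range(5)]
print(results, f.calls)  # [16, 16, 16, 16, 16] 3
```

Cold functions never trigger the optimizer, so they pay only the baseline cost; hot functions pay the optimizing cost once and amortize it over later calls.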

§4 Status (May 2026)

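Nearly every major dynamic-language VM and managed runtime is tiered (V8, JSC, SpiderMonkey, HotSpot, OpenJ9, .NET CoreCLR). BEAM and Wasmtime are the usual counterexamples: BEAM's JIT compiles modules once at load time, and Wasmtime uses a single compiler per configuration, without tier-up.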
Recent research (Lesbre/Lemerre PLDI 2024) shows even baseline JITs can be 30% faster with cheap abstract interpretation.
For static languages with AOT-only output (Go, Rust, OCaml), the trade-off is largely one-sided: runtime performance dominates because compile time is amortized over many runs of the shipped binary.
Mochi sits between these worlds. The vm3 interpreter exists, so we have a runtime tier; MEP-42 is asking us to add a compile-to-native tier.

§5 Engineering cost for Mochi

Back-of-envelope budget from MEP-17 and MEP-23:

vm3 interpreter executes Mochi bench programs at roughly the speed of CPython 3.13 (within 1.5x, based on MEP-17 numbers).
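Copy-and-patch JIT for CPython yielded 2-9% over the interpreter. Mochi would likely see a similar or slightly larger whole-program gain (let's call it 5-15%); the ~2-3x figure in §2 applies to hot loops only.
A per-opcode template JIT (Sparkplug-style) would likely yield a 30-50% improvement. This extrapolates from V8's measured 5-15% gain over an optimized JS interpreter that is much faster than vm3, so a template JIT has more interpreter overhead to remove in Mochi.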
A QBE-via-textual-SSA emission would likely yield 2-5x improvement on numeric workloads.
An LLVM/MLIR-based optimizing tier would yield 5-10x on hot loops.

Compile-time budget: MEP-23 set an informal target of "compile a 10,000-line Mochi program in under 1 second on a 2024 laptop." This rules out LLVM/MLIR for phase 1. Copy-and-patch and a template JIT fit comfortably; QBE fits, but tightly; chibicc-style shell-to-cc exceeds the budget (cc on 10k LOC is closer to 2-5 seconds).

Recommendation by workload:

For Mochi REPL + interactive use: copy-and-patch (fastest compile, decent run).
For Mochi scripts: template JIT (compile once per run, decent code).
For Mochi long-running services: QBE (slower compile, code roughly 5x faster than the vm3 interpreter per §2).
For Mochi AOT binaries we ship to users: chibicc-style + cc -O2. Slow compile, fastest code that doesn't require LLVM.

§6 Mochi adaptation note

runtime/vm3/ is the existing baseline (no compile cost; interpreter speed).
A phase-1 naive emitter should target the fastest-compiling point on the curve. That argues for copy-and-patch.
A phase-2 optimizing emitter should aim for QBE-equivalent quality.
The vm3 interpreter remains the always-available fallback for unsupported ops or platforms.
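
The fallback rule in the last point can be sketched as a per-function backend selection: try the native emitter and hand the function to vm3 if any op is not covered. This is a hypothetical shape only; the op names, the exception, and both function names are invented for illustration and do not reflect existing Mochi code.

```python
# Hypothetical phase-1 op coverage; the real set would come from the emitter.
SUPPORTED_OPS = {"LOAD_CONST", "ADD", "RETURN"}


class UnsupportedOp(Exception):
    """Raised by the (stand-in) native emitter for ops it does not cover."""


def try_compile_native(ops):
    """Stand-in for the native emitter: rejects any op outside its coverage."""
    for op in ops:
        if op not in SUPPORTED_OPS:
            raise UnsupportedOp(op)
    return ("native", tuple(ops))


def select_backend(ops):
    """Prefer native code; fall back to the interpreter for this function."""
    try:
        return try_compile_native(ops)
    except UnsupportedOp:
        return ("vm3", tuple(ops))


print(select_backend(["LOAD_CONST", "ADD", "RETURN"])[0])  # native
print(select_backend(["LOAD_CONST", "CALL_FFI"])[0])       # vm3
```

Deciding per function rather than per program means one unsupported op does not push the whole program back onto the interpreter, which lets emitter coverage grow incrementally.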

§7 Open questions for MEP-42

What is the right phase-1 point on the curve? My recommendation (see naive/00_naive_summary.md): copy-and-patch.
Do we want to support multiple emitters simultaneously, selected per file or per function? (Yes, eventually.)
How do we measure "naive code is good enough"? What is the Mochi MEP-42 acceptance benchmark?
Should we bench against CPython 3.13 (interpreter), CPython 3.13 with JIT, Go, and Rust on Mochi corpus equivalents?
Tiering: phase 1 is single-tier. When do we add a second tier?

§8 References

Sparkplug numbers: https://v8.dev/blog/sparkplug.
Liftoff numbers: https://v8.dev/blog/liftoff.
Copy-and-Patch paper: https://fredrikbk.com/publications/copy-and-patch.pdf.
CPython 3.13 JIT report (Brandt Bucher): https://peps.python.org/pep-0744/.
Cranelift/Wasmtime 2023 retrospective: https://bytecodealliance.org/articles/wasmtime-and-cranelift-in-2023.
QBE design notes: https://c9x.me/compile/.
Pizlo on JSC speculation: https://webkit.org/blog/10308/speculation-in-javascriptcore/.
Lesbre/Lemerre baseline-JIT-with-AI evaluation: https://inria.hal.science/hal-05407834v1/document.
Titzer "Whose Baseline Compiler Is It Anyway?" CGO 2024: https://arxiv.org/pdf/2305.13241.