LLVM as a Code-Generation Backend for Mochi

The workhorse SSA infrastructure, version 20 era, evaluated for MEP-42.

§1 Provenance

§2 Mechanism

LLVM consumes a typed, SSA, three-address intermediate representation (LLVM IR) and lowers it to machine code through a long pipeline: IR-level transforms (mem2reg, inlining, GVN, LICM, loop vectorization), instruction selection through the SelectionDAG or GlobalISel framework, register allocation (greedy by default, with PBQP available as an alternative allocator rather than a fallback), prologue/epilogue insertion, and target-specific assembly emission. Output can be assembly text, an object file (via the integrated assembler), or in-memory machine code via OrcJIT (or the legacy MCJIT).

IR is dual-encoded: human-readable .ll text and compact .bc bitcode. Both round-trip losslessly. A typical front end that uses LLVM as an external tool writes .ll or .bc, hands it to opt -O2 and then llc, and links with the system linker (ld, lld, link.exe). For JIT, OrcJIT v2 (ORCv2) is the supported in-process JIT API.
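The opt, llc, and link sequence described above can be sketched as three command lines. This is a minimal sketch, not Mochi's build code: the function name pipelineCommands and the file-naming scheme are hypothetical, and it assumes opt, llc, and a cc driver are on PATH. It only constructs the argument vectors, so it runs without LLVM installed.

```go
package main

import (
	"fmt"
	"strings"
)

// pipelineCommands returns the argv for each stage of an AOT build of
// <base>.ll: optimize, generate an object file, then link.
// The function name and file naming are illustrative assumptions.
func pipelineCommands(base, triple string) [][]string {
	return [][]string{
		// New-pass-manager spelling of the -O2 pipeline; writes bitcode.
		{"opt", "-passes=default<O2>", base + ".ll", "-o", base + ".bc"},
		// -filetype=obj makes llc write an object file instead of assembly.
		{"llc", "-O2", "-mtriple=" + triple, "-filetype=obj", base + ".bc", "-o", base + ".o"},
		// The C compiler driver is used here only to invoke the system linker.
		{"cc", base + ".o", "-o", base},
	}
}

func main() {
	for _, argv := range pipelineCommands("hello", "x86_64-unknown-linux-gnu") {
		fmt.Println(strings.Join(argv, " "))
	}
}
```

Each slice can be passed to exec.Command(argv[0], argv[1:]...) as is. When pasting the printed lines into a shell, the -passes argument needs quoting because of the angle brackets.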

MLIR (covered separately in 09_mlir.md) sits above LLVM IR and can lower to it; LLVM IR remains the lingua franca for object emission.

§3 Target coverage (May 2026)

LLVM 20 has stable, production backends for:

  • x86_64 SysV (Linux, macOS, FreeBSD, the BSDs) and x86_64 Win64.
  • AArch64 on macOS (Apple Silicon), Linux, Windows on ARM.
  • RISC-V RV32I/RV64GC including vector extensions, stable for years.
  • WebAssembly (wasm32 and wasm64) with full SIMD and reference-types support.
  • Arm32, PowerPC LE, MIPS (in maintenance), SystemZ, SPARC, Hexagon, AMDGPU, NVPTX, BPF.

Object formats: ELF, Mach-O, COFF/PE, XCOFF, Wasm. DWARF 5 is the default debug format on non-Windows targets; CodeView on Windows.

Backend status changes in 20.x: LoongArch is now stable, M68k remains experimental, and the Xtensa backend is in mainline. AVX10 support was rewritten after Intel’s mid-2025 spec change (https://www.phoronix.com/linux/LLVM).

§4 Production / language adoption status (May 2026)

LLVM IR is the de facto target for new statically typed languages: Rust (primary backend), Swift, Julia, Crystal, Nim (optional alongside its C backend), Pony, Odin, Mojo (via MLIR, then LLVM), and Zig (for release builds, while Debug now defaults to Zig’s self-hosted x86_64 backend, https://ziglang.org/devlog/2025/). It also remains the code generator behind clang, flang, and most production C/C++ toolchains.

LLVM cuts a major release every six months. Active maintainership is healthy with hundreds of corporate and academic contributors (Apple, Google, Intel, AMD, ARM, Sony, AdaCore, Nvidia, Modular).

Performance: LLVM at -O2/-O3 is the reference point against which every other backend is compared. The tradeoff is well known: very long compile times and a multi-hundred-MB toolchain.

§5 Engineering cost for Mochi

  • Binary footprint: A static libLLVM-20.a is roughly 1.2 GB unstripped; the shared .so/.dylib ships at 100-200 MB depending on enabled targets. A minimal AArch64+x86_64+RISC-V build can be trimmed to 60-80 MB. Cranelift, MIR, and QBE are 10x to 100x smaller.
  • Build complexity: For a Go-hosted compiler, the two realistic integration paths are:
    1. cgo with go-llvm (https://pkg.go.dev/tinygo.org/x/go-llvm): use TinyGo’s actively maintained C-API binding. Build tags select LLVM 14 through 20. Requires libLLVM installed on the build machine and cross-compilation toolchains for each target. Single-maintainer concern per https://discourse.llvm.org/t/go-llvm-bindings-choice-for-language-development/84592.
    2. Pure-Go llir/llvm (https://github.com/llir/llvm) emits LLVM IR text only; you then shell out to llc or opt. No cgo, no native build. The cost is an external llc dependency whenever the Mochi compiler runs, which is the same shape as the Zig 0.16 plan (“emit bitcode, let user run llc separately,” https://github.com/ziglang/zig/issues/16270).
  • License: Apache 2.0 with LLVM Exceptions, compatible with Mochi’s permissive license.
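  • Cross-compilation: First-class for code generation. A single LLVM build can emit objects for every enabled target triple, with no per-host compiler; linking still needs a linker and system libraries (a sysroot) for each target.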
  • Debugging: Full DWARF 5 and CodeView, source-level debugging in gdb/lldb/WinDbg.
  • Runtime startup cost: The AOT path has zero runtime compilation cost. The JIT path pays multi-millisecond startup per OrcJIT engine plus ~50-500ms per function compiled at -O0, and seconds at -O2.
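Because one LLVM build covers many targets, a Go-hosted driver mostly needs to translate its own OS/architecture naming into LLVM target triples. The sketch below shows one way to do that. The helper name tripleFor and the choice of which pairs to include are assumptions for illustration; the triples themselves are common spellings accepted by llc's -mtriple option.

```go
package main

import "fmt"

// triples maps Go-style "GOOS/GOARCH" pairs to LLVM target triples.
// The set of entries is an illustrative assumption, not Mochi's support matrix.
var triples = map[string]string{
	"linux/amd64":   "x86_64-unknown-linux-gnu",
	"linux/arm64":   "aarch64-unknown-linux-gnu",
	"linux/riscv64": "riscv64-unknown-linux-gnu",
	"darwin/amd64":  "x86_64-apple-darwin",
	"darwin/arm64":  "arm64-apple-darwin",
	"windows/amd64": "x86_64-pc-windows-msvc",
	"windows/arm64": "aarch64-pc-windows-msvc",
}

// tripleFor returns the LLVM triple for a GOOS/GOARCH pair, or an error
// when the pair is not in the table.
func tripleFor(goos, goarch string) (string, error) {
	t, ok := triples[goos+"/"+goarch]
	if !ok {
		return "", fmt.Errorf("no LLVM triple known for %s/%s", goos, goarch)
	}
	return t, nil
}

func main() {
	t, err := tripleFor("linux", "riscv64")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(t)
}
```

The resulting string would be passed as -mtriple=<triple> to llc; a real driver would also need to pick a matching linker and sysroot per target.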

§6 Mochi adaptation note

The vm3 register file (three banks of typed registers) and the typed arena allocator in /Users/apple/github/mochilang/mochi/runtime/vm3/arenas.go map cleanly to LLVM IR: each typed register becomes an SSA value, each arena Cell becomes either an i64 or a {ptr, i64} aggregate (LLVM 20 has only opaque pointers, so the older i8* spelling no longer exists), and the existing op dispatch in /Users/apple/github/mochilang/mochi/runtime/vm3/op.go becomes a per-op IR template. The compiler3 IR under /Users/apple/github/mochilang/mochi/compiler3/ir is already SSA-flavored, so the lowering pass would live next to /Users/apple/github/mochilang/mochi/compiler3/emit. For Phase 1 the simplest route is to emit textual .ll from a new compiler3/emit/llvmir package and shell out to the system llc; this avoids cgo and matches the Zig direction of travel.
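To make the "per-op IR template" idea concrete, here is a minimal sketch of a textual .ll emitter. The Op type and emitFunc are hypothetical stand-ins: the real compiler3 IR and vm3 op set are not shown in this note, so the sketch assumes a tiny three-address form over i64 registers in which each destination register is assigned exactly once.

```go
package main

import (
	"fmt"
	"strings"
)

// Op is a hypothetical three-address instruction over i64 virtual registers.
// It stands in for the real compiler3 IR, whose shape is not shown here.
type Op struct {
	Kind string // "add", "sub", or "mul"
	Dst  int    // destination register, assigned exactly once (SSA)
	A, B int    // operand registers
}

// emitFunc lowers ops to a textual LLVM IR function. Registers 0..nparams-1
// are the parameters, and the function returns register ret.
func emitFunc(name string, nparams int, ops []Op, ret int) (string, error) {
	var b strings.Builder
	params := make([]string, nparams)
	for i := range params {
		params[i] = fmt.Sprintf("i64 %%r%d", i)
	}
	fmt.Fprintf(&b, "define i64 @%s(%s) {\nentry:\n", name, strings.Join(params, ", "))
	for _, op := range ops {
		switch op.Kind {
		case "add", "sub", "mul":
			// One template per op: the op kind is also the LLVM opcode.
			fmt.Fprintf(&b, "  %%r%d = %s i64 %%r%d, %%r%d\n", op.Dst, op.Kind, op.A, op.B)
		default:
			return "", fmt.Errorf("unsupported op %q", op.Kind)
		}
	}
	fmt.Fprintf(&b, "  ret i64 %%r%d\n}\n", ret)
	return b.String(), nil
}

func main() {
	// Computes r0*r1 + r2.
	ir, err := emitFunc("madd", 3, []Op{
		{Kind: "mul", Dst: 3, A: 0, B: 1},
		{Kind: "add", Dst: 4, A: 3, B: 2},
	}, 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(ir)
}
```

Named values such as %r3 avoid LLVM's sequential-numbering rule for unnamed temporaries, which keeps the emitter simple. Control flow, the other register banks, and arena cells would need phi nodes or allocas plus mem2reg, which this sketch does not attempt.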

§7 Open questions for MEP-42

  • Cgo or not? Adopting go-llvm introduces a C toolchain dependency on every Mochi build host, which conflicts with Mochi’s current “pure Go, no native deps” stance (/Users/apple/github/mochilang/mochi has no cgo today outside runtime/tcc/Makefile). Emitting .ll text and shelling out to llc keeps the Mochi binary portable but makes native compilation depend on an LLVM installed on the user's machine.
  • Which LLVM version to pin? Pin to the LLVM 20.1.x release series for Phase 1 (LLVM has no LTS releases, so this means tracking 20.1.x point releases); track LLVM 21/22 once the AVX10.2 churn settles.
  • JIT or AOT first? AOT via llc is the obvious Phase 1; OrcJIT in-process needs cgo and balloons the runtime to hundreds of MB, conflicting with the vm3 minimalist design.
  • Where does this leave vm3? Even with LLVM AOT, the vm3 interpreter stays as the development/REPL backend. This loosely mirrors Julia, which ships both an interpreter and an LLVM JIT, although Julia JIT-compiles most code by default rather than only hot code.
  • MLIR or straight LLVM IR? MLIR adds another major dependency and another learning curve; for a simple imperative language like Mochi the direct IR route is enough until GPU or accelerator targets become a goal.
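If Phase 1 shells out to an installed llc and pins a major version, the driver has to check what it found. The sketch below parses the "LLVM version X.Y.Z" line that llc --version prints and compares the major number against the pin. The names llvmMajor and checkPinned are hypothetical, and the sample string in main imitates typical output rather than being captured from a real run.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// llvmVersionRE matches the "LLVM version 20.1.4" line in `llc --version`
// output, including vendor-prefixed forms such as "Ubuntu LLVM version ...".
var llvmVersionRE = regexp.MustCompile(`LLVM version (\d+)\.`)

// llvmMajor extracts the major version number from `llc --version` output.
func llvmMajor(versionOutput string) (int, error) {
	m := llvmVersionRE.FindStringSubmatch(versionOutput)
	if m == nil {
		return 0, fmt.Errorf("no LLVM version found in output")
	}
	return strconv.Atoi(m[1])
}

// checkPinned reports an error unless the output names the wanted major version.
func checkPinned(versionOutput string, want int) error {
	got, err := llvmMajor(versionOutput)
	if err != nil {
		return err
	}
	if got != want {
		return fmt.Errorf("found LLVM %d, but this build is pinned to LLVM %d", got, want)
	}
	return nil
}

func main() {
	// Sample text imitating `llc --version`; a real driver would capture it
	// with exec.Command("llc", "--version").Output().
	sample := "LLVM (http://llvm.org/):\n  LLVM version 20.1.4\n  Optimized build.\n"
	if err := checkPinned(sample, 20); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("llc matches the pinned major version")
}
```

Checking only the major version tolerates 20.1.x point releases, which matches the pinning question above. A stricter policy would also compare the minor and patch numbers.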