MLIR as a Code-Generation Backend for Mochi

§1 Provenance

Project home: https://mlir.llvm.org/
Source: https://github.com/llvm/llvm-project/tree/main/mlir
Users page: https://mlir.llvm.org/users/
Original paper: Chris Lattner et al., "MLIR: Scaling Compiler Infrastructure for Domain Specific Computation," CGO 2021.
Mojo MLIR docs: https://docs.modular.com/mojo/notebooks/BoolMLIR/
Mojo vision: https://docs.modular.com/mojo/vision/
Awesome-MLIR/Mojo list: https://github.com/coderonion/awesome-mojo-max-mlir
Mojo (programming language) on Wikipedia: https://en.wikipedia.org/wiki/Mojo_(programming_language)
Mojo HPC paper (SC25 workshops): https://arxiv.org/pdf/2509.21039

§2 Mechanism

MLIR is an IR framework rather than a single IR. It provides:

A common SSA infrastructure (Operations, Regions, Blocks, Values, Types, Attributes).
A dialect mechanism: each dialect defines its own ops, types, and attributes; many dialects can coexist in a single module.
Standard dialects: func, arith, scf (structured control flow), cf, memref, tensor, vector, affine, linalg, gpu, llvm (the LLVM IR dialect), and many more.
Pattern-based rewriting and progressive lowering: a frontend emits ops in a high-level dialect, then conversion passes lower them step-by-step through standard dialects, ultimately to the llvm dialect, which translates to LLVM IR for backend code generation.
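
The frontend half of that flow can be sketched in Go as a text emitter. This is a minimal illustration, not Mochi's actual code: the binaryFunc type and emitModule function are hypothetical names, and the sketch assumes the frontend only needs two-argument i64 functions. It targets the func and arith dialects in MLIR's textual format and leaves all lowering to later passes.

```go
package main

import (
	"fmt"
	"strings"
)

// binaryFunc is a hypothetical frontend-side description of a function
// that applies one integer op to its two i64 parameters.
type binaryFunc struct {
	Name string // symbol name, e.g. "add"
	Op   string // arith dialect op name, e.g. "arith.addi"
}

// emitModule renders the functions as MLIR textual IR in the func and
// arith dialects. Lowering to the llvm dialect is left to conversion passes.
func emitModule(funcs []binaryFunc) string {
	var b strings.Builder
	b.WriteString("module {\n")
	for _, f := range funcs {
		fmt.Fprintf(&b, "  func.func @%s(%%a: i64, %%b: i64) -> i64 {\n", f.Name)
		fmt.Fprintf(&b, "    %%r = %s %%a, %%b : i64\n", f.Op)
		b.WriteString("    func.return %r : i64\n")
		b.WriteString("  }\n")
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	fmt.Print(emitModule([]binaryFunc{
		{Name: "add", Op: "arith.addi"},
		{Name: "mul", Op: "arith.muli"},
	}))
}
```

The emitted module stays at the func/arith level on purpose: the point of progressive lowering is that the frontend does not need to know how those ops eventually become LLVM IR.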

The "code generation" itself still goes through LLVM (or, for accelerators, SPIR-V or PTX). MLIR's value is in the high-level optimization passes (loop fusion, tiling, vectorization at the affine level) before LLVM ever sees the code.

For Mochi's purposes, MLIR is best understood as LLVM plus extra layers: it inherits LLVM's build complexity, binary size, and target coverage, then adds its own on top.

§3 Target coverage (May 2026)

MLIR itself targets:

All LLVM targets via the llvm dialect (x86_64, AArch64, RISC-V, Wasm, PowerPC, etc.).
GPU dialects lower to NVPTX, AMDGPU, and SPIR-V.
Wasm dialect: proposed, in flight as of CGO 2025 (per https://mlir.llvm.org/users/).
Custom hardware: TPU, Apple Neural Engine, AWS Trainium/Inferentia, custom NPUs via dialect-specific backends.

Object formats: inherit from LLVM (ELF, Mach-O, COFF, Wasm).

What is stable as of LLVM 20: the core infrastructure, the standard dialects, and the lowering to the llvm dialect. What is in flux: GPU async dialects, the transform dialect (in tree since 2022 and described in a CGO 2025 paper, but still evolving), the Wasm dialect, and the Clang CIR integration (a new C/C++ frontend that lowers via MLIR).

§4 Production / language adoption status (May 2026)

Mojo (Modular): the highest-profile MLIR consumer. Mojo is described as "syntactic sugar for MLIR." Mojo's compiler remains closed-source as of May 2026; Modular has committed to open-sourcing it in fall 2026. The Mojo standard library is already open source.
Modular MAX: production AI graph compiler that hosts Mojo kernels.
IREE (https://iree.dev): runtime for ML models, lowers through MLIR.
OpenXLA: TensorFlow/JAX/PyTorch shared compiler stack, MLIR-based, backed by NVIDIA, AMD, Intel, Apple, AWS.
CIRCT: hardware description toolchain (Verilog generation).
Clang CIR: an experimental Clang IR dialect, lowering C/C++ via MLIR.
Flang: the Fortran frontend uses MLIR (HLFIR → FIR → llvm dialect → LLVM IR).
Polygeist: C-to-MLIR (research).
Polymage / Tiramisu descendants for polyhedral compilation.

Maintainership is healthy with significant Google and Modular backing. Release cadence matches LLVM's six-month train.

License: Apache 2.0 with LLVM Exceptions.

§5 Engineering cost for Mochi

Binary footprint: MLIR adds tens of MB on top of LLVM. A typical "MLIR + LLVM" binary is 150-300 MB.
Build complexity: same as LLVM, plus another sub-project (-DLLVM_ENABLE_PROJECTS=mlir). There is no official Go binding. Integration is either via cgo (against MLIR's C API or a custom C++ wrapper) or by emitting MLIR text and shelling out to mlir-opt | mlir-translate | llc.
License: Apache 2.0 with LLVM Exceptions.
Cross-compilation: same as LLVM (a single build with all backends enabled can target every supported triple).
Debugging: full DWARF via LLVM; MLIR also has location attributes that flow through lowering.
Runtime startup: hundreds of ms to construct the MLIR context, similar to LLVM.
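
The shell-out integration path mentioned under build complexity could look roughly like the Go sketch below. It is an illustration under stated assumptions: the stage type and the loweringStages, describe, and runPipeline functions are hypothetical names, and the pass flags shown (--convert-arith-to-llvm, --convert-func-to-llvm, --reconcile-unrealized-casts) are the ones that would suit a module using only func and arith. Pass names have changed between LLVM releases, so they should be checked against the installed toolchain.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// stage is one external tool invocation in the lowering pipeline.
type stage struct {
	name string
	args []string
}

// loweringStages returns the assumed tool chain for a func/arith module:
// lower to the llvm dialect, translate to LLVM IR, then compile to an object.
func loweringStages(objPath string) []stage {
	return []stage{
		{"mlir-opt", []string{"--convert-arith-to-llvm", "--convert-func-to-llvm", "--reconcile-unrealized-casts"}},
		{"mlir-translate", []string{"--mlir-to-llvmir"}},
		{"llc", []string{"-filetype=obj", "-o", objPath}},
	}
}

// describe renders the stages as the equivalent shell pipeline, for logging.
func describe(stages []stage) string {
	parts := make([]string, 0, len(stages))
	for _, s := range stages {
		parts = append(parts, strings.TrimSpace(s.name+" "+strings.Join(s.args, " ")))
	}
	return strings.Join(parts, " | ")
}

// runPipeline feeds input to the first stage on stdin and passes each
// stage's stdout to the next one. Stages run one after another, not concurrently.
func runPipeline(input []byte, stages []stage) error {
	data := input
	for _, s := range stages {
		cmd := exec.Command(s.name, s.args...)
		cmd.Stdin = bytes.NewReader(data)
		var out, errBuf bytes.Buffer
		cmd.Stdout = &out
		cmd.Stderr = &errBuf
		if err := cmd.Run(); err != nil {
			return fmt.Errorf("%s failed: %w: %s", s.name, err, errBuf.String())
		}
		data = out.Bytes()
	}
	return nil
}

func main() {
	// Only print the pipeline here; calling runPipeline requires the LLVM/MLIR
	// tools to be installed and on PATH.
	fmt.Println(describe(loweringStages("out.o")))
}
```

Shelling out avoids linking MLIR into the Mochi binary, which sidesteps the footprint cost above, at the price of requiring a matching LLVM/MLIR toolchain on the user's machine.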

For a small imperative language like Mochi, MLIR is massive overkill. MLIR's value emerges when you have many lowering levels (e.g., tensor programs → linalg → affine → scf → llvm). Mochi's compiler3 IR is already low-level; lowering it through MLIR adds layers without unlocking new optimizations.

§6 Mochi adaptation note

Mochi's compiler3 IR (/Users/apple/github/mochilang/mochi/compiler3/ir) would skip most of MLIR's standard dialects and lower almost directly into the llvm dialect. At that point we have all the cost of MLIR with none of the benefit; we might as well emit LLVM IR text directly (see 01_llvm.md).
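
To make the "cost without benefit" point concrete, the Go sketch below emits the same trivial function twice: once as llvm dialect text and once as plain LLVM IR text. The lowFunc type and both emitters are hypothetical and do not reflect the real compiler3 IR; the sketch assumes a function that applies one integer op (add, sub, or mul) to two i64 parameters.

```go
package main

import "fmt"

// lowFunc is a hypothetical stand-in for an already low-level IR function:
// one integer op applied to two i64 parameters.
type lowFunc struct {
	Name string // symbol name
	Op   string // "add", "sub", or "mul" (same mnemonic in both outputs)
}

// emitLLVMDialect renders the function in MLIR's llvm dialect. This text
// still has to pass through mlir-translate before LLVM can consume it.
func emitLLVMDialect(f lowFunc) string {
	return fmt.Sprintf(
		"llvm.func @%s(%%a: i64, %%b: i64) -> i64 {\n"+
			"  %%r = llvm.%s %%a, %%b : i64\n"+
			"  llvm.return %%r : i64\n"+
			"}\n", f.Name, f.Op)
}

// emitLLVMIR renders the same function directly as LLVM IR text.
func emitLLVMIR(f lowFunc) string {
	return fmt.Sprintf(
		"define i64 @%s(i64 %%a, i64 %%b) {\n"+
			"  %%r = %s i64 %%a, %%b\n"+
			"  ret i64 %%r\n"+
			"}\n", f.Name, f.Op)
}

func main() {
	f := lowFunc{Name: "add", Op: "add"}
	fmt.Print(emitLLVMDialect(f))
	fmt.Print(emitLLVMIR(f))
}
```

The two emitters are nearly line-for-line identical, which is the argument in miniature: when the source IR is already this low-level, the llvm dialect is an extra translation hop rather than an extra optimization opportunity.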

The only Mochi scenarios where MLIR earns its weight:

Mochi grows a tensor/array dialect for ML workloads. Mochi's current runtime/vector and runtime/llm modules hint at this direction.
Mochi wants to target GPUs or custom accelerators.
Mochi adopts polyhedral or vectorization passes that need affine and linalg.

None of these is a Phase 1 concern.

§7 Open questions for MEP-42

Verdict for Phase 1: skip. MLIR's cost is justified only if Mochi targets heterogeneous compute.
Long-term hedge: if Mochi grows tensor support, MLIR becomes the obvious choice for that subsystem; the rest of the language can keep emitting plain LLVM IR.
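Closed-source Mojo precedent: Modular has shown that you can build a serious systems language as an MLIR frontend. The downside is two-axis evolution: although MLIR ships in the LLVM monorepo on the same release train, Mochi would have to track API and IR churn in both LLVM and the MLIR dialects it uses.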
Dialect maintenance: a Mochi dialect would need a TableGen description and ongoing conformance with MLIR core. Not free.
Wasm dialect timing: if the Wasm dialect lands stable in LLVM 22 or 23, MLIR becomes a single answer for "all of Mochi's targets" (LLVM CPU triples + Wasm). Watch this.