PLDI 2024 and 2025: Relevant Code-Generation Papers

§1 Provenance

PLDI 2024: 45th ACM SIGPLAN Conference on Programming Language Design and Implementation, Copenhagen, Denmark, June 24-28, 2024. Proceedings in PACMPL vol 8, issue PLDI: https://dl.acm.org/toc/pacmpl/2024/8/PLDI.
PLDI 2025: 46th edition, Seoul, South Korea, June 16-20, 2025. Proceedings in PACMPL vol 9, issue PLDI: https://dl.acm.org/toc/pacmpl/2025/9/PLDI.
Conference pages: https://pldi24.sigplan.org/ and https://pldi25.sigplan.org/.

§2 Technique / contribution

Summarized below are six papers from PLDI 2024-2025 most relevant to MEP-42's "naive native codegen" charter.

1. Lesbre & Lemerre, "Compiling with Abstract Interpretation" (PLDI 2024)

Authors: Dorian Lesbre, Matthieu Lemerre (CEA LIST, France).
DOI: 10.1145/3656447 (PACMPL vol 8, PLDI).
Idea: Run an abstract interpreter at compile time over the bytecode being compiled, capturing constant facts, value ranges, and register-availability invariants. Use these facts to specialize the emitted machine code without building a full optimizing IR.
Result: Reduces emitted-code size by 12% (geomean) and improves execution speed by up to 30%, without adding compile-time overhead relative to a direct template translator.
Follow-up "Are Abstract-interpreter Baseline JITs Worth it?" (HAL preprint at https://inria.hal.science/hal-05407834v1/document) implemented several baseline-JIT variants for the Pharo Smalltalk VM and confirmed the technique pays off in a production-style setting.
Mochi relevance: a "naive baseline + tiny abstract interpreter" combination could give Mochi noticeably better codegen than raw templates at very low compile-time cost.
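
To make the idea concrete, here is a minimal Go sketch of a template emitter that carries a tiny abstract stack (a flat constant lattice) while it walks the bytecode. It is an illustration under stated assumptions, not the paper's implementation: the opcodes (PushConst, LoadLocal, Add, Ret), the pseudo-assembly strings (add_imm, ret_imm) and the Compile function are all hypothetical and do not correspond to Mochi's vm3 or compiler3 code. It also assumes well-formed bytecode.

```go
package main

import "fmt"

// Op is a hypothetical stack-bytecode opcode, not Mochi's real vm3 encoding.
type Op int

const (
	PushConst Op = iota // push the constant Arg
	LoadLocal           // push local slot Arg (value unknown at compile time)
	Add                 // pop two values, push their sum
	Ret                 // pop one value and return it
)

type Instr struct {
	Op  Op
	Arg int
}

// AbsVal is a flat constant lattice: a known constant, or unknown (Top).
type AbsVal struct {
	Known bool
	Val   int
}

// Compile makes one pass over the bytecode, tracking an abstract stack next
// to the emitter. Known constants are never pushed at run time, so the
// emitted pseudo-code only manipulates values that are actually unknown.
// Assumes well-formed bytecode (no stack underflow).
func Compile(code []Instr) []string {
	var out []string
	var stack []AbsVal
	pop := func() AbsVal {
		v := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		return v
	}
	for _, in := range code {
		switch in.Op {
		case PushConst:
			// Record the fact; emit nothing yet.
			stack = append(stack, AbsVal{Known: true, Val: in.Arg})
		case LoadLocal:
			out = append(out, fmt.Sprintf("push local[%d]", in.Arg))
			stack = append(stack, AbsVal{})
		case Add:
			b, a := pop(), pop()
			switch {
			case a.Known && b.Known: // fold at compile time
				stack = append(stack, AbsVal{Known: true, Val: a.Val + b.Val})
			case a.Known: // addition commutes, so use the immediate form
				out = append(out, fmt.Sprintf("add_imm %d", a.Val))
				stack = append(stack, AbsVal{})
			case b.Known:
				out = append(out, fmt.Sprintf("add_imm %d", b.Val))
				stack = append(stack, AbsVal{})
			default: // nothing known: fall back to the plain template
				out = append(out, "add")
				stack = append(stack, AbsVal{})
			}
		case Ret:
			if v := pop(); v.Known {
				out = append(out, fmt.Sprintf("ret_imm %d", v.Val))
			} else {
				out = append(out, "ret")
			}
		}
	}
	return out
}

func main() {
	// Bytecode for (2 + 3) + local[0].
	prog := []Instr{{PushConst, 2}, {PushConst, 3}, {Add, 0}, {LoadLocal, 0}, {Add, 0}, {Ret, 0}}
	for _, line := range Compile(prog) {
		fmt.Println(line)
	}
}
```

For the six-instruction input in main, this sketch emits three pseudo-instructions (push local[0], add_imm 5, ret), where a one-template-per-opcode translator would emit six. A range or type lattice would slot into AbsVal the same way.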

2. Albert et al., "Superoptimization of Stack-Bytecode via Greedy, Constraint-Based, and SAT Techniques" (PLDI 2024)

Authors: Elvira Albert, Pablo Gordillo, et al. (Complutense University of Madrid).
DOI: 10.1145/3656435 (PACMPL vol 8, PLDI).
System name: SuperStack.
Idea: Superoptimize stack-bytecode (WebAssembly, EVM) by combining a greedy presolver, constraint-based reasoning, and SAT search. The greedy phase tightens the bound on the length of any equivalent optimal sequence; constraint and SAT phases search within that bound.
Result: ~4x reduction in optimization time on 500,000 sample sequences, with greatly increased optimization gains vs prior superoptimizers.
Mochi relevance: Mochi vm3 bytecode is stack-tagged in places. SuperStack's algorithm could mine canonical replacements for common Mochi op sequences as a build-time peephole pass. Cheap to integrate.
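
The "search within a length bound" structure can be sketched in Go as below. This is a deliberately simplified stand-in, not SuperStack's algorithm: it replaces the constraint and SAT phases with brute-force enumeration, uses the original sequence length as the bound instead of a greedily tightened one, and checks equivalence only on sample input stacks by running a toy interpreter as the oracle. The opcode set (Dup, Swap, Pop, Add) and all function names are hypothetical.

```go
package main

import "fmt"

// Op is a hypothetical stack opcode; it is not the vm3, Wasm, or EVM set.
type Op byte

const (
	Dup    Op = iota // a -> a a
	Swap             // a b -> b a
	Pop              // a ->
	Add              // a b -> a+b
	numOps           // sentinel: number of opcodes
)

// run interprets seq on a copy of init; ok is false on stack underflow.
func run(seq []Op, init []int) (st []int, ok bool) {
	st = append([]int(nil), init...)
	for _, op := range seq {
		n := len(st)
		if n < 1 || (n < 2 && (op == Swap || op == Add)) {
			return nil, false
		}
		switch op {
		case Dup:
			st = append(st, st[n-1])
		case Swap:
			st[n-1], st[n-2] = st[n-2], st[n-1]
		case Pop:
			st = st[:n-1]
		case Add:
			st = append(st[:n-2], st[n-2]+st[n-1])
		}
	}
	return st, true
}

// equivalent uses the interpreter as a testing oracle: both sequences must
// produce the same final stack (or both underflow) on every sample input.
// This is evidence of equivalence, not a proof.
func equivalent(a, b []Op, tests [][]int) bool {
	for _, t := range tests {
		ra, okA := run(a, t)
		rb, okB := run(b, t)
		if okA != okB || len(ra) != len(rb) {
			return false
		}
		for i := range ra {
			if ra[i] != rb[i] {
				return false
			}
		}
	}
	return true
}

// search fills cand[i:] with every opcode combination, depth first.
func search(cand []Op, i int, target []Op, tests [][]int) bool {
	if i == len(cand) {
		return equivalent(cand, target, tests)
	}
	for op := Op(0); op < numOps; op++ {
		cand[i] = op
		if search(cand, i+1, target, tests) {
			return true
		}
	}
	return false
}

// Superoptimize returns the shortest sequence that agrees with seq on all
// tests, trying lengths 0, 1, ... up to len(seq)-1, or seq if none is found.
func Superoptimize(seq []Op, tests [][]int) []Op {
	for n := 0; n < len(seq); n++ {
		cand := make([]Op, n)
		if search(cand, 0, seq, tests) {
			return cand
		}
	}
	return seq
}

func main() {
	tests := [][]int{{1, 2}, {5, 7, 9}, {-3, 4}}
	orig := []Op{Swap, Dup, Pop, Swap, Dup}
	best := Superoptimize(orig, tests)
	fmt.Printf("original length %d, optimized length %d\n", len(orig), len(best))
}
```

On the sample sequence Swap, Dup, Pop, Swap, Dup the search finds the single-instruction replacement Dup. Pairs found this way would still need a sound equivalence check before being baked into a peephole table.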

3. Anand et al., "Optimistic Stack Allocation and Dynamic Heapification for Managed Runtimes" (PLDI 2024)

Authors: Aditya Anand et al. (IIT Madras).
Idea: Combine static escape analysis with runtime "heapification" hooks. Objects start on the stack; if they would escape via a feature like a captured closure, the runtime moves them to the heap dynamically.
Result: Demonstrated on a managed runtime; non-trivial speedup with bounded runtime overhead.
Mochi relevance: Mochi's arena allocator could benefit from this: stack-promote where possible, fall back to arena allocation when escape is dynamic. Phase 2 or 3 of MEP-42.
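
The "start on the stack, move on escape" behaviour can be simulated in a few lines of Go. This is a toy model under loud assumptions: Go manages its own memory, so the "stack" here is just a per-frame slice, and Frame, Ref, and Heapify are hypothetical names that come from neither the paper nor Mochi's runtime. The point is only to show the forwarding step that keeps old stack references valid after an object has been moved.

```go
package main

import "fmt"

// Object is a stand-in for a runtime object with a single field.
type Object struct{ Val int }

// Frame simulates one activation record. slots is the "stack" storage that
// disappears when the frame is left; moved records forwarding pointers for
// objects that have been heapified.
type Frame struct {
	slots []Object
	moved map[int]*Object
}

// Ref names an object either by frame slot (stack) or by heap pointer.
type Ref struct {
	frame *Frame
	slot  int
	heap  *Object
}

// New allocates optimistically in the frame.
func (f *Frame) New(v int) Ref {
	f.slots = append(f.slots, Object{Val: v})
	return Ref{frame: f, slot: len(f.slots) - 1}
}

// resolve finds the object's current home, following a forwarding pointer
// if the object was heapified after this Ref was created.
func (r Ref) resolve() *Object {
	if r.heap != nil {
		return r.heap
	}
	if h, ok := r.frame.moved[r.slot]; ok {
		return h
	}
	return &r.frame.slots[r.slot]
}

func (r Ref) Load() int   { return r.resolve().Val }
func (r Ref) Store(v int) { r.resolve().Val = v }

// Heapify is the hypothetical runtime hook called when a reference is about
// to escape (for example, when a closure captures it). It copies the object
// to the heap once and leaves a forwarding entry behind.
func Heapify(r Ref) Ref {
	if r.heap != nil {
		return r
	}
	if h, ok := r.frame.moved[r.slot]; ok {
		return Ref{heap: h}
	}
	h := &Object{Val: r.frame.slots[r.slot].Val}
	if r.frame.moved == nil {
		r.frame.moved = make(map[int]*Object)
	}
	r.frame.moved[r.slot] = h
	return Ref{heap: h}
}

// Leave pops the frame; only heapified objects survive.
func (f *Frame) Leave() {
	f.slots = nil
	f.moved = nil
}

func main() {
	f := &Frame{}
	local := f.New(10)        // starts on the simulated stack
	captured := Heapify(local) // escapes: moved to the heap
	local.Store(11)           // old stack ref is forwarded to the heap copy
	f.Leave()
	fmt.Println(captured.Load())
}
```

After the frame is left, the captured reference still reads the value written through the original stack reference, because the write was forwarded to the heap copy. In an arena-based design the heap copy would presumably live in the arena instead.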

4. Bansal, Sharlet, Ragan-Kelley, Amarasinghe, "Lightweight and Locality-Aware Composition of Black-Box Subroutines" (PLDI 2025)

DOI: 10.1145/3729292.
Preprint: https://dspace.mit.edu/bitstream/handle/1721.1/164683/3729292.pdf.
System name: Fern.
Idea: Compose library subroutines (think: dense linear algebra kernels) without a full heavyweight optimizer. Annotate subroutines with data-production/data-consumption patterns; Fern fuses across boundaries using only those annotations.
Result: Matches the performance of manually fused, hand-tuned libraries (Intel oneDNN and others) across multiple domains.
Mochi relevance: a path to "fast Mochi numerics" without committing to LLVM or MLIR. Annotate runtime/vm3 numeric kernels with Fern-style metadata; let a small Mochi-side composer fuse across boundaries.
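
As a rough illustration of annotation-driven fusion, the Go sketch below attaches one piece of metadata to each black-box kernel: a function saying which input range is needed to produce a given output range. A small composer then runs producer and consumer tile by tile, so the intermediate only ever exists as a tile-sized scratch buffer. This is not Fern's annotation language or API; Kernel, Range, Needs, Blur, Scale, Fused, and Unfused are hypothetical names invented for this sketch.

```go
package main

import "fmt"

// Range is a half-open index interval [Lo, Hi).
type Range struct{ Lo, Hi int }

// Kernel is a black-box subroutine plus a hypothetical dependence annotation.
// Run computes len(out) outputs from in, where in covers Needs(output range).
type Kernel struct {
	Run   func(in, out []float64)
	Needs func(out Range) Range
}

// Blur averages neighbouring elements, so it needs one extra input element.
var Blur = Kernel{
	Run: func(in, out []float64) {
		for i := range out {
			out[i] = (in[i] + in[i+1]) / 2
		}
	},
	Needs: func(o Range) Range { return Range{o.Lo, o.Hi + 1} },
}

// Scale is element-wise, so its input range equals its output range.
var Scale = Kernel{
	Run: func(in, out []float64) {
		for i := range out {
			out[i] = 2 * in[i]
		}
	},
	Needs: func(o Range) Range { return o },
}

// Unfused runs the producer p over the whole input, materialising the full
// intermediate array, and then runs the consumer c.
func Unfused(p, c Kernel, input []float64, n int) []float64 {
	mid := c.Needs(Range{0, n})
	in := p.Needs(mid)
	tmp := make([]float64, mid.Hi-mid.Lo)
	p.Run(input[in.Lo:in.Hi], tmp)
	out := make([]float64, n)
	c.Run(tmp, out)
	return out
}

// Fused uses only the Needs annotations to interleave p and c per tile.
// A real composer would reuse one scratch buffer instead of reallocating.
func Fused(p, c Kernel, input []float64, n, tile int) []float64 {
	out := make([]float64, n)
	for lo := 0; lo < n; lo += tile {
		hi := lo + tile
		if hi > n {
			hi = n
		}
		mid := c.Needs(Range{lo, hi}) // intermediate elements this tile needs
		in := p.Needs(mid)            // input elements needed to produce them
		scratch := make([]float64, mid.Hi-mid.Lo)
		p.Run(input[in.Lo:in.Hi], scratch)
		c.Run(scratch, out[lo:hi])
	}
	return out
}

func main() {
	input := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9}
	fmt.Println(Unfused(Blur, Scale, input, 8))
	fmt.Println(Fused(Blur, Scale, input, 8, 3))
}
```

Both calls produce the same values; the fused version never allocates the full intermediate array. When the consumer is itself a stencil (for example Blur after Blur), the per-tile input ranges overlap and some producer work is recomputed, which is the usual trade-off of this style of fusion.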

5. "Type-Constrained Code Generation with Language Models" (PLDI 2025)

Site: https://pldi25.sigplan.org/details/pldi-2025-papers/25/.
Idea: Use LLMs to generate code constrained by static type information. Less relevant to MEP-42's "naive emitter" charter, but worth noting because it changes what "naive" means: in a future world, code generation may be partly LLM-driven and partly template-driven.
Mochi relevance: nothing immediate. Possible MEP-50+ direction.

6. "From Batch to Stream: Automatic Generation of Online Algorithms" (PLDI 2024)

Site: https://pldi24.sigplan.org/details/pldi-2024-papers/42/.
Idea: Compile batch-style algorithms into incremental/online versions automatically.
Mochi relevance: nothing directly for codegen; relevant if Mochi adds reactive or streaming primitives.
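
For readers unfamiliar with the batch/online distinction, the Go sketch below shows a hand-written pair: a batch variance that needs the whole slice, and an online version (Welford's update) that keeps constant-size state and consumes one element at a time. The paper is about deriving the second form automatically; this example is written by hand, does not use or reproduce the paper's tool, and all names in it are illustrative.

```go
package main

import "fmt"

// BatchVariance computes the population variance in two passes over xs.
func BatchVariance(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	mean := 0.0
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	ss := 0.0
	for _, x := range xs {
		ss += (x - mean) * (x - mean)
	}
	return ss / float64(len(xs))
}

// OnlineVariance holds the constant-size state of Welford's algorithm:
// the element count, the running mean, and the sum of squared deviations.
type OnlineVariance struct {
	n    int
	mean float64
	m2   float64
}

// Update folds one new element into the running state.
func (o *OnlineVariance) Update(x float64) {
	o.n++
	delta := x - o.mean
	o.mean += delta / float64(o.n)
	o.m2 += delta * (x - o.mean)
}

// Value returns the population variance of everything seen so far.
func (o *OnlineVariance) Value() float64 {
	if o.n == 0 {
		return 0
	}
	return o.m2 / float64(o.n)
}

func main() {
	xs := []float64{2, 4, 4, 4, 5, 5, 7, 9}
	var ov OnlineVariance
	for _, x := range xs {
		ov.Update(x)
	}
	fmt.Println(BatchVariance(xs), ov.Value())
}
```

The two results agree up to floating-point rounding, but only the online form can answer after every element without revisiting earlier input.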

§3 Where it shines, where it fails

The Lesbre/Lemerre abstract-interpretation paper is the most directly applicable: it suggests that adding ~500 LOC of analysis to a template emitter can recover up to 30% in execution speed. SuperStack is the cheapest add-on for peephole gains. Fern (Bansal et al.) is the most ambitious and is most useful for Mochi's numerical and array-heavy workloads.

The Optimistic Stack Allocation paper is a phase-2 add-on rather than a phase-1 must-have.

Compile-time profile: the abstract-interpreter approach adds linear overhead, SuperStack is offline (build-time), Fern is build-time annotation plus link-time composition.

§4 Status (May 2026)

All papers are published; PACMPL DOIs are live.
No production deployments of these specific systems yet, but the techniques are being picked up:
- Pharo and Squeak Smalltalk VMs are experimenting with the Lesbre/Lemerre abstract interpretation approach.
- WebAssembly toolchains (Binaryen's wasm-opt) already have peephole superoptimization support that SuperStack improves on.
- Fern is being upstreamed into TACO (Kjolstad's compiler) and TVM.

§5 Engineering cost for Mochi

Lesbre/Lemerre abstract interpreter for Mochi: ~3 weeks (a "small but smart" pass over compiler3/ir/).
SuperStack peephole rewriter: ~2 weeks if we treat it as offline tooling.
Fern-style fusion: ~6-8 weeks. Substantial but bounded.

For MEP-42 phase 1, none of these are required. They are all profitable phase-2 additions.

§6 Mochi adaptation note

compiler3/opt/ would house the abstract-interpretation pass.
compiler3/ir/ is the natural target for SuperStack-style peephole replacements (build-time tooling generates opt/peep_table.go).
runtime/vm3/ numeric kernels (arrays.go, bignum.go, lists.go) are the natural Fern annotation targets.

§7 Open questions for MEP-42

Should phase 1 include any of these PLDI 2024-2025 ideas, or strictly ship raw templates first?
For SuperStack, what is our equivalence-checking oracle? The vm3 interpreter is the natural ground truth.
For abstract interpretation, what is the lattice? Constants? Ranges? Type-set-with-arena?
For Fern, do we want to expose annotations to user code or keep them internal?

§8 References

PLDI 2024 papers track: https://pldi24.sigplan.org/track/pldi-2024-papers.
PLDI 2025 papers track: https://pldi25.sigplan.org/track/pldi-2025-papers.
PACMPL Vol 8 PLDI issue: https://dl.acm.org/toc/pacmpl/2024/8/PLDI.
PACMPL Vol 9 PLDI issue: https://dl.acm.org/toc/pacmpl/2025/9/PLDI.
Lesbre/Lemerre follow-up "Are Abstract-interpreter Baseline JITs Worth it?": https://inria.hal.science/hal-05407834v1/document.
Fern (Bansal et al.) preprint: https://dspace.mit.edu/bitstream/handle/1721.1/164683/3729292.pdf.
Pavel Panchekha's "Distinguished (for me) Papers of PLDI'25" blog: https://pavpanchekha.com/blog/pldi25.html.