MSWasm — Memory-Safe WebAssembly

§1 Provenance

Michael, Gollamudi, Bosamiya, Johnson, Denlinger, Disselkoen, Watt, Parno, Patrignani, Vassena, Stefan. "MSWasm: Soundly Enforcing Memory-Safe Execution of Unsafe Code." POPL 2023, PACMPL Vol. 7, Article 15 (not OOPSLA 2023; the prompt is in error).
Earlier position paper: Disselkoen et al., "Progressive Memory Safety for WebAssembly," 2019.
Follow-up: Iris-MSWasm, OOPSLA 2024 — Legoupil, Watt, Patrignani, Stefan, et al. — a fully mechanised foundation in Iris. https://dl.acm.org/doi/10.1145/3689722.
Toolchain: https://github.com/PLSysSec/ms-wasm (rWasm AOT compiler, mswasm-graal JIT, mswasm-llvm C→MSWasm).
Paper PDFs:
- https://cseweb.ucsd.edu/~dstefan/pubs/michael:2023:mswasm.pdf
- https://arxiv.org/pdf/2208.13583

§2 Mechanism

MSWasm extends Wasm with three things:

Segments. Linear regions of memory, allocated dynamically, that can only be accessed through handles. Each segment has a unique segment id. Bytes in segment memory carry a tag distinguishing "raw byte" from "part of a handle." Reading a half-handle as a byte (or vice versa) traps.
Handles. A handle is a 4-tuple (base, offset, bound, id). Operationally a fat pointer, like a CHERI capability:
- base and bound define the segment span the handle can address.
- offset is the current position inside the span.
- id uniquely identifies the original allocation; freeing one handle with id X kills all handles with id X (catches temporal-safety leaks).
- The 1.4 ABI choices for how to physically encode the handle (16 bytes, 32 bytes, packed bits…) are deliberately left to the compiler backend.
New instructions. τ.segload, τ.segstore, slice (restrict bounds), handle.add (shift offset), segalloc, segfree. None of these can grow a handle's authority — only restrict it.

Bounds checks happen on every segload/segstore. Temporal safety comes from a runtime allocator that records live handle ids; segfree(h) retires the id; any later segload using a handle with retired id traps.

The paper proves soundness in a linear-resource calculus and uses a novel "colored memory locations" abstraction to give compiler-backend-independent memory safety. Iris-MSWasm later mechanised this in Coq.

Implementations:

Pure software: every handle is 16 bytes; every memory access does an explicit bounds check. Overhead 22-198% depending on what's enforced.
Hardware capabilities (Arm CHERI): handles map to CHERI capabilities; bounds checks are hardware-enforced. Overhead drops to ~51%.

§3 Memory-safety property

Robust spatial safety (no OOB read/write across segment boundaries) plus temporal safety (no UAF via stale handles) plus provenance (you can't forge a handle from raw bytes — the byte tag enforces this).

This is the strongest memory-safety property of any Wasm proposal to date, and the POPL 2023 paper proves it formally.

§4 Production status (May 2026)

MSWasm is not in mainstream Wasm engines. No browser supports it. The reference implementations (rWasm, mswasm-graal) are research artefacts.
The CG-level WebAssembly standards track has not adopted MSWasm; the closest mainstream-track work is the memory64 and multiple memories proposals (see file 10), which provide weaker but cheaper abstractions.
However: MSWasm is influencing the standards track. The "segmented memory" CG discussions and the proposed "wasm-pointers" or "fat-pointers" proposals draw heavily from MSWasm. Iris-MSWasm gave the proposal the formal credibility to be taken seriously.
Industrial CHERI deployments (Microsoft CHERIoT, Arm Morello) are the most likely first home for MSWasm-style memory safety in production.

§5 Cost

Throughput. Software-only: 22% slowdown for spatial-only enforcement on PolyBenchC; 198% for full spatial+temporal. Hardware-assisted (CHERI): 51.7%.
Memory. Handles are 16 bytes (vs 4 bytes for a Wasm32 pointer), so handle-rich data structures pay 4× pointer overhead.
Binary size. New opcodes are short; the compiler-emitted bounds-check sequences inflate code by ~5-10%.

The numbers explain why mainstream Wasm hasn't shipped this: 22% is too much for general use, 198% is unthinkable, and CHERI hardware is not deployed at scale.

§6 Mochi adaptation note

MSWasm's lessons map exactly onto vm3's handle ABI (MEP-40 §6.1):

MSWasm concept	vm3 equivalent
Segment	Per-type arena (`kArenaList`, `kArenaString`, etc.)
Handle (base, offset, bound, id)	`Cell{arena_tag, generation, slab_index}`
Handle `id`	The `(arena_tag, slab_index)` pair
Generation field for use-after-free detection	The 12-bit `generation` on the Cell
Byte-tag distinguishing handle from data	Mochi has typed slots in the register banks, so this is implicit
`segload`/`segstore` bounds check	List/string accessor methods (`StringBytes`, `ListGet`, etc.) bounds-check today

So vm3's Cell is a software MSWasm handle, restricted to a fixed-size base+id (we don't store bound because each slab entry has a fixed structure). What we lack:

Sub-handle slicing with bounded authority. Today a Mochi list[T] slice is a fresh allocation. MSWasm's slice instruction restricts a handle in place. If we adopt Roc-style seamless slices (file 02), we should think of them as MSWasm-style restricted handles: same (arena_tag, slab_index), plus packed (start, len). The generation must match the parent's at deref time. This also gives us MSWasm's temporal safety on slices for free.
Explicit memory-safety claims for vm3. Right now MEP-40 §9 talks about reclamation, not memory safety. MEP-41 should claim, with a precise statement: "vm3 enforces spatial safety on all handle dereferences via accessor-layer bounds checks, and temporal safety via the generation field." That's the MSWasm property, written in our language.
Iris-MSWasm-style mechanisation (long-horizon). Not in scope for MEP-41, but the OOPSLA 2024 follow-up shows it is possible to formally mechanise this kind of memory-safety property. A future MEP-50 could attempt the same for vm3.

There's no conflict with Mochi's design ethos. We're already most of the way there; MEP-41 just needs to write down the invariant.

§7 Open questions for MEP-41

Should we add an explicit "byte-tag" pattern to vm3? Probably no — we never store handles inside slab backing slices; they only live in the typed register banks. The byte-tag invariant holds by construction.
The MSWasm 198% software overhead is the warning shot. How do we keep vm3's accessor-layer bounds checks cheap? Answer: rely on compiler3's static type info to elide checks on provably-in-bounds accesses, fall back to runtime check.
Is the generation field 12 bits enough for the lifetime of a long-running Mochi process? At 4096 reuses-per-slot before wrap, a hot slot could wrap in seconds. Need to either widen the generation field or rate-limit slot reuse.
Does adopting an MSWasm-shaped memory-safety claim affect Mochi's FFI story? If C code can mint raw pointers into our backing slices, the invariant breaks. The MEP needs to say: no FFI dereferences into arena backing.