HN Debrief

WATaBoy: JIT-ing Game Boy Instructions to WASM Beats a Native Interpreter

The post walks through WATaBoy, a Game Boy emulator that does not interpret each instruction one by one. Instead it builds WebAssembly code for hot Game Boy basic blocks at runtime and hands that to the browser’s WebAssembly engine, which then compiles it to native code. The author frames this as a way around iOS restrictions that block ordinary JIT compilation in App Store apps but still permit the browser stack to JIT JavaScript and WebAssembly. In the published numbers, that JIT-on-WASM design beats the project’s own native interpreter, though it still trails strong native emulators.
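
To make the mechanism concrete, here is a minimal TypeScript sketch of emitting a WebAssembly function for a straight-line Game Boy block at runtime and letting the engine compile it. It is not WATaBoy's code. The helper names (uleb, section, emitOp, compileBlock) are hypothetical, only three opcodes are handled (INC A, INC B, ADD A,B), and CPU flags, memory access, and cycle counting are all ignored.

```typescript
// Hypothetical mini-JIT: Game Boy opcodes in, a callable WASM function out.
// Registers A and B are modeled as the function's two i32 parameters.

// Unsigned LEB128, used by the WASM binary format for sizes and indices.
function uleb(n: number): number[] {
  const out: number[] = [];
  do {
    let byte = n & 0x7f;
    n >>>= 7;
    if (n !== 0) byte |= 0x80;
    out.push(byte);
  } while (n !== 0);
  return out;
}

// A WASM section is: id, byte length, contents.
function section(id: number, body: number[]): number[] {
  return [id, ...uleb(body.length), ...body];
}

const I32 = 0x7f;
const MASK_8BIT = [0x41, 0xff, 0x01, 0x71]; // i32.const 255; i32.and

// Translate one opcode into WASM instructions (0x20 local.get, 0x21 local.set, 0x6a i32.add).
function emitOp(opcode: number): number[] {
  switch (opcode) {
    case 0x3c: // INC A
      return [0x20, 0x00, 0x41, 0x01, 0x6a, ...MASK_8BIT, 0x21, 0x00];
    case 0x04: // INC B
      return [0x20, 0x01, 0x41, 0x01, 0x6a, ...MASK_8BIT, 0x21, 0x01];
    case 0x80: // ADD A,B
      return [0x20, 0x00, 0x20, 0x01, 0x6a, ...MASK_8BIT, 0x21, 0x00];
    default:
      throw new Error(`unsupported opcode 0x${opcode.toString(16)}`);
  }
}

// Build a one-function module exporting "run(a, b) -> a" and instantiate it.
function compileBlock(opcodes: number[]): (a: number, b: number) => number {
  const code: number[] = [];
  for (const opcode of opcodes) code.push(...emitOp(opcode));
  code.push(0x20, 0x00, 0x0b); // local.get 0 (return A); end
  const body = [0x00, ...code]; // 0x00 = no extra locals
  const bytes = new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm", version 1
    ...section(1, [0x01, 0x60, 0x02, I32, I32, 0x01, I32]), // type: (i32, i32) -> i32
    ...section(3, [0x01, 0x00]), // one function, using type 0
    ...section(7, [0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x00]), // export "run" = func 0
    ...section(10, [0x01, ...uleb(body.length), ...body]), // code
  ]);
  // Synchronous compile is fine for a module this small.
  const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
  return instance.exports.run as (a: number, b: number) => number;
}
```

Calling compileBlock([0x3c, 0x80]) yields a function of the A and B registers that the engine is free to compile to native code. A real emulator would presumably also cache compiled blocks by address and keep an interpreter for cold code, which this sketch leaves out.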

If you ship emulators, language runtimes, or other interpreter-heavy software into restricted environments like iOS, WebAssembly can be more than a portability layer. It can act as an allowed optimization path, but you still need to benchmark across browsers: the posted results show a gap of roughly 25 percent between Firefox and Chrome or Safari.

Discussion mood

Strongly positive. People were impressed by the engineering, especially for a student project, and liked the trick of using WebAssembly as an allowed JIT path on iOS. The pushback was mostly about framing and benchmarking, not the core idea.

Key insights

  1. Jamulator predicted this direction

    Andrew Kelley’s 2013 Jamulator writeup was cited as an earlier attempt to statically recompile console code, and one that ran into a hard wall: handwritten assembly and runtime-dependent control flow do not map cleanly into ahead-of-time LLVM IR. The value of WATaBoy is that it takes the path Jamulator ended up pointing toward. It recompiles only when runtime information makes the code shape clear, instead of forcing the whole program through a static pipeline that cannot reliably understand it.

    If you are trying to optimize old machine code or bytecode, do not assume ahead-of-time lifting will get you there. Design for runtime specialization around hot paths from the start.

      Attribution:
    • mikepurvis #1
  2. Interpreter overhead dwarfs WASM overhead

    The key performance framing here is that this is not “WASM beats native.” It is “a JIT beats an interpreter,” even when that JIT targets WebAssembly and then relies on a second JIT in the browser. One commenter put WebAssembly overhead in the rough range of tens of percent, while interpreter overhead can be an order of magnitude larger. That makes the result far less surprising and much more general. If you can trade repeated decode-and-dispatch for compiled basic blocks, you often win even through an extra layer.

    When you evaluate a runtime architecture, compare execution models before comparing implementation languages. A compiled path through WebAssembly may beat a native interpreter by a wide margin.

      Attribution:
    • ahartmetz #1 #2
    • grashalm #1
  3. The iOS loophole is specific

    Using WebAssembly as a JIT escape hatch on iOS works because Apple lets the browser stack optimize JavaScript and WebAssembly. That does not mean “iOS cannot do JIT” is universally true. A linked project, StikDebug, was raised as a counterexample, and the correction was that sideloaded apps with the get-task-allow entitlement live under different rules than normal App Store apps. The practical boundary is distribution. For ordinary shipping apps, browser-hosted JIT remains a real workaround. For special deployment setups, it is not the only option.

    Be precise when you plan around Apple platform limits. Separate App Store constraints from sideloaded or entitled builds before you choose a runtime strategy.

      Attribution:
    • bawolff #1
    • saagarjha #1
    • jrmg #1
  4. Browser choice still sets the ceiling

    The roughly 25 percent gap between Firefox and Chrome or Safari in the posted results matters because it shows where the remaining bottleneck can move. Once your own runtime is reasonably smart, the browser’s WebAssembly compiler and surrounding engine become part of your performance budget. That means a successful optimization strategy can still land very differently across engines even when your generated code is identical.

    If you ship WebAssembly-heavy code, test every browser you claim to support and tune your expectations by engine. Your runtime may stop being the main source of variance.

      Attribution:
    • dag100 #1
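
The decode-and-dispatch point from the insight on interpreter overhead can be illustrated with a small TypeScript sketch. It counts decodes rather than measuring time, so it shows the shape of the saving, not its size. "Compiling" here only means pre-decoding a block into closures, a stand-in for real code generation, and the names (Cpu, decode, interpret, predecodeBlock) and the three-opcode instruction set are hypothetical simplifications that ignore CPU flags.

```typescript
type Cpu = { a: number; b: number };
type Handler = (cpu: Cpu) => void;

// Map one opcode to the work it performs (flags ignored for brevity).
function decode(opcode: number): Handler {
  switch (opcode) {
    case 0x3c: // INC A
      return (cpu) => { cpu.a = (cpu.a + 1) & 0xff; };
    case 0x04: // INC B
      return (cpu) => { cpu.b = (cpu.b + 1) & 0xff; };
    case 0x80: // ADD A,B
      return (cpu) => { cpu.a = (cpu.a + cpu.b) & 0xff; };
    default:
      throw new Error(`unsupported opcode 0x${opcode.toString(16)}`);
  }
}

// Interpreter: decode and dispatch every instruction on every pass.
// Returns how many decodes were performed.
function interpret(cpu: Cpu, block: number[], passes: number): number {
  let decodes = 0;
  for (let i = 0; i < passes; i++) {
    for (const opcode of block) {
      decode(opcode)(cpu);
      decodes++;
    }
  }
  return decodes;
}

// Block "compiler": decode once up front, then replay the handlers.
function predecodeBlock(block: number[]): {
  decodes: number;
  run: (cpu: Cpu, passes: number) => void;
} {
  const handlers = block.map((opcode) => decode(opcode));
  return {
    decodes: handlers.length,
    run(cpu, passes) {
      for (let i = 0; i < passes; i++) {
        for (const handler of handlers) handler(cpu);
      }
    },
  };
}
```

For a three-instruction block run 1,000 times, interpret performs 3,000 decodes while predecodeBlock performs 3, and both leave the registers in the same state. A JIT that emits WebAssembly goes further by removing the per-instruction call as well, which is the part this sketch does not model.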

Against the grain

  1. The motivation overstates iOS limits

    The sharpest criticism was aimed at the premise, not the implementation. The post presents iOS as blocking this class of optimization outright, but commenters pointed out that this is only fully true for regular App Store apps. Tools like StikDebug and the mention of get-task-allow show there are deployment modes where dynamic code execution is possible. That narrows the novelty claim. The hack is clever, but it is a workaround for one important channel, not proof that Apple leaves no other path.

    When you tell the story of a systems project, scope the platform constraint carefully. Overclaiming the restriction invites people to discount the genuinely clever engineering.

      Attribution:
    • jhatemyjob #1
    • saagarjha #1
    • jrmg #1
  2. A native emulator can still be much faster

    One commenter pushed back on the implied performance headline by noting that well-written native emulators have run faster on decades-old PCs than this design does on modern hardware. That does not undercut the project’s main accomplishment, but it does reset expectations. The comparison that flatters WATaBoy is against its own native interpreter, not against the best native emulation work.

    Use the result as evidence that JIT-on-WASM is viable, not that it is state of the art for emulation speed. If absolute performance is your goal, benchmark against mature native emulators.

      Attribution:
    • lightedman #1

In plain English

ahead-of-time
Compiled before execution rather than during execution.
entitlement
A signed permission in Apple platforms that grants an app access to restricted capabilities.
get-task-allow
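An Apple entitlement, present on development-signed builds, that lets a debugger attach to the app's process. Commenters cited it as the reason sideloaded apps can live under different JIT rules than ordinary App Store apps.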
JIT
Just-in-time compilation, where code is translated into machine code while the program is running rather than ahead of time.
LLVM IR
LLVM intermediate representation, a low-level code form used inside LLVM before code is turned into machine instructions.
WASM
WebAssembly, a compact low-level code format designed to run efficiently in web browsers and other runtimes.

Reference links

Related emulator compilation work

  • Jamulator
    An earlier writeup on statically recompiling NES code that commenters used to frame why runtime recompilation is more practical.

iOS JIT workaround references

  • StikDebug
    Raised as a counterexample showing that some iOS deployment setups can support dynamic code workflows.

Browser engine performance examples

  • Firefox Bug 715181
    Referenced as an example of self-hosted JavaScript beating a native C++ implementation inside a browser engine.