WASM GC isn't ready for realtime graphics
(dthompson.us) 102 points by todsacerdoti 13 hours ago
Wasn't WASM GC a prerequisite for getting direct DOM access from WASM? Does progress for WASM GC mean progress for DOM access as well?
Every time I check back on that the initiative seems to run under a different name. What is the best way to track progress on that front?
It’s not a prerequisite for using the DOM from wasm.
See, for example, the rust web frameworks of leptos and dioxus. They’re honestly great, and usable today as replacements for react and friends. (With the single caveat that wasm bundle size is a bit bigger than .js size).
They work by exposing a number of browser methods through to wasm, and then they call them through a custom wasm/JS API bridge. All rust objects and DOM objects are completely isolated. Rust objects are allocated via an embedded malloc implementation and JS objects are managed by V8 (or whatever). But the DOM can still be manipulated via (essentially) message passing over an RPC-like interface.
But the rust code needs to compile malloc specially for wasm. This is ok in rust - malloc is 75kb or something. But in languages like C#, Go or Python, the runtime GC is much bigger and harder to fit in a little wasm bundle.
The upside of wasm-gc is that this divide goes away. Objects are just objects, shared between both languages. So wasm bundles can use & reference JS/DOM objects directly. And wasm programs can piggyback on V8’s GC without needing to ship their own. This is good in rust, and great in GC languages. I saw an example with blazor where a simple C# wasm todo app went from 2mb or something to 10kb when wasmgc was used.
TLDR: wasm-gc isn’t strictly needed. You can use DOM from wasm today. It just makes wasm bundles smaller and wasm-dom interaction easier (and theoretically faster).
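To make the "message passing over an RPC-like interface" idea concrete, here is a minimal sketch of the JS half of such a bridge. Everything in it is hypothetical (the `HandleTable` class, the `bridge` functions, the `FakeNode` stand-in for a DOM node); it is not the actual API of leptos, dioxus, or any binding generator. The point is only that wasm code holds integer handles, never the objects themselves, and has to release them by hand. A real bridge would also pass strings as a pointer and length into linear memory rather than as JS strings.

```typescript
// Hypothetical JS-side half of a wasm/JS bridge. All names are illustrative.

// Minimal stand-in for a DOM node so the sketch runs outside a browser.
interface FakeNode {
  tag: string;
  text: string;
  children: FakeNode[];
}

class HandleTable {
  private slots: (FakeNode | undefined)[] = [];
  private free: number[] = [];

  // Store a JS object and return the integer handle wasm code would hold.
  insert(node: FakeNode): number {
    const reused = this.free.pop();
    if (reused !== undefined) {
      this.slots[reused] = node;
      return reused;
    }
    this.slots.push(node);
    return this.slots.length - 1;
  }

  get(handle: number): FakeNode {
    const node = this.slots[handle];
    if (node === undefined) throw new Error(`stale handle ${handle}`);
    return node;
  }

  // The wasm side must call this explicitly; forgetting to do so
  // keeps the node alive forever, because the table still references it.
  release(handle: number): void {
    if (this.slots[handle] === undefined) throw new Error(`stale handle ${handle}`);
    this.slots[handle] = undefined;
    this.free.push(handle);
  }
}

const table = new HandleTable();

// Functions like these would be supplied as imports when instantiating the module.
const bridge = {
  createElement(tag: string): number {
    return table.insert({ tag, text: "", children: [] });
  },
  setText(handle: number, text: string): void {
    table.get(handle).text = text;
  },
  appendChild(parent: number, child: number): void {
    table.get(parent).children.push(table.get(child));
  },
  drop(handle: number): void {
    table.release(handle);
  },
};

// What the wasm side effectively does: a sequence of calls on integers.
const ul = bridge.createElement("ul");
const li = bridge.createElement("li");
bridge.setText(li, "buy milk");
bridge.appendChild(ul, li);
bridge.drop(li); // the node stays alive through its parent, but the handle is gone
```

With wasm-gc, the table and the manual `drop` call are the parts that would no longer be needed, since the engine's collector can see the references directly.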
I really liked the NaCl (and PNaCl) idea, which allowed running arbitrary code, sanitized, at ~90% of native speed. Playing Bastion in the browser was refreshing. Unfortunately, communication with JS code and bootstrap issues (you couldn't run code without a plugin, and no one except Chrome supported it) ruined that tech.
Same here, and the irony is that Mozilla's opposition hardly matters nowadays given Firefox's market share; it is Google driving where WebAssembly goes.
Remember, the NaCl and PNaCl SDKs came with support for C, C++ and OCaml, the last being an example for GC languages.
WASM nowadays has become quite the monstrosity compared to NaCl/PNaCl. Just look at this WASM GC spaghetti, trying to compile a GC'd language but hooking it up to V8/JavaScriptCore's GC, while upholding a strict security model... That sounds like it won't cause any problems whatsoever!
Sometimes I wonder if the industry would have been better off with NaCl as a standard. Old, mature tooling would by and large still be applicable (it's still your ordinary x86/ARM machine code) instead of the nascent and buggy ecosystem we have now. I don't know why, but the JS folks just keep reinventing everything all the time.
> Old, mature tooling would by and large still be applicable (it's still your ordinary x86/ARM machine code)
It wasn't, though. Since NaCl ran code in the same process as the renderer, it depended upon a verifier for security, and required the generated code to follow some unusual constraints to support that verification. For example, on x86, all branch targets were required to be 32-byte aligned, and all indirect branches were required to use a specific instruction sequence to enforce that alignment. Generating code to meet these constraints required a modified compiler, and reduced code density and speed.
In any case, NaCl would have run into the exact same GC issues if it had been used more extensively. The only reason it didn't was that most of the applications it saw were games which barely interacted with the JS/DOM "world".
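The alignment constraint is easy to model. The sketch below only reproduces the address arithmetic, assuming the 32-byte bundle size mentioned above; in NaCl itself the masking was done by machine instructions emitted by the modified compiler, and the function names here are made up for illustration.

```typescript
// Illustrative model of NaCl-style indirect-branch sandboxing on x86.
// This models only the address arithmetic, not the actual instruction sequence.

const BUNDLE_SIZE = 32; // bytes; every branch target must start a bundle

// Clear the low 5 bits so the result is always a multiple of 32.
// Whatever address untrusted code computes, the jump can only land on a
// bundle boundary, which the verifier has already checked.
function maskBranchTarget(addr: number): number {
  return (addr & ~(BUNDLE_SIZE - 1)) >>> 0;
}

// Padding bytes a compiler would have to emit so that code at `addr`
// starts on a bundle boundary. This padding is part of the density cost.
function paddingBefore(addr: number): number {
  return (BUNDLE_SIZE - (addr % BUNDLE_SIZE)) % BUNDLE_SIZE;
}

console.log(maskBranchTarget(0x1003f).toString(16)); // "10020"
console.log(paddingBefore(0x10021)); // 31
console.log(paddingBefore(0x10020)); // 0
```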
I simplified in my comment. It was a much better story for tooling, since you could reuse large parts of existing backends/codegen, optimization passes, and debugging. The mental model of execution would remain too, rather than being a weird machine code for a weird codesize-optimized stack machine.
I would wager the performance implications of NaCl code, even for ARM which required many more workarounds than x86 (whose NaCl impl has a "one weird trick" aura), were much better than for modern WASM.
It's hard to say if it would've run into the same issues. For one, it would've been easier to port native GCs: they don't run afoul of W^X rules, they just read memory if that, which you can do performantly in NaCl on x86 due to the segments trick. I also suspect the culture could've more easily evolved towards shared objects where you would be able to download/parse/verify a stdlib once, and then keep using it.
I agree it was because the applications were games, but for another second-order reason: they were by and large C/C++ codebases where memory was refcounted manually. Java was probably the second choice, but those were the days when Java applets were still auto-loading, so there was likely no need for anybody to try.
> It's hard to say if it would've run into the same issues. For one, it would've been easier to port native GCs...
WASM GC isn't just about memory management for the WASM world; it's about managing references (including cyclical refs!) which cross the boundary into the non-WASM world. Being able to write a GC within the WASM (or NaCl) world doesn't get you that functionality.
I'm reminded of writing JavaScript way back in the old Internet Explorer days (6 and to a lesser extent 7), when you had to manually null out any references to DOM elements if you were done with them, or else the JS and the DOM nodes wouldn't get garbage collected because IE had two different garbage collectors and cycles between them didn't get collected immediately.
I was excited to read this post because I haven't yet tried WasmGC for anything beyond tiny toy examples, but was disappointed to find no actual numbers for performance. I don't know the author well enough to be able to assess their assertions that various things are "slow" without data.
It's sort of baffled me that people appear to be shipping real code using WasmGC since the limitations described in this post are so severe. Maybe it's fine because they're just manipulating DOM nodes? Every time I've looked at WasmGC I've gone "there's no way I could use this yet" and decided to check back a year later and see if it's There Yet.
Hopefully it gets there. The uint8array example from this post was actually a surprise to me, I'd just assumed it would be efficient to access a typed array via WasmGC!
Beyond the limitations in this post there are other things needed to be able to target WasmGC with existing stuff written in other languages, like interior references or dependent handles. But that's okay, I think, it can be worthwhile for it to exist as-is even if it can't support e.g. existing large-scale apps in memory safe languages. It's a little frustrating though.
>> The uint8array example from this post was actually a surprise to me, I'd just assumed it would be efficient to access a typed array via WasmGC!
The problem is that the Scheme i8 array is not actually a Uint8Array with WasmGC. It’s a separate heap allocated object that is opaque to the JS runtime.
In the linear memory Wasm model, the Scheme i8 array is allocated in the wasm memory array, and so one can create a Uint8Array view that exactly maps to the same bytes in the linear memory buffer. This isn’t possible (yet?) with the opaque WasmGC object type.
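A rough sketch of the linear memory case, for anyone who hasn't seen it. A plain `ArrayBuffer` stands in for the buffer behind a module's memory so the snippet runs anywhere; the offset and length are invented, as if the wasm allocator had placed a bytevector there.

```typescript
// An ArrayBuffer stands in for the buffer behind a module's linear memory
// (in a real page this would be the `buffer` of a WebAssembly.Memory).
const linearMemory = new ArrayBuffer(64 * 1024); // one 64 KiB wasm page

// Pretend the wasm allocator placed a 16-byte bytevector at this offset.
const BYTEVECTOR_PTR = 1024; // hypothetical address
const BYTEVECTOR_LEN = 16;

// What the wasm side sees: raw bytes at an address.
const wasmSide = new Uint8Array(linearMemory);
for (let i = 0; i < BYTEVECTOR_LEN; i++) {
  wasmSide[BYTEVECTOR_PTR + i] = i * 2;
}

// What the JS side can do: a view over the same bytes, with no copy.
const jsView = new Uint8Array(linearMemory, BYTEVECTOR_PTR, BYTEVECTOR_LEN);

// Writes through either side are visible to the other.
jsView[0] = 255;
console.log(wasmSide[BYTEVECTOR_PTR]); // 255
console.log(jsView[3]); // 6
```

Such a view can be handed straight to APIs that accept typed arrays. One caveat with real linear memory: growing the memory detaches the old buffer, so views have to be recreated afterwards. With an opaque WasmGC array there is no buffer to build a view over, so the bytes have to be copied across instead.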
Definitely a lot is missing, yeah, and adding more will take time. But it works well already for pure computational code. For example, Google Sheets uses WasmGC for Java logic:
https://web.dev/case-studies/google-sheets-wasmgc#the_final_...
I've been shipping a Flutter app that uses it for months. Pretty heavy stuff, it's doing everything from LLM inference to model inference to maintaining a vector store and indexeddb in your browser.
Frame latency feels like it's gone, there's 100% a significant decrease in perceived latency.
I did have a frustrating performance issue with 3rd party code doing "source code parsing" via RegEx. I thought it was either the library's or Flutter's fault, but from the article content, it sounds like it was WASM GC. (I saw a ton of time spent converting objects from JS<->WASM on a 50 KLOC file.)
From that perspective, the article sounds a bit maximalist in its claims, but only from my perspective.
I think if you read "real time graphics" as "3d game" it gives a better understanding of where it's at, my anecdata aside.
Don't wanna name names, because it's on me, it's a miracle it exists, and works.
I don't think there's a significant # of alternatives, so hopefully "a Flutter syntax highlighting library, as used in a package for rendering markdown columns" is enough to be helpful.
Problem was some weird combo of lots of regex and an absolutely huge amount of code. It's one of those problems it's hard for me to draw many conclusions from:
- Flutter may be using browser APIs for regex, so there's some sort of JS/WASM barrier copying cost
- The markdown column renderer is doing nothing at all to handle this situation lazily, i.e. if any portion of the column is displayed, syntax highlighting must be done on the complete markdown input
- Each differently colored piece of text, so pretty much every word, gets its own object in the view hierarchy, tens if not hundreds of thousands in this case. Can't remember if this is due to the syntax highlighting library or the markdown package
- Regex is used to parse the code, and for all I know one of the patterns has pathological performance, like unintentional backtracking.
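On the last bullet, this is the textbook shape of an unintentionally backtracking pattern. It is a generic illustration, not a pattern taken from the library in question, and the variable names are made up. The input is kept short on purpose, because the work for the first pattern roughly doubles with every extra character.

```typescript
// Nested quantifiers: on a non-matching input a backtracking engine may try
// every way of splitting the run of "a"s between the inner and outer "+".
const backtrackingProne = /^(a+)+$/;

// Accepts exactly the same strings, with only one way to match the run.
const unambiguous = /^a+$/;

function matchesBoth(input: string): [boolean, boolean] {
  return [backtrackingProne.test(input), unambiguous.test(input)];
}

const good = "a".repeat(18);
const bad = "a".repeat(18) + "b"; // the trailing "b" forces the failure path

console.log(matchesBoth(good)); // both patterns accept
console.log(matchesBoth(bad)); // both patterns reject, the first far more slowly
```

Both patterns give the same answers; only the cost on the failing input differs, which is why this kind of problem tends to show up only on large or unusual files.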
so what about realtime graphics with wasm without GC? (compiled from languages not needing a GC like Rust, C/C++, Odin, ...)
Better, but WebGPU and WebGL aren't going to win any performance prizes either, and tooling is pretty much nonexistent.
There is nothing like PIX, Instruments or RenderDoc; SpectorJS is the only thing you get, almost 15 years after WebGL 1.0.
And in terms of the hardware level they support, it's about PlayStation 3 kind of graphics, assuming the browser doesn't block the GPU or select the integrated one instead of the dedicated one.
You are left with shaders as the only way to actually push the hardware.
> Unsatisfying workarounds [...] Use linear memory for bytevectors
It never makes sense to use GC for leaf memory if you're in a language that offers both, since mere refcounting (or a GC'ed object containing a unique pointer) is trivial to implement.
There are a lot of languages where it's expensive to make the mistake this post is making. (I don't know much about WASM in particular; it may still have other errors).
To be fair, neither are WebGL and WebGPU, versus their native API counterparts. The best you can get are shadertoy demos and product visualisation on ecommerce sites.
That is due to tooling, sandboxing, and not having any control over which GPU gets selected, or why the browser blacklists it and switches to software rendering.