Comment by samiv 4 hours ago

Not to mention

  - the incredible overhead of each and every API call
  - the nerfed timers that jitter on purpose
  - the limitation of a single rendering context, and that you *must* use the JS main thread for all those rendering calls (so no background async for you..)
dakom an hour ago

> overhead of each API call

Yeah, that's an issue, esp with WebGL.. but you can get pretty far by reducing calls with a cache, things like "don't set the uniform / attribute if you don't need to".. but I hear WebGPU has a better API for this, and eventually this should get native performance.. though, I also wonder, is this really a bottleneck for real-world projects? I love geeking out about this.. but.. I suspect the real-world blocker is more like "user doesn't want to wait 5 mins to download AAA textures"
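
A minimal sketch of that caching idea (the wrapper and its names are illustrative, not from any real library): track the last value written per uniform and skip the redundant `gl.uniform*` call entirely.

```javascript
// Sketch of a redundant-state filter for WebGL uniform calls.
// Cache the last value written per (program, uniform) and skip the
// gl.uniform* call when nothing changed.
function makeUniformCache(gl) {
  const cache = new Map(); // key: "programId:uniformName" -> last value
  return {
    setUniform1f(program, location, name, value) {
      const key = `${program.__id}:${name}`;
      if (cache.get(key) === value) return false; // redundant, skip the API call
      cache.set(key, value);
      gl.uniform1f(location, value);              // only hit the API on change
      return true;
    },
  };
}
```

The same pattern applies to attribute bindings, blend state, bound textures, etc. — anything where the driver-side value is known to be unchanged.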

> Nerfed timers

Yeah, also an issue. Fwiw Mainloop.js gives a nice API for having a fixed timestep and getting an interpolation value in your draw handler to smooth things out. Not perfect, but easy and state-of-the-art afaict. Here's a simple demo (notice how `lerp` is called in the draw handler): https://github.com/dakom/mainloop-test
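
The core of that pattern (a sketch, not Mainloop.js's actual implementation): simulate in fixed steps, then hand the leftover fraction of a step to the draw handler so it can lerp between the previous and next state.

```javascript
// Sketch of a fixed-timestep loop with render interpolation: the
// simulation always advances by the same amount, and draw() receives
// the fraction of a step left in the accumulator for smoothing.
const STEP = 1000 / 60; // fixed simulation step in ms

function makeLoop(update, draw) {
  let accumulator = 0;
  return function tick(frameDeltaMs) {
    accumulator += frameDeltaMs;
    while (accumulator >= STEP) {
      update(STEP);           // always advance by the same amount
      accumulator -= STEP;
    }
    draw(accumulator / STEP); // interpolation factor in [0, 1)
  };
}

function lerp(prev, next, t) {
  return prev + (next - prev) * t;
}
```

Because `update` only ever sees `STEP`, jittery frame deltas affect how many steps run per frame, not the physics itself.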

Re: multithreading, I don't think that's a showstopper... more like, techniques you'd use for native aren't going to work out of the box on web, needs more custom planning. I see this as more of a problem for speeding up systems _within_ systems, i.e. faster physics by parallelizing grids or whatever, but for having a physics WASM running in a worker thread that shares data with the main thread, it's totally doable, just needs elbow grease to make it work (will be nice when multithreading _just works_ with a SharedArrayBuffer)
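
A sketch of what that sharing looks like (the layout here is made up for illustration: one atomic "ready" flag followed by position data; in browsers SharedArrayBuffer requires a cross-origin-isolated context):

```javascript
// Sketch: physics worker and main thread sharing state through a
// SharedArrayBuffer. The worker writes positions and publishes them
// with an atomic store; the main thread reads only after the flag flips.
const FLAG_INDEX = 0;

function makeSharedState(particleCount) {
  // 1 Int32 flag + 2 floats (x, y) per particle
  const sab = new SharedArrayBuffer(4 + particleCount * 2 * 4);
  return {
    flag: new Int32Array(sab, 0, 1),
    positions: new Float32Array(sab, 4),
  };
}

// Worker side: write positions, then publish.
function publish(state, xs, ys) {
  for (let i = 0; i < xs.length; i++) {
    state.positions[2 * i] = xs[i];
    state.positions[2 * i + 1] = ys[i];
  }
  Atomics.store(state.flag, FLAG_INDEX, 1); // data is ready
}

// Main-thread side: only read once the worker has published.
function tryRead(state) {
  if (Atomics.load(state.flag, FLAG_INDEX) !== 1) return null;
  return Array.from(state.positions);
}
```

In a real setup you'd post the SharedArrayBuffer to the worker once at startup and likely double-buffer so the worker never writes the frame the renderer is reading.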

  • samiv 24 minutes ago

    Multithreading, yes, that works the way you mention, but I meant multiple rendering contexts.

    In standard OpenGL the de-facto way to do parallel GPU resource uploads while rendering is to have multiple rendering contexts in a "share group" which allows them to share some resources such as textures. So then you can run rendering in one thread that uses one context and do resource uploads in another thread that uses a different context.

    There was a sibling comment that mentioned something called OffscreenCanvas, which hints that it might be something that would let a web app achieve the same.

flohofwoe 4 hours ago

> - the incredible overhead of each and every API call

The calling overhead between WASM and JS is pretty much negligible since at least 2018:

https://hacks.mozilla.org/2018/10/calls-between-javascript-a...

> - the nerfed timers that jitter on purpose

At least Chrome and Firefox have "high-enough" resolution timers in cross-origin-isolated contexts:

https://developer.chrome.com/blog/cross-origin-isolated-hr-t...

...also, if you just need a non-jittery frame time, computing the average over multiple frames actually gives you a frame duration that's stable and exact (e.g. 16.667 or 8.333 milliseconds despite the low-resolution inputs).
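
A sketch of that averaging, assuming a rolling window of per-frame deltas (the window size is an arbitrary choice for illustration):

```javascript
// Sketch: recover a stable frame duration from coarse timestamps by
// averaging over a rolling window. Quantization noise on individual
// deltas averages out, converging on the true frame duration.
function makeFrameTimer(windowSize = 120) {
  const samples = [];
  return function addFrame(deltaMs) {
    samples.push(deltaMs);
    if (samples.length > windowSize) samples.shift();
    return samples.reduce((a, b) => a + b, 0) / samples.length;
  };
}
```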

Also, surprise: there are no non-jittery time sources on native platforms either (for measuring frame duration at least) - you also need to run a noise-removal filter over the measured frame duration in native games. Even the 'exact' presentation timestamps from DXGI or MTLDrawable have very significant (up to a millisecond) jitter.
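
One simple form such a noise-removal filter can take (a sketch; real engines use various schemes) is a median over the last few samples, which throws away a single late-scheduled spike instead of letting it bleed into the estimate:

```javascript
// Sketch of a noise-removal filter for measured frame durations:
// take the median of the last N samples so one millisecond-scale
// spike from scheduler jitter doesn't affect the estimate at all.
function makeMedianFilter(windowSize = 5) {
  const samples = [];
  return function filter(deltaMs) {
    samples.push(deltaMs);
    if (samples.length > windowSize) samples.shift();
    const sorted = [...samples].sort((a, b) => a - b);
    return sorted[Math.floor(sorted.length / 2)];
  };
}
```

Unlike a plain average, the median completely ignores isolated outliers, which is what you want when one frame gets preempted.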

> - the limitation of a single rendering context, and that you must use the JS main thread for all those rendering calls (so no background async for you..)

OffscreenCanvas allows rendering to be performed in a worker thread: https://web.dev/articles/offscreen-canvas

  • samiv 3 hours ago

    I didn't mean just WASM -> JS, but the WebGL API call overhead, which includes marshalling the call from the WASM runtime across multiple layers and processes inside the browser.

    Win32 performance counter has native resolution < 1us

    OffscreenCanvas is something I haven't actually come across before. Looks interesting, but I already expect that the API is either brain damaged or intentionally nerfed for security reasons (or both). Anyway I'll look into it, so thanks for that!

    • flohofwoe 29 minutes ago

      > Win32 performance counter has native resolution < 1us

      Yes, but that's hardly useful for things like measuring frame duration when the OS scheduler runs your per-frame code a millisecond late or early, or generally preempts your thread in the middle of your timing code (e.g. measuring time precisely is also a non-trivial problem on native platforms).