Comment by samiv 6 months ago

Not to mention

  - the incredible overhead of each and every API call
  - the nerfed timers that jitter on purpose
  - the limitation of a single rendering context and that you *must* issue all those rendering calls from the JS main thread (so no background async for you..)
dakom 6 months ago

> overhead of each API call

Yeah, that's an issue, esp with WebGL.. but you can get pretty far by reducing calls with a cache, things like "don't set the uniform / attribute if you don't need to".. but I hear WebGPU has a better API for this, and eventually this should get native performance.. though, I also wonder, is this really a bottleneck for real-world projects? I love geeking out about this.. but.. I suspect the real-world blocker is more like "user doesn't want to wait 5 mins to download AAA textures"
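For concreteness, the kind of state cache I mean looks roughly like this (untested sketch, names made up; it assumes you fetch each uniform location once and reuse the same location object):

```typescript
// Rough sketch of a WebGL state cache: skip redundant useProgram / uniform
// calls. Illustrative only, not a real library.
class GLStateCache {
  private currentProgram: WebGLProgram | null = null;
  private uniforms = new Map<WebGLUniformLocation, number>();

  constructor(private gl: WebGL2RenderingContext) {}

  useProgram(program: WebGLProgram): void {
    if (this.currentProgram !== program) {
      this.gl.useProgram(program);
      this.currentProgram = program;
      this.uniforms.clear(); // conservative: re-upload uniforms after a switch
    }
  }

  uniform1f(loc: WebGLUniformLocation, value: number): void {
    if (this.uniforms.get(loc) !== value) {
      this.gl.uniform1f(loc, value);
      this.uniforms.set(loc, value);
    }
  }
}
```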

> Nerfed timers

Yeah, also an issue. Fwiw Mainloop.js gives a nice API for having a fixed timestep and getting an interpolation value in your draw handler to smooth things out. Not perfect, but easy and state-of-the-art afaict. Here's a simple demo (notice how `lerp` is called in the draw handler): https://github.com/dakom/mainloop-test
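The underlying pattern (not Mainloop.js's actual API, just a sketch of the idea) is a fixed-timestep accumulator plus an interpolation factor passed to the draw handler:

```typescript
// Fixed-timestep loop with render interpolation; a sketch of the pattern,
// not Mainloop.js itself.
const STEP = 1000 / 60; // simulation always advances in fixed ~16.67 ms steps
let last = performance.now();
let acc = 0;

let prevX = 0, currX = 0; // some simulated value

function update(dt: number): void {
  prevX = currX;
  currX += 0.1 * dt; // advance the simulation by exactly one fixed step
}

function draw(alpha: number): void {
  const x = prevX + (currX - prevX) * alpha; // lerp between the last two states
  // ...issue draw calls with x...
}

function frame(now: number): void {
  acc += now - last;
  last = now;
  while (acc >= STEP) {
    update(STEP);
    acc -= STEP;
  }
  draw(acc / STEP); // interpolation factor in [0, 1)
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
```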

Re: multithreading, I don't think that's a showstopper... more like, the techniques you'd use for native aren't going to work out of the box on the web and need more custom planning. I see this as more of a problem for parallelizing _within_ a system, i.e. faster physics by parallelizing grids or whatever, but having a physics WASM running in a worker thread that shares data with the main thread is totally doable, it just needs elbow grease to make it work (it will be nice when multithreading _just works_ easily with a SharedArrayBuffer)
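A minimal sketch of that worker setup (file names and buffer size made up; SharedArrayBuffer needs a cross-origin-isolated page):

```typescript
// main.ts (sketch): the worker owns the physics step and writes positions
// into shared memory; the main thread only reads them for rendering.
const positions = new Float32Array(new SharedArrayBuffer(1024 * 2 * 4));
const worker = new Worker("physics-worker.js");
worker.postMessage({ positions }); // SharedArrayBuffers are shared, not copied

function render(): void {
  // read positions[] and issue draw calls; both sides see the same memory
  requestAnimationFrame(render);
}
requestAnimationFrame(render);

// physics-worker.ts (sketch):
// self.onmessage = (e) => {
//   const positions: Float32Array = e.data.positions;
//   setInterval(() => { /* step the (WASM) physics, write into positions */ }, 16);
// };
```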

  • samiv 6 months ago

    Multithreading, yes, that works the way you mention, but I meant multiple rendering contexts.

    In standard OpenGL the de-facto way to do parallel GPU resource uploads while rendering is to have multiple rendering contexts in a "share group" which allows them to share some resources such as textures. So then you can run rendering in one thread that uses one context and do resource uploads in another thread that uses a different context.

    A sibling comment mentioned something called OffscreenCanvas, which hints that it might let a web app achieve the same.

flohofwoe 6 months ago

> - the incredible overhead of each and every API call

The calling overhead between WASM and JS is pretty much negligible since at least 2018:

https://hacks.mozilla.org/2018/10/calls-between-javascript-a...

> - the nerfed timers that jitter on purpose

At least Chrome and Firefox have "high-enough" resolution timers in cross-origin-isolated contexts:

https://developer.chrome.com/blog/cross-origin-isolated-hr-t...
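To get into that mode the page has to be served with the COOP/COEP headers (Cross-Origin-Opener-Policy: same-origin, Cross-Origin-Embedder-Policy: require-corp); a quick runtime check looks something like this sketch:

```typescript
// Sketch: detect whether the page is cross-origin isolated, which is what
// unlocks the finer-grained performance.now() (and SharedArrayBuffer).
if (self.crossOriginIsolated) {
  console.log("cross-origin isolated: higher-resolution timers available");
} else {
  console.log("not isolated: timers stay coarse/jittery");
}
```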

...also, if you just need a non-jittery frame time, computing the average over multiple frames actually gives you a frame duration that's stable and exact (e.g. 16.667 or 8.333 milliseconds despite the low-resolution inputs).
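Roughly (untested sketch):

```typescript
// Sketch: average raw requestAnimationFrame deltas over a window so the
// quantized/jittery inputs converge on the true frame duration
// (~16.667 ms at 60 Hz).
const WINDOW = 120;
const deltas: number[] = [];
let last = 0;

function onFrame(now: number): void {
  if (last !== 0) {
    deltas.push(now - last);
    if (deltas.length > WINDOW) deltas.shift();
  }
  last = now;
  const avg = deltas.length
    ? deltas.reduce((a, b) => a + b, 0) / deltas.length
    : 1000 / 60;
  // use `avg` as the frame duration instead of the raw, jittery delta
  requestAnimationFrame(onFrame);
}
requestAnimationFrame(onFrame);
```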

Also, surprise: there are no non-jittery time sources on native platforms either (for measuring frame duration at least) - you also need to run a noise-removal filter over the measured frame duration in native games. Even the 'exact' presentation timestamps from DXGI or MTLDrawable have very significant (up to a millisecond) jitter.

> - the limitation of a single rendering context and that you must use the JS main thread to all those rendering calls (so no background async for you..)

OffscreenCanvas allows rendering to be performed in a worker thread: https://web.dev/articles/offscreen-canvas
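The basic handoff (sketch, worker file name made up):

```typescript
// main thread: transfer control of the canvas to a worker so all WebGL
// calls happen off the main thread.
const canvas = document.querySelector("canvas")!;
const offscreen = canvas.transferControlToOffscreen();
const worker = new Worker("render-worker.js");
worker.postMessage({ canvas: offscreen }, [offscreen]);

// render-worker.ts (sketch):
// self.onmessage = (e) => {
//   const gl = e.data.canvas.getContext("webgl2");
//   const frame = () => { /* draw with gl */ requestAnimationFrame(frame); };
//   requestAnimationFrame(frame);
// };
```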

  • samiv 6 months ago

    I didn't mean just WASM -> JS but the WebGL API call overhead, which includes marshalling the call from the WASM runtime across multiple layers and processes inside the browser.

    Win32 performance counter has native resolution < 1us

    OffscreenCanvas is something I haven't actually come across before. Looks interesting, but I already expect that the API is either brain damaged or intentionally nerfed for security reasons (or both). Anyway, I'll look into it, so thanks for that!

    • flohofwoe 6 months ago

      > Win32 performance counter has native resolution < 1us

      Yes, but that's hardly useful for things like measuring frame duration when the OS scheduler runs your per-frame code a millisecond late or early, or generally preempts your thread in the middle of your timing code (e.g. measuring durations precisely is also a non-trivial problem on native platforms, even with high-precision time sources).

      • MindSpunk 6 months ago

        Almost every game for the last 25 years has used those Win32 performance counters, or the platform's nearest equivalent (it's just a wrapper over some CPU instructions), to measure frame times. It's the highest-resolution clock in the system; it's a performance counter. You're supposed to use it for that.

        If you want to correlate the timestamps with wall time then good luck, but if you just need to know how many nanoseconds elapsed between two points in the program on a single thread then that’s your tool.

        • flohofwoe 6 months ago

          Almost all games also had subtle microstutter until some Croteam peeps actually looked into it and figured out what's wrong with naive frame timing on modern operating systems: https://medium.com/@alen.ladavac/the-elusive-frame-timing-16...

          TL;DR: the precision of your time source won't matter much since thread scheduling gets in the way; one general solution is to apply some sort of noise filter to remove the jitter
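
          One simple form such a filter can take (illustrative only, not the article's exact method) is snapping the measured delta to the nearest multiple of the display's refresh interval:

          ```typescript
          // Illustrative jitter filter: assume the real frame time is a whole
          // number of vsync intervals and snap the noisy measurement to the
          // nearest one.
          function filterDelta(rawMs: number, refreshMs: number = 1000 / 60): number {
            const frames = Math.max(1, Math.round(rawMs / refreshMs));
            return frames * refreshMs;
          }

          // filterDelta(17.3) -> 16.666..., filterDelta(33.1) -> 33.333...
          ```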