lewq 10 hours ago

Hi, author of the post here. Just fixed up some formatting issues from when we copied it into Substack, sorry about that. Yeah, I used Opus 4.5 to help me write it (and it actually made me laugh!). But the struggle was real. Something I didn't make clear enough in the post is that JPEG works because each screenshot is taken exactly when it's requested, whereas streaming video pushes frames at a fixed rate. The client driving the frame rate is exactly what keeps frames from queueing. Yes, I wish we could UDP in enterprise networks too, but we can't. The problem actually isn't opening the UDP port, it's hosting UDP on their Kubernetes cluster. "You want to what?? We have ingress. For HTTPS."
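
To make that concrete, the viewer side is basically this pull loop (a rough sketch; the endpoint name is made up):

    // Client-driven JPEG pull: the server captures a screenshot at the moment
    // each request arrives, so the round trip sets the effective frame rate.
    async function pullFrames(canvas: HTMLCanvasElement): Promise<never> {
      const ctx = canvas.getContext("2d")!;
      for (;;) {
        const res = await fetch("/screenshot"); // illustrative endpoint
        const bitmap = await createImageBitmap(await res.blob());
        ctx.drawImage(bitmap, 0, 0);
        bitmap.close();
        // Nothing is requested until the previous frame is drawn, so frames
        // can never pile up in a queue.
      }
    }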

Join our Discord for the private beta in January! https://discord.gg/VJftd844GE

(This post written by human)

kixelated 7 hours ago

Hey lewq, 40 Mb/s is an absolutely ridiculous bitrate. For context, Twitch maxes out around 8.5 Mb/s for 1440p60. Your encoder was poorly configured, that's it. Also, it sounds like your mostly static content would greatly benefit from VBR; you could get the bitrate down to 1 Mb/s or so for screen sharing.
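
Something like this WebCodecs config would be a starting point (a sketch; the numbers are illustrative, not tuned values):

    // VBR-leaning encoder config for mostly-static screen content.
    const encoder = new VideoEncoder({
      output: (chunk) => { /* hand the chunk to your transport */ },
      error: (e) => console.error(e),
    });
    encoder.configure({
      codec: "avc1.640028",  // H.264 High; use whatever the decoder supports
      width: 2560,
      height: 1440,
      bitrate: 1_000_000,    // ~1 Mb/s target; VBR lets static scenes dip far below it
      bitrateMode: "variable",
      framerate: 30,
      latencyMode: "realtime",
    });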

And yeah, the usual approach is to adapt your bitrate to network conditions, but it's also common to adjust the frame rate; there's actually no requirement for a fixed frame rate with video codecs. You could also do the same "encode on demand" approach with a codec like H.264, provided you're okay with it being low FPS on high-RTT connections (poor Australians).
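
A sketch of what I mean, where the viewer acks each frame and the capture hook is illustrative:

    // Encode-on-demand with a real codec: only encode once the viewer has
    // acked the previous frame, so RTT sets the frame rate.
    function pumpOnDemand(
      ack: WebSocket,                       // viewer sends a message per drawn frame
      nextFrame: () => Promise<VideoFrame>, // illustrative capture hook
      encoder: VideoEncoder,
    ) {
      let viewerReady = true;
      ack.onmessage = () => { viewerReady = true; };
      (async () => {
        for (;;) {
          const frame = await nextFrame();
          if (viewerReady) {
            viewerReady = false;
            encoder.encode(frame); // a cheap delta frame, not a fresh JPEG
          }
          frame.close();           // release skipped frames immediately
        }
      })();
    }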

Overall, using keyframes only is a very bad idea. It's how low-quality animated GIFs used to work before they were secretly replaced with video files. Video codecs are extremely efficient because of delta encoding.
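
In WebCodecs terms the whole distinction is one flag (sketch):

    // keyframesOnly === true is the animated-GIF regime: every frame stands
    // alone. The default lets the encoder emit deltas of only what changed.
    function encodeFrame(encoder: VideoEncoder, frame: VideoFrame, keyframesOnly: boolean) {
      encoder.encode(frame, { keyFrame: keyframesOnly });
    }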

But I totally agree with ditching WebRTC. WebSockets + WebCodecs is fine provided you have a plan for bufferbloat (e.g. adaptive bitrate (ABR), GoP skipping).
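
For example, a minimal GoP-skipping sketch (the threshold is made up):

    // If the socket's send buffer backs up, drop delta frames until the next
    // keyframe arrives, i.e. skip the rest of the GoP.
    const MAX_BUFFERED = 256 * 1024; // illustrative threshold
    let skippingGop = false;

    function forwardChunk(ws: WebSocket, chunk: EncodedVideoChunk) {
      if (ws.bufferedAmount > MAX_BUFFERED) skippingGop = true;
      if (skippingGop && chunk.type !== "key") return; // drop until a keyframe
      skippingGop = false;
      const buf = new ArrayBuffer(chunk.byteLength);
      chunk.copyTo(buf);
      ws.send(buf);
    }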

Dylan16807 8 hours ago

> Something I didn't make clear enough in the post is that JPEG works because each screenshot is taken exactly when it's requested, whereas streaming video pushes frames at a fixed rate. The client driving the frame rate is exactly what keeps frames from queueing.

I understand that logic but I don't really agree with it. Very aggressive bitrate controls can do a lot to keep that buffer tiny while still looking better than JPEG, and if it bloats beyond 1-2 seconds you can reset. A reset like that wouldn't look notably worse than JPEG mode always looks.
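
Rough sketch of that reset, assuming the backlog is visible on the socket (the two-second math is illustrative):

    // If the unsent backlog exceeds ~2 seconds at the target bitrate, throw
    // away queued work and restart from a keyframe instead of growing latency.
    function resetIfBloated(ws: WebSocket, encoder: VideoEncoder, config: VideoEncoderConfig): boolean {
      const twoSecondsOfBytes = ((config.bitrate ?? 1_000_000) / 8) * 2;
      if (ws.bufferedAmount > twoSecondsOfBytes) {
        encoder.reset();           // drops anything queued inside the encoder
        encoder.configure(config); // reset() requires reconfiguring
        return true;               // caller should force a keyframe on the next encode
      }
      return false;
    }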

If you use a video encoder that gives you good insight into what it's doing you could guarantee that the buffer never gets bigger than 1-2 JPEGs by dynamically deciding when to add frames. That would give you the huge benefits of P-frames with no downside.
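
Something like this, with the budget expressed in JPEG-sized frames (numbers illustrative):

    // Only hand the encoder a new frame when the unsent backlog is under a
    // budget of roughly two JPEG-sized frames; otherwise drop the frame.
    const BUDGET = 2 * 100 * 1024; // ~two 100 KB JPEGs, illustrative

    function maybeEncode(ws: WebSocket, encoder: VideoEncoder, frame: VideoFrame) {
      if (ws.bufferedAmount < BUDGET) {
        encoder.encode(frame); // P-frame: cheap because most pixels are unchanged
      }
      frame.close();           // over budget: skip the frame rather than queue it
    }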