Comment by HelloUsername 18 hours ago
> Try 1Mbps and iterate from there.
From the article:
“Just lower the bitrate,” you say. Great idea. Now it’s 10Mbps of blocky garbage that’s still 30 seconds behind.
Yeah, I think the author has been caught out by the fact that there simply isn't a canonical way to encode h264.
JPEG is nice and simple: most encoders will produce (more or less) the same result for any given quality setting. The standard tells you exactly how to compress the image. Some encoders (like mozjpeg) use a few non-standard tricks to get 5-20% better compression, but it's essentially just a clever lossy preprocessing pass.
With h264, the standard essentially only says how decoders should work, and it's up to the individual encoders to work out how to make best use of the available functionality for their intended use case. I'm not sure any encoder uses the full functionality (x264 refuses to use arbitrary frame order without B-frames, and I haven't found an encoder that takes advantage of that), which means the output of different encoders varies wildly.
I'm guessing Moonlight assumes that most of its compression will come from motion prediction, and then takes massive shortcuts when encoding I-frames.
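To make that concrete, here is a rough sketch of how far apart two encodes of the same source can land once you ask x264 for low latency (assumes ffmpeg with libx264 on PATH; "input.mp4" is just a hypothetical test clip):

    # Rough sketch: same clip, same CRF target, two libx264 configurations.
    import os
    import subprocess

    src = "input.mp4"  # hypothetical test clip

    # Offline-style encode: B-frames, lookahead, slow motion search.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
                    "-preset", "veryslow", "-crf", "23", "slow.mp4"], check=True)

    # Low-latency encode: ultrafast preset, zerolatency tune (no B-frames, no lookahead).
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
                    "-preset", "ultrafast", "-tune", "zerolatency",
                    "-crf", "23", "fast.mp4"], check=True)

    # Any spec-compliant decoder plays both, but size and quality differ dramatically.
    for f in ("slow.mp4", "fast.mp4"):
        print(f, os.path.getsize(f) // 1024, "KiB")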
Rejecting it out of hand isn't actually trying it.
10Mbps is still way too high of a minimum. It's more than YouTube uses for full motion 4k.
And it would not be blocky garbage, it would still look a lot better than JPEG.
1Mbps for video is a rule of thumb I use. Of course that will depend on customer expectations. 500Kbps can work, but it won't be pretty.
For normal video I think that's a good rule of thumb.
For mostly-static content at 4fps you can cut a bunch more bitrate corners before it looks bad. (And 2-3 JPEGs per second won't even look good at 1Mbps.)
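Quick back-of-the-envelope arithmetic on what a 1Mbps budget actually buys per frame:

    # Per-frame byte budget at 1 Mbps, for a few frame rates.
    BITRATE_BPS = 1_000_000  # 1 Mbps

    for label, fps in [("3 JPEGs/sec (polling)", 3),
                       ("4 fps video", 4),
                       ("30 fps video", 30)]:
        kib_per_frame = BITRATE_BPS / 8 / fps / 1024
        print(f"{label}: ~{kib_per_frame:.0f} KiB per frame")

    # ~41 KiB per JPEG is tight for a full desktop screenshot, whereas an
    # H.264 P-frame of a mostly static screen can get away with far less.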
>> 10Mbps is still way too high of a minimum. It's more than YouTube uses for full motion 4k.
> And 2-3 JPEGs per second won't even look good at 1Mbps.
Unqualified claims like these are utterly meaningless. It depends too much on exactly what you're doing; some sorts of images will compress much better than others.
YouTube 4K uses the VP9 and AV1 codecs, which are multiple generations ahead of H.264.
Proper rate control for such realtime streaming would also lower framerate and/or resolution to maintain the best quality and latency they can over dynamic network conditions and however little bandwidth they have. The fundamental issue is that they don't have this control loop at all, and are badly simulating it by polling JPEGs.
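Something like the sketch below, very roughly; the probes and the encoder.reconfigure() call are hypothetical stand-ins for whatever the streaming stack actually exposes, and the ladder numbers are made up:

    import time

    # (width, height, fps, target kbps), best first -- made-up quality ladder
    LADDER = [
        (1920, 1080, 30, 8000),
        (1280, 720, 30, 3000),
        (1280, 720, 15, 1500),
        (854, 480, 10, 600),
    ]

    def pick_rung(available_kbps):
        # Highest rung that still leaves ~20% headroom.
        for rung in LADDER:
            if rung[3] <= 0.8 * available_kbps:
                return rung
        return LADDER[-1]

    def control_loop(encoder, measure_throughput_kbps, measure_rtt_ms):
        current = None
        while True:
            rung = pick_rung(measure_throughput_kbps())
            if measure_rtt_ms() > 200:  # queues building up: step down once more
                rung = LADDER[min(LADDER.index(rung) + 1, len(LADDER) - 1)]
            if rung != current:
                encoder.reconfigure(*rung)  # hypothetical encoder API
                current = rung
            time.sleep(1.0)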
10Mbps is more than the maximum ingest bitrate allowed on Twitch. Granted, anyone who watches a recent game or an IRL stream there might tell you that it should go up to 12 or 15, but I don't think an LLM interface should have trouble. This feels like someone on a 4K monitor defeating themselves through their hedonic treadmill.
The problem, I think, is that they are using Moonlight, which is "designed" to stream games at very low latency. I very much doubt that people need <30ms response times to watch an agent terminal or whatever they are showing!
When you try to use h264 et al. at low latency you have to drop a lot of optimisations so you can encode as quickly as possible. I also strongly suspect the VAAPI encoder is not very good, especially at low bitrates.
I _think_ Moonlight also forces CBR instead of VBR, which is pretty awful for this use case: imagine nine seconds where nothing on screen changes and then the window moves for 0.25 seconds. With VBR the encoder could send essentially ~0kbit/sec apart from control metadata, then spike the bitrate up when the window moved (I'm simplifying for brevity; it's more complicated than this, but hopefully you get the idea).
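Roughly the difference, spelled out as ffmpeg/libx264 rate-control flags (the flags are real; the bitrates and filenames are made up):

    import subprocess

    src = "desktop_capture.mp4"  # hypothetical screen recording

    # CBR-ish: hold the rate even when nothing changes (filler data and all);
    # MPEG-TS output tolerates the CBR padding.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
                    "-b:v", "5M", "-minrate", "5M", "-maxrate", "5M",
                    "-bufsize", "2M", "-x264-params", "nal-hrd=cbr",
                    "cbr.ts"], check=True)

    # Capped VBR: spend almost nothing on static frames, burst when the window moves.
    subprocess.run(["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
                    "-crf", "23", "-maxrate", "8M", "-bufsize", "16M",
                    "vbr.mp4"], check=True)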
Basically, they've used the wrong software entirely. They should try looking at xrdp with x264 as a start.