Comment by martinald
The problem is I think that they are using moonlight which is "designed" to stream games at very low latency. I very much doubt that people need <30ms response times watching an agent terminal or whatever they are showing!
When you try and use h264 et al at low latency you have to get rid of a lot of optimisations to encode it as quickly as possible. I also highly suspect the vaapi encoder is not very good esp at low bitrates.
I _think_ moonlight also forces CBR instead of VBR, which is pretty awful for this use case - imagine you have 9 seconds of 'nothing changing' and then the window moves for 0.25 seconds. If you had VBR the encoder could basically send ~0kbit/sec apart from control metadata, and then spike the bitrate up when the window moved (for brevity I'm simplifying here, it's more complicated than this but hopefully you get the idea).
Basically they've used the wrong software entirely. They should try and look at xrdp with x264 as a start.
Yeah, i think the author has been caught out by the fact that there simply isn’t a canonical way to encode h264.
JPEG is nice and simple, most encoders will produce (more or less) the same result for any given quality settings. The standard tells you exactly how to compress the image. Some encoders (like mozjpeg) use a few non-standard tricks to produce 5-20% better compression, but it’s essentially just a clever lossy preprocessing pass.
With h264, the standard essentially just says how decompressors should work, and it’s up to the individual encoders to work out to make best use of the available functionality for their intended use case. I’m not sure any encoder uses the full functionality (x264 refuses to use arbitrary frame order without b-frames, and I haven’t found an encoder that takes advantage of that). Which means the output of different encoders has wildly different results.
I’m guessing moonlight makes the assumption that most of its compression will come from motion prediction, and then takes massive shortcuts when encoding iframes.