Comment by pornel

It depends on how you actually use the messages. Zero-copy can end up slowing things down. Copying within L1 cache is ~free, but operating on needlessly dynamic or suboptimal data structures adds overhead everywhere they're used.

To literally avoid any copying, you'd have to use the messages directly in their on-the-wire format as your in-memory data representation. If you have to read them many times, the extra cost of dynamic getters can add up: the format may cost you extra pointer chasing, unnecessary dynamic offsets, redundant validation checks, and conditional fallbacks for defaults, even if the wire format is relatively static and uncompressed. It can also be limiting, especially if you need to mutate variable-length data (serializing is easy only when you're just appending).
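To make the getter overhead concrete, here's a rough sketch of a zero-copy view over a made-up wire layout. The layout, `MessageView`, `encode_message`, and the default value are all hypothetical and not taken from any real serialization framework. The point is only that every read repeats the bounds checks, the offset lookup, and the default fallback.

```python
import struct

# Hypothetical little-endian layout, invented for illustration:
#   byte 0     flags (bit 0 set = optional u32 `count` is present at offset 5)
#   bytes 1-2  u16 offset of `name` from the start of the message
#   bytes 3-4  u16 length of `name`
HAS_COUNT = 0x01
DEFAULT_COUNT = 1
HEADER_SIZE = 5


def encode_message(name, count=None):
    """Build a message by appending parts, which is the easy direction."""
    flags = HAS_COUNT if count is not None else 0
    body = struct.pack("<I", count) if count is not None else b""
    name_offset = HEADER_SIZE + len(body)
    return struct.pack("<BHH", flags, name_offset, len(name)) + body + name


class MessageView:
    """Reads fields straight out of the received buffer; nothing is decoded up front."""

    def __init__(self, buf):
        self._buf = memoryview(buf)

    @property
    def count(self):
        # Validation and the default fallback run again on every access.
        if len(self._buf) < HEADER_SIZE:
            raise ValueError("truncated header")
        if not self._buf[0] & HAS_COUNT:
            return DEFAULT_COUNT
        if len(self._buf) < HEADER_SIZE + 4:
            raise ValueError("truncated count")
        return struct.unpack_from("<I", self._buf, HEADER_SIZE)[0]

    @property
    def name(self):
        # The offset is dynamic, so it has to be looked up and checked each time.
        if len(self._buf) < HEADER_SIZE:
            raise ValueError("truncated header")
        offset, length = struct.unpack_from("<HH", self._buf, 1)
        if offset + length > len(self._buf):
            raise ValueError("name out of bounds")
        return self._buf[offset:offset + length]  # memoryview slice, no copy


view = MessageView(encode_message(b"sensor", count=7))
print(view.count, bytes(view.name))
```

Note there's no cheap way to make `name` longer in place here: anything stored after it would have to move and the offsets would need rewriting.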

In practice, you'll probably copy data once from your preferred in-memory data structures to the messages when constructing them. When you need to read messages multiple times at the receiving end, or merge with some other data, you'll probably copy them into dedicated native data structs too.
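A minimal sketch of that one-copy approach, again with an invented layout (`Reading`, `encode`, and `decode` are hypothetical names): validate and copy once at the boundary, then work with an ordinary native structure that is cheap to read repeatedly and easy to mutate.

```python
import struct
from dataclasses import dataclass, field

# Hypothetical little-endian layout, invented for illustration:
#   u32 sensor_id, u16 sample count, then that many f64 samples.
HEADER = struct.Struct("<IH")


@dataclass
class Reading:
    sensor_id: int
    samples: list = field(default_factory=list)


def encode(reading):
    """Copy once from the native struct into the wire format."""
    n = len(reading.samples)
    return HEADER.pack(reading.sensor_id, n) + struct.pack(f"<{n}d", *reading.samples)


def decode(buf):
    """Validate once and copy once into a native struct."""
    if len(buf) < HEADER.size:
        raise ValueError("truncated header")
    sensor_id, n = HEADER.unpack_from(buf, 0)
    if len(buf) < HEADER.size + 8 * n:
        raise ValueError("truncated samples")
    samples = list(struct.unpack_from(f"<{n}d", buf, HEADER.size))
    return Reading(sensor_id, samples)


reading = decode(encode(Reading(7, [1.5, 2.5])))
reading.samples.append(3.0)  # mutating variable-length data is trivial now
print(reading)
```

After `decode` returns, nothing in the rest of the program needs to know what the wire format looked like.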

If you change the problem from zero-copy to one-copy, it opens up many other possibilities for optimization of (de)serialization, and doesn't keep your program tightly coupled to the serialization framework.