Comment by pzmarzly 2 days ago

> With Protobuf, that’s impossible.

Unless your servers and clients deploy at different times, and are thus compiled against different versions of your spec, in which case many safety bets are off.

There are ways to be mostly safe (never reuse IDs, use unknown-field-friendly copying methods, etc.), but distributed systems are distributed systems, and protobuf isn't a silver bullet that can solve all problems on author's list.
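One of those mostly-safe habits, never reusing the IDs of deleted fields, can be enforced directly in the `.proto` file with the `reserved` keyword. A minimal sketch (the message and field names are made up):

```protobuf
syntax = "proto3";

message User {
  // Fields 2 ("nickname") and 5 were deleted in an earlier revision.
  // Reserving their IDs and names makes protoc reject any future
  // attempt to reuse them with a different meaning.
  reserved 2, 5;
  reserved "nickname";

  string name  = 1;
  string email = 3;
}
```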

On the upside, it seems like protobuf3 fixed a lot of stuff I used to hate about protobuf2. Issues like:

> if the field is not a message, it has two states:

> - ...

> - the field is set to the default (zero) value. It will not be serialized to the wire. In fact, you cannot determine whether the default (zero) value was set or parsed from the wire or not provided at all

are now gone if you stick to protobuf3 + the `message` keyword. That's really cool.
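A hedged sketch of what that fix looks like (the message and field names are hypothetical): in proto3, wrapping a scalar in a message type, or using the `optional` keyword (supported since protoc 3.15), restores explicit presence tracking, so "set to zero" and "never set" become distinguishable:

```protobuf
syntax = "proto3";
import "google/protobuf/wrappers.proto";

message Settings {
  int32 timeout = 1;  // no presence: 0 and "never set" are indistinguishable

  // Message-typed field: generated code exposes a has_... check.
  google.protobuf.Int32Value max_retries = 2;

  // proto3 `optional` (protoc 3.15+) also tracks presence for scalars.
  optional int32 port = 3;
}
```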

connicpu 2 days ago

Regardless of whether you use JSON or Protobuf, the only way to be safe from version tears in your serialization format is to enforce backwards compatibility in your CI pipeline: test that the new version of your service produces responses that are usable by older versions of your clients, and vice versa.
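A minimal sketch of such a compatibility check in Python (the schemas and field names are hypothetical): the CI job feeds a response from the new service version through the old client's parser and asserts the old client still gets what it needs.

```python
import json

# Hypothetical "old client" parser: it knows only the v1 fields and must
# tolerate fields it has never seen (forward compatibility).
def parse_v1(raw: str) -> dict:
    data = json.loads(raw)
    # Required v1 fields: fail loudly if the new server dropped one.
    if "id" not in data or "name" not in data:
        raise ValueError("v1-required field missing")
    # Keep only what v1 understands; silently ignore newer fields.
    return {"id": data["id"], "name": data["name"]}

# "New server" response adds a field the old client has never seen.
new_response = json.dumps(
    {"id": 7, "name": "alice", "avatar_url": "https://example.com/a.png"}
)

# The check a CI job would run against every released client version:
assert parse_v1(new_response) == {"id": 7, "name": "alice"}
```

The reverse direction (new client, old server response) would be a second set of fixtures run through the new parser.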

  • sevensor a day ago

    Yeah, no discussion of this topic is complete without bringing up schema evolution. There’s a school of thought that holds this is basically impossible and the right thing to do is never ever make a breaking change. Instead, allow new fields to be absent and accept unrecognized fields always. I think this is unsustainable and hard to reason about.

    I have not found it difficult to support backwards compatibility with explicit versioning, and the only justification I’ve seen for not doing it is that it’s impossible to coordinate between development teams. Which I think is an indictment of how the company is run more than anything else.
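A minimal sketch of explicit versioning in Python (the `version` field and payload shapes are made up): the parser dispatches on a declared schema version instead of guessing from which fields happen to be present.

```python
import json

def parse_request(raw: str) -> dict:
    data = json.loads(raw)
    version = data.get("version", 1)  # messages without a version are treated as v1
    if version == 1:
        # v1 had no configurable limit, so apply the old fixed default.
        return {"user": data["user"], "limit": 10}
    if version == 2:
        # v2 made the limit an explicit, required field.
        return {"user": data["user"], "limit": data["limit"]}
    raise ValueError(f"unsupported schema version {version}")

assert parse_request('{"user": "bob"}') == {"user": "bob", "limit": 10}
assert parse_request('{"version": 2, "user": "bob", "limit": 50}')["limit"] == 50
```

The failure mode is also explicit: an unknown version is an error, not a silent best-effort parse.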

brabel 2 days ago

No type system survives going through a network.

  • dhussoe 2 days ago

    yes, but any sane JSON parsing library (Rust Serde, kotlinx-serialization, Swift, etc.) will raise an error when you have the wrong type or are missing a required field. and any JSON parsing callsite is very likely also an IO callsite so you need to handle errors there anyways, all IO can fail. then you log it or recover or whatever you do when IO fails in some other way in that situation.

    this seems like a problem only if you use JSON.parse or json.loads etc. and then just cross your fingers and hope that the types are correct, basically doing the silent equivalent of casting an "any" type to some structure that you assume is correct, rather than strictly parsing (parse, don't validate) into a typed structure before handing that off to other code.
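A small Python sketch of that distinction (the `User` type and field names are hypothetical): instead of trusting the output of `json.loads` and casting, parse it into a typed structure up front, so the rest of the code never touches untyped data.

```python
import json
from dataclasses import dataclass

@dataclass(frozen=True)
class User:
    id: int
    name: str

def parse_user(raw: str) -> User:
    data = json.loads(raw)  # may raise json.JSONDecodeError; IO-adjacent, handle it
    if not isinstance(data.get("id"), int):
        raise TypeError("id must be an int")
    if not isinstance(data.get("name"), str):
        raise TypeError("name must be a str")
    # Downstream code only ever sees a User, never a raw dict.
    return User(id=data["id"], name=data["name"])

user = parse_user('{"id": 1, "name": "alice"}')
assert user == User(id=1, name="alice")
```

The "cross your fingers" equivalent would be `data = json.loads(raw)` followed by `data["name"]` accesses scattered through the codebase, where a missing or mistyped field blows up far from the IO boundary.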

    • koakuma-chan 2 days ago

      > strictly parsing (parse, don't validate)

      That's called validating? Zod is a validation library.

      But yeah, people really need to start strictly parsing/validating their data. One time I had an interview and I was told yOu DoN'T tRuSt YoUr BaCkeNd?!?!?!?

      • dhussoe 2 days ago

        "parse don't validate" is from: https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...

        looking at zod (assuming https://zod.dev) it is a parsing library by that definition — which isn't, like, an official definition or anything, one person on the internet came up with it, but I think it is good at getting the principle across

        under these definitions a "parser" takes some input and returns either some valid output (generally a more specific type, like String -> URL) or an error, whereas a "validator" just takes that input and returns a boolean or throws an error or whatever makes sense in the language.

eta: probably part of the distinction here is that since zod is a JS library the actual implementation can be a "validator" and then the original parsed JSON input can just be returned with a different type. "parse don't validate" is (IMO) more popular in languages like Rust, where you would already need to parse the JSON from the original bytes into a language-native structure, or into some "JSON" type like https://docs.rs/serde_json/latest/serde_json/enum.Value.html that is generally awkward for application code (nudging you onto the happy parsing path).
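Those two shapes can be sketched in a few lines of Python with the standard library's `urlparse` (the function names here are made up):

```python
from urllib.parse import urlparse, ParseResult

# Validator: answers yes/no; the caller is still holding an untyped string.
def is_valid_url(s: str) -> bool:
    p = urlparse(s)
    return bool(p.scheme and p.netloc)

# Parser: returns a more specific type or fails, i.e. str -> ParseResult.
def parse_url(s: str) -> ParseResult:
    p = urlparse(s)
    if not (p.scheme and p.netloc):
        raise ValueError(f"not a URL: {s!r}")
    return p  # downstream code can rely on .scheme, .netloc, etc.

assert is_valid_url("https://example.com/x")
assert parse_url("https://example.com/x").netloc == "example.com"
```

After `is_valid_url` returns `True` the knowledge is thrown away and every later consumer must trust (or re-check) the string; after `parse_url` succeeds, the type itself carries the proof.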