Comment by skybrian 11 hours ago

A fundamental limitation is that static analysis can only enforce guarantees when the compiler can see and understand all the code at the same time.

It's great when you compile large amounts of code written in the same language into a single binary.

It's not so great for calls between languages or for servers communicating across a network. To get static guarantees back again, you often need to validate your inputs, for example using something like Zod for TypeScript. And then it's not a static guarantee anymore; it's a runtime error.
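A minimal TypeScript sketch of that trade-off, hand-rolled rather than using Zod so it stays self-contained. The `Reading` type and the `isReading` and `decodeReading` functions are hypothetical names chosen for illustration:

```typescript
// Hypothetical message shape; the name and fields are illustrative only.
interface Reading {
  sensorId: string;
  celsius: number;
}

// Runtime check standing in for what a schema library would provide.
function isReading(value: unknown): value is Reading {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["sensorId"] === "string" && typeof v["celsius"] === "number";
}

// Boundary function: the compiler cannot see the sender, so the check
// runs when the data arrives, and a mismatch is a thrown error.
function decodeReading(json: string): Reading {
  const value: unknown = JSON.parse(json);
  if (!isReading(value)) {
    throw new TypeError("payload is not a Reading");
  }
  return value;
}

// An unchecked cast type-checks even though the data does not match.
const unchecked = JSON.parse('{"sensorId": 7}') as Reading;
```

Past `decodeReading`, the rest of the program can rely on the `Reading` type statically. The mismatch itself is only ever detected at runtime.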

Database tables often live on a different server than the server-side processes that access them, so in most systems it's possible to end up with a mix of schema versions that the compiler never saw together.

To prevent this, you would need some kind of monolithic release process. That runs into lifecycle issues, since data is often much longer-lived than the code that accesses it.

tines 9 hours ago

> you often need to validate your inputs, for example using something like Zod for TypeScript. And then it's not a static guarantee anymore; it's a runtime error.

True, but validating at the boundaries and having a safe core is much better than having the unsafe portion everywhere imo.

  • skybrian 7 hours ago

    It depends on the system. Some servers just don't do very much. If a server validates the input and then just sends it on without doing any calculations, there's very little to go wrong that static analysis can warn you about.

    And then the next server in line has to validate the data again.

    • alpinisme 6 hours ago

      Most languages have a way to represent a blob of bytes whose internal shape or meaning you don’t care about. The point of parsing is to validate the stuff that you want to use. To use the Zod example from up-thread, you can use z.unknown() or use z.looseObject() if you care about some keys but not others (while wanting to propagate the whole object).
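
      A rough hand-written sketch of the "care about some keys but not others" case, without Zod so it runs on its own. `Envelope`, `parseEnvelope`, and `forward` are made-up names, and the `route` field is an assumption for the example:

      ```typescript
      // Hypothetical envelope: only `route` matters to this hop.
      type Envelope = { route: string } & Record<string, unknown>;

      function parseEnvelope(json: string): Envelope {
        const value: unknown = JSON.parse(json);
        if (typeof value !== "object" || value === null || Array.isArray(value)) {
          throw new TypeError("expected a JSON object");
        }
        const obj = value as Record<string, unknown>;
        const route = obj["route"];
        if (typeof route !== "string") {
          throw new TypeError("missing string field: route");
        }
        // Every other key is kept as-is and never inspected.
        return { ...obj, route };
      }

      // Forwarding re-serializes the whole object, unknown keys included.
      function forward(env: Envelope): string {
        return JSON.stringify(env);
      }
      ```

      The intermediate server checks the one field it routes on and passes the rest through untouched, so it does not need to know the full schema.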

      • skybrian an hour ago

        Yep. Although, sometimes checking the data is the point. It will depend on whether you want to catch errors early.

IshKebab 9 hours ago

Yes, you have to validate untrusted input at runtime. Seems a bit odd to call that a fundamental limitation of static types.

  • skybrian 7 hours ago

    Another way to put it is that in some programs, there's relatively little for static analysis to do because the program isn't doing much internal calculation, but there are a lot of potential I/O errors that it won't be able to catch. Programming embedded devices is often like that.

bitwize 5 hours ago

Parse, don't validate. Alexis King's article by that title should be fundamental reading for all distributed application developers. Go directly from on-the-wire representation to data structure with a well-defined type, and error if you can't. Use the type system to describe all representable objects. While this results in runtime errors on parse failure on the receiver's side, an advantage is that if the sender uses those same type definitions, you can guarantee that it will always send valid data.
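
One way this can look in TypeScript, as a sketch only. The `Order` type, `ParseResult`, `parseOrder`, and `encodeOrder` are hypothetical names, and the non-empty `items` rule is an assumed invariant for the example:

```typescript
// Hypothetical wire type shared by sender and receiver.
interface Order {
  id: string;
  // Non-empty by construction: the tuple type requires a first element.
  items: [string, ...string[]];
}

type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

// Receiver side: wire text goes straight to an Order, or to an error.
function parseOrder(wire: string): ParseResult<Order> {
  let raw: unknown;
  try {
    raw = JSON.parse(wire);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, error: "expected an object" };
  }
  const { id, items } = raw as Record<string, unknown>;
  if (typeof id !== "string") {
    return { ok: false, error: "id must be a string" };
  }
  if (!Array.isArray(items) || !items.every((x): x is string => typeof x === "string")) {
    return { ok: false, error: "items must be an array of strings" };
  }
  const [first, ...rest] = items;
  if (first === undefined) {
    return { ok: false, error: "items must not be empty" };
  }
  return { ok: true, value: { id, items: [first, ...rest] } };
}

// Sender side: accepts only an Order, so a literal with an empty
// item list is rejected by the compiler rather than sent.
function encodeOrder(order: Order): string {
  return JSON.stringify(order);
}
```

Code downstream of `parseOrder` never has to re-check that `items` is non-empty, because the type already says so. The sender-side guarantee only holds while both ends are built against the same version of the type definition.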

Schemas should be derived from the type of the representation used in the application code, or vice-versa. If there are multiple schemas in use, there should be multiple corresponding types, perhaps with ways to interconvert between them. Schema changes will be less frequent, and easier to manage, with a bit of upfront systems analysis and design, including use of a data dictionary or complete IRM repository.
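
A small sketch of "multiple schemas, multiple corresponding types" with an explicit conversion between them. The `UserV1` and `UserV2` shapes and the name-splitting rule are assumptions invented for the example:

```typescript
// Hypothetical schema versions; field names are illustrative only.
interface UserV1 {
  version: 1;
  name: string;
}

interface UserV2 {
  version: 2;
  givenName: string;
  familyName: string;
}

type AnyUser = UserV1 | UserV2;

// Explicit conversion from the old schema to the new one. The split rule
// is an assumption for this example, not a general-purpose name parser.
function upgrade(user: UserV1): UserV2 {
  const [givenName = "", ...rest] = user.name.trim().split(/\s+/);
  return { version: 2, givenName, familyName: rest.join(" ") };
}

// Application code works on one type; older records are converted on entry.
function toCurrent(user: AnyUser): UserV2 {
  switch (user.version) {
    case 1:
      return upgrade(user);
    case 2:
      return user;
  }
}
```

Because `version` is a discriminant, adding a `UserV3` to `AnyUser` makes `toCurrent` fail to compile until the new case is handled.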

The more time you invest in thinking and planning what you want to build ahead of time, the fewer bugs and changes you'll have to worry about at runtime. The payoff is almost always more than worth the investment; "we don't have time to do things the right way" implies you have plenty of time to do things the wrong way.