Comment by avianlyric

Comment by avianlyric a day ago

You’ve kind of answered your own question here.

> If you push that conditional out to the caller, now every caller of your function has to individually make sure they call it in the right way to get a guarantee of idempotency

In this situation your function is no longer idempotent, so you obviously can’t provide the guarantee. But quite frankly, if you’re having to resort to making individual functions implement state management to provide idempotency, then I suspect you’re doing something very odd, and have way too much logic happening inside a single function.

Idempotent code tends to fall into two distinct camps:

1. Code that’s inherently idempotent because the data model and operations being performed are inherently idempotent. I.e. you’re either performing stateless operations, or you’re performing “PUT” style operations wherein the input data contains all the state that needs to be written.

2. Code that’s performing more complex business operations where you’re creating an idempotent abstraction by either performing rollbacks, or providing some kind of atomic apply abstraction that ensures partial failures don’t result in corrupted state.

For point 1, you shouldn’t be checking for order of operations, because it doesn’t matter. Everything is inherently idempotent, just perform the operations again.
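A minimal sketch of point 1, assuming a hypothetical in-memory `store` (all names here are illustrative, not from any real library). The PUT-style write can be repeated freely because the input carries the full state, while the increment cannot:

```python
# Hypothetical in-memory store; names are illustrative only.
store: dict[str, dict] = {}


def put_profile(user_id: str, profile: dict) -> None:
    """PUT-style write: the input contains all the state to be written."""
    store[user_id] = dict(profile)


def increment_visits(user_id: str) -> None:
    """Not idempotent: the result depends on the state left by earlier calls."""
    store[user_id]["visits"] = store[user_id].get("visits", 0) + 1


put_profile("u1", {"name": "Ada", "visits": 0})
put_profile("u1", {"name": "Ada", "visits": 0})  # a retry leaves the state unchanged
after_puts = dict(store["u1"])

increment_visits("u1")
increment_visits("u1")  # a retry changes the result
after_increments = dict(store["u1"])
```

No "did this already run?" check is needed around `put_profile`; running it again is the recovery strategy.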

For point 2, there is no simple abstraction you can apply. You need to have something record the desired operation, then ensure it either completes or fails. And once that happens, it needs to ensure that the completion or failure is persisted permanently. But that kind of logic is not the kind of thing you put into a function and compose with other operations.

shawnz a day ago

Consider a simple example where you're checking if a file exists, or a database object exists, and creating it if not. Imagine your filesystem or database library either doesn't have an upsert function to do this for you, or else you can't use it because you want some special behaviour for new records (like writing the current timestamp or a running total, or adding an entry to a log file, or something). I think this is a simple, common example where you would want to combine a conditional with an action. I don't think it's very "odd" or indicative of "way too much logic".

  • avianlyric a day ago

    > a database object exists, and creating it if not. Imagine your filesystem or database library either doesn't have an upsert function to do this for you, or else you can't use it because you want some special behaviour for new records (like writing the current timestamp or a running total, or adding an entry to a log file, or something).

    This is why databases have transactions.

    > simple example where you're checking if a file exists

    Personally I avoid interacting directly with the filesystem like the plague due to issues exactly like this. Working with a filesystem correctly is way harder than people think it is, and handling all the edge-cases is unbelievably difficult. If I'm building a production system where correctness is important, then I use abstractions like databases to make sure I don't have to deal with filesystem nuances myself.

    • shawnz a day ago

      Sure, I agree that a transaction should be used here (in the database example at least). But that's orthogonal to my point, or maybe even in favour of it: doesn't a transaction necessitate keeping the conditional close to the effect? It's a perfect example of what I'm trying to say: how do you make sure the conditional happens in the same transaction as the effect, while simultaneously trying to push the conditional towards the root of the code and away from the effect? Transaction boundaries are exactly the kind of thing that makes pushing up the conditionals difficult.

      • avianlyric 13 hours ago

        By pushing up the transaction boundary. The only reason why the conditional is important is because it's part of a larger sequence of operations that you want to complete in an atomic fashion.

        Your transaction needs to encompass all of those operations, not just some of them.
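One way to read "pushing up the transaction boundary", sketched with Python's standard `sqlite3` module. The `users` and `audit_log` tables and the `ensure_user`, `log_event`, and `sign_up` helpers are hypothetical. The conditional stays next to its effect inside `ensure_user`, but neither helper commits; the caller wraps the whole sequence in one transaction:

```python
import sqlite3


def ensure_user(conn: sqlite3.Connection, name: str) -> tuple[int, bool]:
    """Create the user if missing. Does not commit; the caller owns the transaction."""
    row = conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchone()
    if row is not None:
        return row[0], False
    cur = conn.execute(
        "INSERT INTO users (name, created_at) VALUES (?, datetime('now'))", (name,)
    )
    return cur.lastrowid, True


def log_event(conn: sqlite3.Connection, user_id: int, message: str) -> None:
    """Append a log row. Also leaves committing to the caller."""
    conn.execute("INSERT INTO audit_log (user_id, message) VALUES (?, ?)", (user_id, message))


def sign_up(conn: sqlite3.Connection, name: str) -> int:
    """The transaction boundary lives here, around the whole sequence."""
    with conn:  # commits if everything succeeds, rolls back everything otherwise
        user_id, created = ensure_user(conn, name)
        if created:
            log_event(conn, user_id, "user created")
    return user_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL, created_at TEXT NOT NULL)"
)
conn.execute("CREATE TABLE audit_log (user_id INTEGER NOT NULL, message TEXT NOT NULL)")

first_id = sign_up(conn, "ada")
second_id = sign_up(conn, "ada")  # a repeat call creates nothing new
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
log_count = conn.execute("SELECT COUNT(*) FROM audit_log").fetchone()[0]
```

If `log_event` raised, the user insert would be rolled back with it, so the new-record timestamp and the log entry either both land or neither does.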

jknoepfler a day ago

Probably implicit in your #2, but there are two types of people in the world: people who know why you shouldn't try to write a production-grade database from scratch, and people who don't know why you shouldn't try to write a production-grade database from scratch. Neither group should try to write a production-grade database from scratch.