Comment by shawnz a day ago

13 replies

Sometimes I like to put the conditional logic in the callee because it prevents the caller from doing things in the wrong order by accident.

Like for example, if you want to make an idempotent operation, you might first check if the thing has been done already and if not, then do it.

If you push that conditional out to the caller, now every caller of your function has to individually make sure they call it in the right way to get a guarantee of idempotency and you can't abstract that guarantee for them. How do you deal with that kind of thing when applying this philosophy?
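
To make the pattern concrete, here is a minimal Python sketch of a callee that owns its own "already done?" check. The `WelcomeMailer` class, its `send_welcome` method, and the in-memory `outbox` are hypothetical names invented for illustration, not anything from the article under discussion.

```python
class WelcomeMailer:
    """Hypothetical example: sends each user at most one welcome message."""

    def __init__(self):
        self.sent_to = set()   # user ids that already got a message
        self.outbox = []       # stands in for a real mail transport

    def send_welcome(self, user_id: str) -> bool:
        """Return True if a message was queued, False if it was a repeat."""
        # The conditional lives in the callee, so no caller can forget it
        # or run the check and the effect in the wrong order.
        if user_id in self.sent_to:
            return False
        self.outbox.append(f"Welcome, {user_id}!")
        self.sent_to.add(user_id)
        return True
```

Calling `send_welcome("u1")` twice queues one message; callers get that guarantee without writing any check themselves. Note that this in-memory version is not safe under concurrency or across restarts.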

Another example might be if you want to execute a sequence of checks before doing an operation within a database transaction. How do you apply this philosophy while keeping the checks within the transaction boundary?
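
A rough sketch of that transaction scenario, using Python's standard `sqlite3` module. The `accounts` table and the `make_db` and `withdraw` helpers are assumptions made up for this example. The point is only that the checks and the write sit between the same `BEGIN` and `COMMIT`, wherever that boundary ends up living in the call tree.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Hypothetical setup: one table, one row, explicit transaction control."""
    # isolation_level=None disables implicit transactions, so BEGIN/COMMIT
    # below are the only transaction boundaries.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
    return conn


def withdraw(conn: sqlite3.Connection, name: str, amount: int) -> None:
    """Run the checks and the update inside a single transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before checking
    try:
        row = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no such account: {name}")
        if row[0] < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, name),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # a failed check leaves the data untouched
        raise
```

If the checks were pulled out into a separate caller, that caller would also have to own the `BEGIN`/`COMMIT` pair, which is the tension being asked about.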

avianlyric a day ago

You’ve kind of answered your own question here.

> If you push that conditional out to the caller, now every caller of your function has to individually make sure they call it in the right way to get a guarantee of idempotency

In this situation your function is no longer idempotent, so you obviously can’t provide the guarantee. But quite frankly, if you’re having to resort to making individual functions implement state management to provide idempotency, then I suspect you’re doing something very odd, and have way too much logic happening inside a single function.

Idempotent code tends to fall into two distinct camps:

1. Code that’s inherently idempotent because the data model and operations being performed are inherently idempotent. I.e. you’re either performing stateless operations, or you’re performing “PUT” style operations wherein the input data contains all the state that needs to be written.

2. Code that’s performing more complex business operations where you’re creating an idempotent abstraction by either performing rollbacks, or providing some kind of atomic apply abstraction that ensures partial failures don’t result in corrupted state.

For point 1, you shouldn’t be checking for order of operations, because it doesn’t matter. Everything is inherently idempotent, just perform the operations again.

For point 2, there is no simple abstraction you can apply. You need to have something record the desired operation, then ensure it either completes or fails. And once that happens, ensure that completion or failure is persisted permanently. But that kind of logic is not the kind of thing you put into a function and compose with other operations.
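
One way to picture the two camps, as a simplified Python sketch. Everything here is hypothetical: `put_profile`, `OperationLog`, and the idea of keying operations by a caller-supplied id are illustrative assumptions, and the in-memory dict stands in for storage that would need to be durable in a real system.

```python
def put_profile(store: dict, user_id: str, profile: dict) -> None:
    """Camp 1: the input carries the full state, so repeating it is harmless."""
    store[user_id] = dict(profile)


class OperationLog:
    """Camp 2 (toy version): record each operation and its final outcome."""

    def __init__(self):
        # operation id -> ("ok", result) or ("failed", message).
        # Assumption: a real system would persist this durably.
        self._outcomes = {}

    def run_once(self, op_id: str, action):
        # A repeated request returns the recorded outcome instead of
        # running the action again.
        if op_id in self._outcomes:
            return self._outcomes[op_id]
        try:
            outcome = ("ok", action())
        except Exception as exc:
            outcome = ("failed", str(exc))
        self._outcomes[op_id] = outcome
        return outcome
```

`put_profile` needs no ordering checks at all. `OperationLog` hints at why camp 2 is harder: the record of what happened has to outlive the call, which this toy version does not attempt.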

  • shawnz a day ago

    Consider a simple example where you're checking if a file exists, or a database object exists, and creating it if not. Imagine your filesystem or database library either doesn't have an upsert function to do this for you, or else you can't use it because you want some special behaviour for new records (like writing the current timestamp or a running total, or adding an entry to a log file, or something). I think this is a simple, common example where you would want to combine a conditional with an action. I don't think it's very "odd" or indicative of "way too much logic".

    • avianlyric a day ago

      > a database object exists, and creating it if not. Imagine your filesystem or database library either doesn't have an upsert function to do this for you, or else you can't use it because you want some special behaviour for new records (like writing the current timestamp or a running total, or adding an entry to a log file, or something).

      This is why databases have transactions.

      > simple example where you're checking if a file exists

      Personally I avoid interacting directly with the filesystem like the plague due to issues exactly like this. Working with a filesystem correctly is way harder than people think it is, and handling all the edge-cases is unbelievably difficult. If I'm building a production system where correctness is important, then I use abstractions like databases to make sure I don't have to deal with filesystem nuances myself.

      • shawnz a day ago

        Sure, I agree that a transaction should be used here (in the database example at least). But that's orthogonal to my point, or maybe even in favour of it: doesn't a transaction necessitate keeping the conditional close to the effect? It's a perfect example of what I'm trying to say: how do you make sure the conditional happens in the same transaction as the effect, while simultaneously trying to push the conditional towards the root of the code and away from the effect? Transaction boundaries are exactly the kind of thing that makes pushing up the conditionals difficult.

        • avianlyric 13 hours ago

          By pushing up the transaction boundary. The only reason why the conditional is important is because it is part of a larger sequence of operations that you want to complete in an atomic fashion.

          Your transaction needs to encompass all of those operations, not just parts of it.

  • jknoepfler a day ago

    Probably implicit in your #2, but there are two types of people in the world: people who know why you shouldn't try to write a production-grade database from scratch, and people who don't know why you shouldn't try to write a production-grade database from scratch. Neither group should try to write a production-grade database from scratch.

bee_rider a day ago

Maybe write the functions without the checks, then have wrapper functions that just do the checks and then call the internal function?

  • shawnz a day ago

    Is that really achieving OP's goal though, if you're only raising it by creating a new intermediary level to contain the conditional? The conditional is still the same distance from the root of the code, so that seems like it's not in the spirit of what they are saying. Plus you're just introducing the possibility for confusion if people call the unwrapped function when they intended to call the wrapped function.

    • Brian_K_White a day ago

      But the checking and the writing really are 2 different things. The "rule" that you always want to do this check before write is really never absolute. Wrapper is exactly correct. You could have the single function and add a param that says skip the check this time, but that is messier and even more dangerous than the separate wrapper.

      Depends just how many things are checked by the check I guess. A single aspect, checking whether the resource is already claimed or is available, could be combined since it could be part of the very access mechanism itself where anything else is a race condition.

  • astrobe_ a day ago

    It sounds like self-inflicted boilerplate to me.

    • bee_rider a day ago

      If you were going to write the tests anyway, the additional boilerplate for splitting it up and doing a wrapper isn’t so bad (in C at least, maybe it is worse for some languages).

      • astrobe_ 15 hours ago

        When you say "isn't so bad", is it just a manner of speech or is it actually a little bad (but it is a compromise?)?

        • bee_rider 10 hours ago

          Well, I was working on a sort of green-field project that did this, and I liked it. It neatly solved the problem of needing the tests, but only wanting to call them on user-provided inputs. However, some caveats:

          * I wasn’t around long enough to see if there was a hidden maintenance cost

          * It was a very thoughtfully designed library in an already-well-understood domain so it wasn’t like we were going to need to change the arguments a ton

          * It was explicitly a library designed to be used as a library from the get-go, so there was a clear distinction of which functions should be user-visible.

          I think I would find it annoying if I was doing exploratory programming and expected to change the arguments often. But, in that case, maybe it is too early to start checking user inputs anyway.