Comment by valzam 6 hours ago

3 replies

In particular, what exactly-once delivery implies is that I do not have to worry about duplicates in my processing logic. I can write a `count += 1` and it will always be exactly correct.

The notion that there is no distinction between exactly once delivery and exactly once processing is very odd to me. In practice my processing needs to accommodate duplicates to be correct. If I had exactly once delivery my processing could be much simpler. If I could get exactly once delivery for free I would always choose it in a heartbeat.
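To make the contrast concrete, here is a minimal Python sketch. It assumes each message carries a unique `msg_id` (a hypothetical field chosen for illustration, not something the article specifies). `NaiveCounter` is the `count += 1` logic that is only correct under exactly-once delivery; `DedupCounter` is roughly what at-least-once delivery forces onto the processing side.

```python
class NaiveCounter:
    """Correct only if every message is delivered exactly once."""

    def __init__(self):
        self.count = 0

    def handle(self, msg_id):
        self.count += 1


class DedupCounter:
    """Tolerates redelivery by remembering which message IDs were counted."""

    def __init__(self):
        self.count = 0
        self._seen = set()  # assumption: IDs are unique per logical message

    def handle(self, msg_id):
        if msg_id in self._seen:
            return  # duplicate delivery: ignore it
        self._seen.add(msg_id)
        self.count += 1


# At-least-once delivery: "a" and "b" each arrive twice.
deliveries = ["a", "b", "b", "c", "a"]

naive, dedup = NaiveCounter(), DedupCounter()
for msg_id in deliveries:
    naive.handle(msg_id)
    dedup.handle(msg_id)

print(naive.count, dedup.count)  # 5 3
```

The in-memory set is the simplification here: it grows without bound and is lost on a crash, so a real consumer would presumably need durable storage for it.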

jchw 5 hours ago

The point is that it doesn't matter exactly where the deduplication happens. It could happen in your own processing code, or in something upstream of it, like a queue library of some kind. That's pretty much what the entire article is saying: it's hard to meaningfully distinguish which part is actually delivery and which is processing. For example, most people would consider the guarantees provided by the TCP stack to be part of delivery and not processing, but your TCP stack has to do a lot of processing work to maintain the logical stream of bytes.

  • warkdarrior 5 hours ago

    > The point is that it doesn't matter exactly where the deduplication happens.

    Actually the point is that once deduplication is done at some layer, the layers above it will have to re-achieve exactly-once delivery.

    "Yes, the TCP layer did deliver this message only once, but the receiving software crashed right after, so now the sender has to send it again."

    • jchw 5 hours ago

      Hmmm. Maybe this is the reason why the processing vs delivery distinction matters. Because my thought is, well, of course: to fix that, you only send the acknowledgement after processing succeeds.

      But then again, once you do that, the processing code that is being wrapped really doesn't have to care about being idempotent anymore, as it is being handled a layer up. At that point, all it needs to care about is being atomic.

      I'm not sure if it practically matters either way. I'd rather have my processing code be both atomic and idempotent regardless just to make things easier to reason about, as long as it's not too much of a burden. I've always been a fan of concepts like idempotency tokens.
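
One way to picture the "handled a layer up" arrangement is the Python sketch below. It is an illustration under assumptions, not anyone's actual implementation: `deliver`, the `processed` and `counter` tables, and the `ack` callback are all hypothetical names. The wrapper records the idempotency token and applies the effect in a single SQLite transaction, and only acknowledges afterwards, so the wrapped update only has to be atomic.

```python
import sqlite3


def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed (token TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE counter (n INTEGER NOT NULL)")
    db.execute("INSERT INTO counter VALUES (0)")
    db.commit()
    return db


def deliver(db, token, ack):
    """Hypothetical wrapper layer: dedupe by token, apply the effect, then ack."""
    try:
        with db:  # one transaction: commits on success, rolls back on error
            # The primary key rejects a token that was already recorded.
            db.execute("INSERT INTO processed VALUES (?)", (token,))
            # The wrapped "processing": not idempotent on its own.
            db.execute("UPDATE counter SET n = n + 1")
    except sqlite3.IntegrityError:
        pass  # duplicate token: skip the work, but still acknowledge
    ack(token)  # sent only after the transaction has committed


def current_count(db):
    return db.execute("SELECT n FROM counter").fetchone()[0]


db = make_db()
acked = []
# "t2" and "t1" are redelivered, e.g. because an earlier ack was lost.
for token in ["t1", "t2", "t2", "t3", "t1"]:
    deliver(db, token, acked.append)

print(current_count(db))  # 3
```

If the process dies after the commit but before `ack`, the sender retries and the token table absorbs the duplicate. If it dies before the commit, nothing was applied and the retry does the work. Either way the increment itself never needs to know about duplicates.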