Comment by lisper

Comment by lisper a year ago

Author here. Most commenters still seem to be missing the point, despite the fact that I explicitly say this in the opening sentence:

"This post is ostensibly about an obscure technical issue in distributed systems, but it's really about human communications."

and reiterate it at the end:

"This post was intended to be about human communication more than distributed systems or network protocols."

I really don't know how I could have made this any clearer.

mlyle a year ago

The problem is, you're using strong language like "under any reasonable definition of 'delivery'." But everyone else is defining delivery differently than you, referring to the delivery of the message to the system itself. Your language implies everyone else is unreasonable.

When your argument depends upon everyone else being unreasonable, maybe you're the one being unreasonable.

Yes, we can make the processing that occurs in response to those delivered message(s) idempotent. But in the end, the system has to either deal with:

1. messages being delivered once or lost entirely, or

2. messages being delivered once or multiple times

You are over-explaining a way to deal with situation #2 (detect duplicates at the endpoint).
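
A minimal sketch of the two cases, assuming a simulated lossy channel and messages tagged with unique IDs (the function names, the `loss_rate` parameter, and the ID scheme are illustrative and not taken from the post). Fire-and-forget gives case 1, retransmit-until-acked gives case 2, and a de-duplicator at the endpoint turns case 2 into a stream in which each ID appears once:

```python
import random


def send_at_most_once(rng, loss_rate, msg, inbox):
    """Case 1: fire and forget. The message arrives 0 or 1 times."""
    if rng.random() >= loss_rate:
        inbox.append(msg)


def send_at_least_once(rng, loss_rate, msg, inbox):
    """Case 2: retransmit until an ack returns. The message arrives 1 or more times."""
    while True:
        if rng.random() < loss_rate:
            continue  # message lost in transit, so retransmit
        inbox.append(msg)  # message reached the endpoint
        if rng.random() >= loss_rate:
            return  # ack reached the sender, so stop
        # Ack lost: the sender cannot tell, so it retransmits (a duplicate).


def deduplicate(inbox):
    """Endpoint filter: keep only the first copy of each message ID."""
    seen = set()
    unique = []
    for msg_id, payload in inbox:
        if msg_id not in seen:
            seen.add(msg_id)
            unique.append((msg_id, payload))
    return unique


def simulate(sender, n=50, loss_rate=0.3, seed=0):
    """Send n (id, payload) messages through the hypothetical lossy channel."""
    rng = random.Random(seed)
    inbox = []
    for msg_id in range(n):
        sender(rng, loss_rate, (msg_id, f"payload-{msg_id}"), inbox)
    return inbox
```

Here `deduplicate(simulate(send_at_least_once))` contains each of the 50 IDs once, in order, whereas `simulate(send_at_most_once)` may be missing some. Whether the filtered stream counts as "delivery" is the definitional question being argued in this thread.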

lisper a year ago

> referring to the delivery of the message to the system itself
And how do you define "the system itself"?

- mlyle a year ago
  
  The thing that is at the end of the lossy medium. It must tolerate (0 or 1) or (1 or more) things being delivered to it.
  
  lisper a year ago
  
  Yes, that is true. But why can't I choose to view "the system itself" as the thing that is on the other side of a de-duplicator?
  It feels to me like an argument over whether or not humans can fly. An unassisted human cannot fly, but with some technological augmentation, they can. It seems a bit pedantic to deny that someone can fly from LA to New York simply because they have to get into an airplane to do it.
  
Jach a year ago

This has been a fun thread to read, though not reflecting highly on HN. For what it's worth, I agree with you that the original adage kicking this off is kind of silly, and basically wrong. For further enjoyment I found this interesting blog post from another PhD that addresses things more comprehensively (and also basically agrees with you): https://www.mydistributed.systems/2021/10/exactly-once-deliv...

The opening lines include: "The exact definition, however, is not agreed upon in the community. As a result, there is a debate on whether EOD is possible or impossible to achieve." If nothing else, I and probably others learned today that this is apparently a debate that can quickly turn into a flamewar. And I thought flamewars were mostly dead!

Another interesting paper that came up, as I have an interest in TLA+ proofs: "LogPlayer: Fault-tolerant Exactly-once Delivery using gRPC Asynchronous Streaming" https://arxiv.org/abs/1911.11286 The community seems to have no problem doing things like proving fault-tolerant exactly-once delivery, even if the terminology isn't universally agreed on.

lisper a year ago

Thank you for that link! You just saved me a lot of time reinventing that wheel.

pyrolistical a year ago

Don’t use technical terms like “exactly-once delivery” if you don’t want them to be interpreted as technical.

lisper a year ago

It is hard to make the point that "exactly-once delivery" is not a technical term without referring to it. If you think it is a technical term, would you kindly point me to a definition? I'm particularly interested in learning how "exactly-once delivery" is distinguished from "exactly-once processing".

- mlyle a year ago
  
  A 4 year old piece laying out the exact difference as it's understood and how people use the terms:
  https://blog.bulloak.io/post/20200917-the-impossibility-of-e...
  I read similar content at least 2 decades ago...
  > While exactly-once-delivery is not possible, we have a way out: Exactly-once processing. Exactly-once processing is the guarantee that even though we may receive a message multiple times, in the end, we observe the effects of a single processing operation. This can be achieved in two ways:
  > Deduplication: dropping messages if they are received more than once
  > Idempotent processing: applying messages more than once has precisely the same effect as applying it exactly once
  (I view deduplication as a special case of idempotency).
  
  lisper a year ago
  
  > A 4 year old piece laying out the exact difference...
  "Exactly-once delivery guarantee is the guarantee that a message can be delivered to a recipient once, and only once."
  That seems circular to me.
  Also, the author's proof is flawed. The Two Generals' Problem (2GP) requires more than exactly-once delivery; it requires common knowledge. It is not enough for the first general to know that the message will be received. The first general must know that the message has been received, the second general must know that the first general knows this, the first general must know that the second general knows... and so on.
  
- dragonwriter a year ago
  
  Draw a diagram containing a source system, a destination system, and an unreliable communication channel between them. The destination system also has an output with no unreliable communications channel.
  Exactly-once delivery means that a message sent from the source system reaches the destination system exactly once, and its result reaches the output channel exactly once as a consequence.
  Exactly-once processing means that a message sent from the source system produces the expected output from the destination system once, even though it may be received by the destination system more than once.
  (That's a little sloppy because it could use more discussion of the conditions in which it won't be received zero times, and how those are different between exactly-once and at-most-once delivery, but that's mostly beside the point because it isn't part of the distinction between exactly-once delivery and exactly-once processing. And, while definitely technical, they always involved a somewhat idealized view of the destination system, because all communications channels, including those internal to a single device, have some degree of unreliability.)
  
  lisper a year ago
  
  > That's a little sloppy
  Yes, that is exactly my point. The only way you can make it non-sloppy is to define "delivery" as being something that happens exclusively upstream of deduping.
  
  dragonwriter a year ago
  
  No, I'm saying it's "sloppy" as a definition because, while it addresses the distinction you ask about, it doesn't fully cover what distinguishes exactly-once from at-most-once.
  > The only way you can make it non-sloppy is to define "delivery" as being something that happens exclusively upstream of deduping.
  "Deduping" can happen in many places. If it happens anywhere before the destination-system end of the unreliable connection, it is part of delivery (but also can't get you to exactly-once delivery). If it happens on the destination side of the unreliable communication channel, then yes, it's not part of the delivery guarantee; it is how you get exactly-once processing from at-least-once delivery. This has been well known for a very long time. (I don't think it was new when I first encountered it in 1999.)
  
tacitusarc a year ago

Right, but the meaning of “delivery” in human communication is still different from how you are using it…

lisper a year ago

I disagree. If I'm an actual general in a fort with a gate, and I tell the gate guards to inspect messages brought in by couriers and not allow couriers carrying duplicate messages to enter, the ones that the guards let through are still delivering those messages to me.

noaoh a year ago

I had a thought similar to what's in your post, but didn't share it because I thought the reaction would be something like this. I personally would word it as: you can achieve exactly-once semantics by combining at-least-once delivery with idempotency.
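
One way to read that wording, as a sketch: assume at-least-once delivery hands the handler duplicates (and possibly stale redeliveries), and make the handler idempotent so that reapplying a message changes nothing. The message shape (`key`, `version`, `value`, `amount`) and the handler names are hypothetical, chosen only for illustration.

```python
def apply_increment(state, msg):
    """NOT idempotent: every redelivery adds the amount again."""
    state[msg["key"]] = state.get(msg["key"], 0) + msg["amount"]


def apply_versioned_put(state, msg):
    """Idempotent: reapplying the same (or an older) message is a no-op."""
    current = state.get(msg["key"])
    if current is None or msg["version"] > current["version"]:
        state[msg["key"]] = {"version": msg["version"], "value": msg["value"]}


def process(handler, deliveries):
    """Run every delivered message through the handler; return the final state."""
    state = {}
    for msg in deliveries:
        handler(state, msg)
    return state


# Hypothetical messages; the duplicated list mimics at-least-once delivery.
m1 = {"key": "balance", "version": 1, "value": 100}
m2 = {"key": "balance", "version": 2, "value": 90}
once = [m1, m2]
duplicated = [m1, m1, m2, m1, m2]

deposit = {"key": "balance", "amount": 10}
```

With these inputs, `process(apply_versioned_put, duplicated)` ends in the same state as `process(apply_versioned_put, once)`, while `process(apply_increment, [deposit, deposit])` counts the deposit twice.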

stackghost a year ago

Your proposed definition of "delivery" is absurd.

If you have duplicate things, then you've clearly been delivered more than one thing. There is no way to deliver something exactly once and yet leave the receiver holding more than one thing, such that they can throw all but one away.

It's okay to admit you were mistaken.

lisper a year ago

> If you have duplicate things, then you've clearly been delivered more than one thing.
Yes, that's true. But this doesn't turn on what "delivery" means, it turns on what "you" means. If "you" are downstream of a de-duplication mechanism, then "you" can get exactly-once delivery. Why is that so absurd?

- stackghost a year ago
  
  > Yes, that's true. But this doesn't turn on what "delivery" means, it turns on what "you" means. If "you" are downstream of a de-duplication mechanism, then "you" can get exactly-once delivery. Why is that so absurd?
  So in the case of, say, a network service on server A and a network client B, your solution to "exactly once delivery" is to re-define it as "deliver it from A to B multiple times and have B deduplicate"?
  Do you not see how nonsensical it is to call that "exactly once delivery"?
  
- plorkyeran a year ago
  
  If you have a reliable connection between “you” and the deduplicator, then “you” aren’t receiving messages over an unreliable connection at all and so the claim that you can’t have exactly once delivery over an unreliable connection isn’t applicable in the first place. You’re receiving messages over a reliable connection and what happens upstream of that is irrelevant.
  
- srkiNZ a year ago
  
  If "you" are "downstream" of a "de-duplication mechanism" how do you ensure "exactly once delivery" between "you" and the "de-duplication mechanism"?
  
  lisper a year ago
  
  The same way I ensure any behavior in a digital system. There are boundaries inside of which processes are presumed to be reliable, typically consisting of a CPU, memory busses, and attached non-volatile storage. If you don't assume that those are reliable, then you can't guarantee anything.
  
  srkiNZ a year ago
  
  Great! I agree 100%. We have to assume a "reliable network" within a "boundary" (i.e. a computer's CPU, memory, busses, etc.). Distributed systems (from which these rules are taken) are specifically systems where anything within one of these "boundaries" is considered a "single node" and treated the same, whether it's a NIC, a kernel module/driver, a user-space process or anything else.
  In our case, if we were to assume (for example) that the NIC would de-duplicate the messages for us, anyone writing the producer/sender and a user-space program for the receiver would need to know that the NIC was doing this (or risk having messages dropped for failing to include a unique ID).
  This is a pedantic point, but I would strongly stress that the only reason these "delivery rules" are so popular (and evoke such a reaction) is the very large number of times that programmers misunderstand them.
  Commonly they either assume that:
  * the network is reliable
  * something else will guarantee "exactly once delivery" for me (when in fact nothing will)
  
- computerfan494 a year ago
  
  You're correct, but in my experience the vast majority of code written is not downstream of a usable de-duplication mechanism.
  
  lisper a year ago
  
  That may well be, but that's a very different question.
  
bigstrat2003 a year ago

I'm sorry people are being such utter jerks about this. Honestly, the comments on this are an embarrassment. Even if you are wrong, there's absolutely no call for the tone a lot of the comments are taking.

lisper a year ago

Thanks.
