Stonebraker on CAP theorem and Databases (2010)
(perspectives.mvdirona.com) | 78 points by onurkanbkrc a day ago
Papers by Stonebraker on that:
"What goes around comes around" (2005) https://web.eecs.umich.edu/~mozafari/fall2015/eecs584/papers...
"What Goes Around Comes Around... And Around..." (2024) https://dl.acm.org/doi/pdf/10.1145/3685980.3685984 (ft Andy Pavlo)
Or video (2025) https://www.youtube.com/watch?v=8Woy5I511L8
But full consistency isn't web scale! That said, there are a lot of cases where full consistency with some kind of cache in front of it has the same client-visible quirks as eventual consistency.
As always, the answer is "it depends".
Sort of related? https://www.usenix.org/system/files/login-logout_1305_micken...
I think we try too hard to solve problems that we don't even have yet. It is much better to build a simple system that is correct than a messy one that never goes down. I see people writing bad code because they are afraid of the network breaking. We should just let the database do its job.
But consistency is a choice we need to make when deciding how to use a database.
By letting the database do its job, you let someone else make that trade-off for you.
> Hence, in my opinion, one is much better off giving up P rather than sacrificing C. (In a LAN environment, I think one should choose CA rather than AP).
That isn't how it works. The only way to completely avoid network partitions is to not have a network, i.e. not have a distributed system. Sure, partitions are rare, but unless your network is flawless and your nodes never go down, they do happen sometimes, and when they do, you have to choose between consistency and availability.
That said, in most cases consistency is probably what you want.
Also, giving up on availability doesn't imply we will be down for a long time. It just means some requests might get dropped. As long as the client knows the request was rejected, they can retry later.
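To make that concrete, here's a minimal sketch of that client-side pattern in Python. Everything here (the flaky service, the error type, the parameters) is made up for illustration: the point is just that a rejected write during a partition becomes a retry with backoff, not an outage.

```python
import random
import time

def submit_with_retry(submit, max_attempts=5, base_delay=0.05):
    """Retry a write that a CP system may reject during a partition."""
    for attempt in range(max_attempts):
        try:
            return submit()
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # Exponential backoff with jitter before retrying.
            time.sleep(base_delay * (2 ** attempt) * (0.5 + random.random()))

# Toy service: rejects the first two attempts ("partition"), then heals.
calls = {"n": 0}
def flaky_submit():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("no quorum, request rejected")
    return "committed"

print(submit_with_retry(flaky_submit))  # committed
```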
The 2010 date is really important here. Stonebraker is thinking about local database systems and was a bit upset by the NoSQL movement's push at the time.
And he is making a mistake in claiming that partitions are "exceedingly rare". Again, he is not thinking about a globally distributed cloud across continents.
The real world works with eventual consistency. Embrace it; for ~90% of business scenarios it's the best option: https://i.ibb.co/DtxrRH3/eventual-consistency.png
this is likely wrong. the issue with partitions is that we can no longer communicate at all, so we can't end up in the same state. if we have poor performance, that's certainly something worth putting machinery in place to adapt to, but it's not at all in the same class as 'I can't talk to you and I don't know what you're doing at all' from a correctness standpoint
edit: yeah, ok, since failure detection is by necessity driven by timers, then sure. the trade-off we're making is between the interval during which we're unable to make progress vs the upheaval caused by announcing a failure.
Yeah, I glossed over a few steps. There's likely a latency threshold beyond which you should abort, and then it is a partition (after all, that's what TCP is doing under the hood if it sends a packet and doesn't get a response).
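A tiny Python sketch of that "latency threshold beyond which it's a partition" idea, using a plain socket timeout (host and port here are arbitrary examples, not anything from the thread):

```python
import socket

def reachable(host, port, timeout_s=0.5):
    """Declare a partition if the peer doesn't answer within timeout_s.

    A short timeout surfaces failures quickly (no resources held
    hostage) at the cost of occasionally declaring a slow-but-alive
    peer partitioned: the timer-driven trade-off discussed above.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:  # timeout, refusal, unreachable: all look like a partition
        return False

# Nothing normally listens on port 1, so this should report a "partition".
print(reachable("localhost", 1))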
One should be so lucky to have an operation fail immediately, rather than lumber on until it times out (holding resources hostage all the while)!
> And he is making a mistake in claiming the partitions are "exceedingly rare". Again he is not thinking about a global distributed cloud across continents.
Any time an AWS region or AZ goes down we see a lot of popular services go nearly-completely-down. And it's generally fine.
One thing I appreciate about AWS is that (operating "live" in just a single AZ or even single region) I've seen far fewer partition-causing networking hiccups than when my coworkers and I were responsible for wiring and tuning our own networks for our own hardware in datacenters.
I would say quite the opposite: most businesses have little need for eventual consistency, and at a small scale it's not even a requirement for any database you would reasonably use. Way more than 90% of companies don't need eventual consistency.
No. The real world is full of eventual consistency, and we simply operate around it. :-)
Think about a supermarket: If the store is open 24/7, prices change constantly, and some items still have the old price tag until shelves get refreshed. The system converges over time.
Or airlines: They must overbook, because if they wait for perfect certainty, planes fly half empty. They accept inconsistency and correct later with compensation.
Even banking works this way. All database books have the usual "you can't debit twice, so you need transactions"… bullshit. But think of a money transfer across banks, and possibly across countries? It's not globally atomic...
What if you transfer money to an account that was closed an hour ago in another system? The transfer doesn’t instantly fail everywhere. It’s posted as credit/debit, then reconciliation runs later, and you eventually get a reversal.
Same with stock markets: Trades happen continuously, but final clearing and settlement occur after the fact.
And technically DNS is eventual consistency by design. You update a record, but the world sees it gradually as caches expire. Yet the internet works.
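To make the DNS point concrete, here's a toy TTL cache in Python. This is purely illustrative (the record name, address, and TTL are invented): stale answers are served until the TTL expires, so an update propagates gradually, yet every cache converges.

```python
import time

class TtlCache:
    """Toy DNS-style resolver cache: answers may be stale until the
    record's TTL expires, so updates propagate gradually."""
    def __init__(self, authoritative, ttl_s):
        self.authoritative = authoritative  # dict: name -> address
        self.ttl_s = ttl_s
        self._cache = {}                    # name -> (address, fetched_at)

    def resolve(self, name, now=None):
        now = time.monotonic() if now is None else now
        hit = self._cache.get(name)
        if hit and now - hit[1] < self.ttl_s:
            return hit[0]                   # possibly stale answer
        addr = self.authoritative[name]     # refetch from authority
        self._cache[name] = (addr, now)
        return addr

records = {"example.test": "192.0.2.1"}
cache = TtlCache(records, ttl_s=300)
print(cache.resolve("example.test", now=0))    # 192.0.2.1
records["example.test"] = "192.0.2.2"          # record updated upstream
print(cache.resolve("example.test", now=100))  # still 192.0.2.1 (cached)
print(cache.resolve("example.test", now=400))  # 192.0.2.2 (TTL expired)
```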
Distributed systems aren’t broken when they’re eventually consistent. They’re mirroring how real systems work: commit locally, reconcile globally, compensate when needed.
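That "commit locally, reconcile globally, compensate when needed" pattern can be sketched in a few lines of Python. All of this is a toy (the ledger, the closed-account check, the names): the transfer is posted optimistically, and a compensating reversal runs later instead of requiring global atomicity up front.

```python
ledger = {"alice": 100, "bob": 0}
closed = {"bob"}          # bob's account was closed in another system

def transfer(src, dst, amount):
    """Commit locally, reconcile later, compensate when needed."""
    ledger[src] -= amount            # local commit at the sending bank
    if dst in closed:                # reconciliation discovers the closure
        ledger[src] += amount        # compensating reversal, not a rollback
        return "reversed"
    ledger[dst] += amount
    return "posted"

print(transfer("alice", "bob", 40))  # reversed
print(ledger["alice"])               # 100
```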
These analogies (except for DNS, perhaps) aren't very illuminating on the difference between a CP system and an AP system in the CAP sense, though. In banking, there are multiple parties involved. Each of those parties is likely running a CP system for their transactions (almost guaranteed). Same with stock exchanges - you can look up Martin Thompson's work for a public glimpse of how these systems work (LMAX and Aeron are systems related to this).
These examples are closer to control loops, where a decision is made and then carried out or finalized later. This kind of "eventual consistency" is pervasive but also significantly easier to reason about than what people usually mean by that term when talking about a distributed database, for example.
To expand on the 24/7 grocery store example: if the database with prices is consistent, you will always know what the current price is supposed to be. If the database is eventually consistent, you may get inconsistent answers about the current price that have to be resolved in the code somehow. That's way harder to reason about than "the price changed, but the tag hasn't been hung yet". The first case, professional software engineers struggle to deal with correctly. The second case, anyone can understand.
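Here's what "resolved in the code somehow" might look like in the simplest case: last-write-wins across replica answers. A made-up sketch (replicas report an invented (price, version) pair), just to show the extra resolution logic a consistent store would have spared you:

```python
def read_price(replicas):
    """Resolve divergent replica answers with last-write-wins.

    Each replica reports (price, version); the client has to pick one,
    here the answer with the highest version number.
    """
    return max(replicas, key=lambda pv: pv[1])[0]

# Replica B hasn't seen the price change yet.
replicas = [(2.49, 7), (1.99, 6)]
print(read_price(replicas))  # 2.49
```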
None of the systems you describe are the 90% of businesses. Grocery, airlines, banking, stock markets, DNS: they are all huge systems with very active logistics compared to most businesses. I still don't agree with you at all.
Banks transferring across countries: again, not a problem most businesses ever have to deal with.
> Even banking works this way. All database books have the usual “you can’t debit twice, so you need transactions”…bullshit. But think of a money transfer across banks and possibly across countries? Not globally atomic..
Banking is my "go to" analogy when it comes to eventual consistency because 1: we all use banking in almost universally the same ways, and 2: we fully understand the eventual consistency employed (even though we don't think about it).
Allow me to elaborate.
When I was younger we had "cheque books" which meant that I could write a cheque (or check if you're American) and give it to someone in lieu of cash, they would take the cheque to the bank, and, after a period of time their bank would deposit funds into their account, and my bank would debit funds from mine - that delay is eventual consistency.
That /style/ of banking may be gone for some people, but the principle remains the same: this very second my bank account is showing me two "balances", the "current" balance and the "available" balance. Those two numbers are not equal, but they will /eventually/ be consistent.
The reason they are not consistent is that I have used my debit card, which is really a credit arrangement my bank has negotiated with Visa or Mastercard, etc. I have paid for some goods/services with my debit card, Visa has guaranteed the merchant that they will be paid (with some exceptions), and Visa has placed a hold on the balance of my account for the amount.
At some point (it might be overnight, it might be in a few days) there will be a reconciliation where actual money will be paid by my bank to Visa to settle the account, and Visa will pay the merchant's bank some money to settle the debt.
Once that reconciliation takes place to everyone's satisfaction, my account balances will be consistent.
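The two balances described above fit in a few lines of Python. A toy model, with invented names and amounts: "current" ignores pending card holds, "available" subtracts them, and the two converge when the network settles.

```python
class Account:
    """Toy model of the two bank balances: they diverge while a card
    hold is pending and converge again at reconciliation."""
    def __init__(self, current):
        self.current = current
        self.holds = {}                  # auth_id -> amount on hold

    @property
    def available(self):
        return self.current - sum(self.holds.values())

    def authorize(self, auth_id, amount):
        self.holds[auth_id] = amount     # card network places a hold

    def settle(self, auth_id):
        self.current -= self.holds.pop(auth_id)  # money actually moves

acct = Account(current=100.0)
acct.authorize("visa-123", 30.0)
print(acct.current, acct.available)  # 100.0 70.0  (not yet consistent)
acct.settle("visa-123")
print(acct.current, acct.available)  # 70.0 70.0   (eventually consistent)
```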
Normally, I'm not a fan of putting the date on a post. In this case, however, the fact that Stonebraker's article was published in 2010 makes it more impressive given the developments of the last 15 years, in which we've relearned the value of consistency (and the fact that it can scale further than people were imagining).