Comment by kukkeliskuu a day ago

12 replies

I have been working on payment systems and it seems that in almost all discussions about transactions, people talk about toy versions of bank transactions that have very little to do with what actually happens.

You don't even need to talk about credit cards to have multiple kinds of accounts (internal bank accounts for payment settlement etc.), multiple involved systems, batch processes, reconciliation etc. Having a single atomic database transaction is not realistic at all.

On the other hand, the toy transaction example might be useful for people to understand basic concepts of transactions.

mrkeen a day ago

And then they take that toy transaction model and think that they're on ACID when they're not.

Are you stepping out of SQL to write application logic? You probably broke ACID. Begin a transaction, read a value (n), do a calculation (n+1), write it back and commit: The DB cannot see that you did (+1). All it knows is that you're trying to write a 6. If someone else wrote a 6 or a 7 in the meantime, then your transaction may have 'meant' (+0) or (-1).
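
A minimal sketch of the difference (table and column names invented) - the increment the DB can see versus the one it can't:

    DECLARE @n INT;

    -- The (+1) the DB *can* see: it is part of the statement, so it is
    -- applied to whatever the value is at write time, under the DB's control.
    UPDATE Counters SET Value = Value + 1 WHERE Id = 1;

    -- The (+1) the DB *cannot* see: it happened in application code.
    -- The DB is only ever asked to write back a constant, which silently
    -- overwrites whatever another session committed in between.
    SELECT @n = Value FROM Counters WHERE Id = 1;   -- read 5
    SET @n = @n + 1;                                -- "application logic"
    UPDATE Counters SET Value = @n WHERE Id = 1;    -- write a 6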

Same problem when running on reduced isolation level (which you probably are). If you do two reads in your 'transaction', the first read can be at state 1, and the second read can be at state 2.

I think more conversations about the single "fully consistent" DB approach should start with the fact that it isn't fit for purpose - even before considering that it can't address soft-modification (a need you should recognise immediately whenever someone brings up soft-delete) or two-generals, i.e. consistency with a partner: you and VISA don't live in the same MySql instance, do you? Or to put it bluntly: partitions between your DB and VISA's DB "don't happen often" (they happen always!)

  • fishstamp82 a day ago

    RE: "All it knows is that you're trying to write a 6. If someone else wrote a 6 or a 7 in the meantime, then your transaction may have 'meant' (+0) or (-1)."

    This is not how it works at all. This is called a dirty write, and it is prevented by default in ACID-compliant databases, no matter the isolation level. The second transaction's commit will be rejected by the transaction manager.

    Even if you start the transaction from your application, that still doesn't change this.

    • mrkeen 20 hours ago

      I have no problem with ACID the concept. It's a great ideal to strive towards. I'm sure your favourite RDBMS does a fine job of it. If you send it a single SQL string, it will probably behave well no matter how many other callers are sending it SQL strings (as long as the statements are grouped appropriately with BEGIN/COMMIT).

      I'm just pointing out two ways in which you can make your system non-ACID.

      1) Leave it on the default isolation level (READ_COMMITTED):

      You have ten accounts, which sum to $100. You know your code cannot create or destroy money, only move it around. If no other thread is currently moving money, you will always see it sum to $100. However, if another thread moves money (e.g. from account 9 to account 1) while your summation is in progress, you will undercount the money. Perfectly legal in READ_COMMITTED. You made a clean read of account 1, kept going, and by the time you reach account 9, you READ_ what the other thread _COMMITTED. Nothing dirty about it, you under-reported money for no other reason than your transactions being less-than-Isolated. You can then take that SUM and cleanly write it elsewhere. Not dirty, just wrong.
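
      A rough sketch of the reading side (account ids invented); every read here sees only committed data, yet the total is wrong:

          -- Isolation level: READ COMMITTED (the default)
          DECLARE @total MONEY = 0, @b MONEY;

          BEGIN TRANSACTION;
          SELECT @b = Balance FROM MyAccounts WHERE Id = 1;  -- read before the
          SET @total = @total + @b;                          -- transfer commits
          -- ... accounts 2 through 8 ...
          SELECT @b = Balance FROM MyAccounts WHERE Id = 9;  -- read after the
          SET @total = @total + @b;                          -- transfer commits
          COMMIT;
          -- The $10 that moved from 9 to 1 was counted zero times:
          -- @total = $90, which you can now cleanly write somewhere else.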

      2) Use an ORM, e.g. EF via LINQ. (Assume FULL ISOLATION - even though you probably don't have it)

      If you were to withdraw money from the largest account, split it into two parts, and deposit it into two random accounts, you could do it ACID-compliantly with this SQL snippet:

          DECLARE @bigBalance MONEY, @part1 MONEY, @part2 MONEY;  -- assuming Balance is MONEY
          SELECT @bigBalance = Max(Balance) FROM MyAccounts;
          SELECT @part1 = @bigBalance / 2;
          SELECT @part2 = @bigBalance - @part1;
          ..
          -- Only showing one of the deposits for brevity
          UPDATE MyAccounts
          SET Balance = Balance + @part1
          WHERE Id IN (
              SELECT TOP 1 Id
              FROM MyAccounts
              ORDER BY NewId()
          );
      
      Under a single thread it will preserve money. Under multiple threads it will preserve money (as long as BEGIN and COMMIT are included ofc.). Perfectly ACID. But who wants to write SQL? Here's a snippet from the equivalent C#/EF/LINQ program:

          // Split the balance in two
          var onePart = maxAccount.Balance / 2;
          var otherPart = maxAccount.Balance - onePart;
      
          // Move one half
          maxAccount.Balance -= onePart;
          recipient1.Balance += onePart;
      
          // Move the other half
          maxAccount.Balance -= otherPart;
          recipient2.Balance += otherPart;
      
      Now the RDBMS couldn't manage this transactionally even if it wanted to. By the final lines, 'otherPart' is no longer "half of the balance of the biggest account", it's a number like 1144 or 1845. The RDBMS thinks it's just writing a constant and can't connect it back to its READ site:

          info: 1/31/2026 17:30:57.906 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command) 
              Executed DbCommand (7ms) [Parameters=[@p1='a49f1b75-4510-4375-35f5-08de60e61cdd', @p0='1845'], CommandType='Text', CommandTimeout='30']
              SET NOCOUNT ON;
              UPDATE [MyAccounts] SET [Balance] = @p0
              WHERE [Id] = @p1;
              SELECT @@ROWCOUNT;
arter45 a day ago

I don't have a lot of payment experience, but AFAIK actual payment systems work in an append-only fashion, which makes concurrency management easier since you're just adding a new row with (timestamp, from, to, value, currency, status) or something similar. However, how can you efficiently check for overdrafts in this model? You'd have to periodically sum up transactions to find the sender's balance and compare it to a known threshold.
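
Roughly what I have in mind (column names and types are just a guess):

    CREATE TABLE Ledger (
        Id       BIGINT IDENTITY PRIMARY KEY,
        Ts       DATETIME2      NOT NULL,
        FromAcct BIGINT         NOT NULL,
        ToAcct   BIGINT         NOT NULL,
        Amount   DECIMAL(18, 2) NOT NULL,
        Currency CHAR(3)        NOT NULL,
        Status   VARCHAR(20)    NOT NULL
    );

    -- Naive overdraft check: derive the sender's balance by summing their
    -- entire history on every new payment.
    DECLARE @sender BIGINT = 42;
    SELECT SUM(CASE WHEN ToAcct = @sender THEN Amount ELSE -Amount END)
    FROM Ledger
    WHERE (FromAcct = @sender OR ToAcct = @sender)
      AND Status = 'settled';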

Is this how things are usually done in your business domain?

  • mrkeen 19 hours ago

    > how can you efficiently check for overdrafts in this model?

    You already laid the groundwork for this to be done efficiently: "actual payment systems work in an append-only fashion"

    If you can't alter the past, it's trivial to maintain your rolling sums to compare against. Each new transaction through the system only needs to mutate the balances of that one transaction's source and destination accounts.

    If you know everyone's balance as of 10 seconds ago, you don't need to consider any of the 5 million transactions that happened before 10 seconds ago.

    (If your system allowed you to alter the past and edit arbitrary transactions in the past, you could never trust your rolling sums, and you'd be back to summing up everything for every operation.)

    • arter45 19 hours ago

      So you're saying each line records the new value of the source and destination balance, rather than just the sum that is being exchanged?

      • mrkeen 16 hours ago

        No.

        At the beginning of time, all your accounts will have their starting value.

        When the first transaction (from,to,value) happens, you will do one overdraft check, and if it's good, you will do 1 addition and 1 subtraction, and two of the accounts will have a new value.

        On the millionth transaction, you will do one overdraft check, and if it's good, you will do 1 addition and 1 subtraction, and two of the accounts will have a new value.

        At no point will you need to do more than one check & one add & one sub per arriving transaction.

        (The append-only property is what allows this: the next state is only ever a single, cheap step from the current state. But if someone insists upon mutating history, the current state is no longer valid - it no longer represents the history that led up to it - so it cannot be used to generate the next state. You need to throw it all away and regenerate the current/next states, starting from 0 and replaying every transaction again.)

        • arter45 16 hours ago

          Ok, so basically you have a Transactions table as well as a separate Accounts table which stores balances, and every time Alice wishes to pay Bob, a (database) transaction appends an entry to the Transactions table and updates the balances in Accounts only if the sender's balance is OK? Something like an "INSERT INTO…SELECT"?
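
          Roughly like this, I mean (schema and names are just my guess):

              DECLARE @sender BIGINT = 1, @recipient BIGINT = 2,
                      @amount DECIMAL(18, 2) = 25.00;

              BEGIN TRANSACTION;

              -- Append the ledger row only if the sender's cached balance covers it
              -- (UPDLOCK keeps a concurrent payment from passing the same check)
              INSERT INTO Transactions (Ts, FromAcct, ToAcct, Amount)
              SELECT SYSUTCDATETIME(), @sender, @recipient, @amount
              FROM Accounts WITH (UPDLOCK)
              WHERE Id = @sender AND Balance >= @amount;

              IF @@ROWCOUNT = 1
              BEGIN
                  -- one subtraction and one addition, as in your description
                  UPDATE Accounts SET Balance = Balance - @amount WHERE Id = @sender;
                  UPDATE Accounts SET Balance = Balance + @amount WHERE Id = @recipient;
                  COMMIT;
              END
              ELSE
                  ROLLBACK;  -- would overdraft: reject the payment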

      • awesome_dude 16 hours ago

        The rolling balance is a "projection"

        Your bank statement has the event (a deposit or withdrawal) with its details, and off to one side the bank will say: your balance after this event can be calculated to be $FOO.

        The balance isn't a part of the event, it's a calculation based on the (cached) balance known from the previous event.

        Further, your bank statements are (typically) for the calendar month, or whatever. They start with the balance brought forward from the previous statement (a snapshot).
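
        A sketch of that projection, if it helps (table and column names are invented; Amount is signed, deposits positive):

            -- One account's statement for one month: start from the snapshot
            -- brought forward, then compute a running balance over the events.
            DECLARE @account BIGINT = 1;

            SELECT e.Ts,
                   e.Amount,
                   s.BroughtForward
                     + SUM(e.Amount) OVER (ORDER BY e.Ts, e.Id) AS BalanceAfter
            FROM   Events e
            JOIN   Snapshots s ON s.AccountId = e.AccountId
                              AND s.StatementMonth = '2026-01'
            WHERE  e.AccountId = @account
              AND  e.Ts >= '2026-01-01' AND e.Ts < '2026-02-01'
            ORDER BY e.Ts, e.Id;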

awesome_dude a day ago

The point is to show people who don't realise it that they have been dealing with eventual consistency all along - that it's right there in their lives and they already understand it.

You're right that I go into too much detail (maybe I got carried away with the HN audience :-) and you are right that multiple accounts are something people generally already understand, and they would demonstrate further eventual-consistency principles.

  • kukkeliskuu a day ago

    I wasn't criticizing you, just making the point that when people talk about toy-example bank transactions, they usually just want to introduce the basic concepts. And I think that's OK, but I would prefer that they also mention that the real operations are much more complex.

    I modified my comment above: by multiple types of accounts I meant that banks have various accounts for settlement with other banks etc., even in the common payment case.

    • awesome_dude 16 hours ago

      My bad, I didn't mean to sound so upset, but I do get a bit "trigger happy" from time to time.