Comment by alberth
Odd that OLTP wasn’t mentioned in the article.
Postgres is an OLTP database, and OLTP databases are designed for write-heavy workloads.
That said, I agree most people have read-heavy needs.
Hmmm. Not really. Yes, everything is a mix, but for applications, it very much is on the read-heavy side. Think about how many queries you have to do just to display an arbitrary page. You might, maybe, just maybe, net 2-3 writes vs. hundreds of reads. If that starts to balance out, or even flip, then you probably need to rethink your database as you start to exit traditional db usage patterns. But <30% writes is not write-heavy.
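If you want more than intuition here, one rough way to check is to tally statements at the driver level and see what your actual mix is. A minimal sketch with Python's stdlib sqlite3 (the table and queries are invented for illustration; the wrapper idea carries over to whatever driver you actually use):

    import sqlite3
    from collections import Counter

    class CountingCursor:
        """Thin wrapper that tallies reads vs. writes by leading SQL verb."""
        def __init__(self, cursor, stats):
            self._cursor = cursor
            self._stats = stats

        def execute(self, sql, params=()):
            verb = sql.lstrip().split(None, 1)[0].upper()
            self._stats["read" if verb == "SELECT" else "write"] += 1
            return self._cursor.execute(sql, params)

        def __getattr__(self, name):  # delegate fetchone(), fetchall(), etc.
            return getattr(self._cursor, name)

    stats = Counter()
    conn = sqlite3.connect(":memory:")
    cur = CountingCursor(conn.cursor(), stats)

    # Pretend "render one page": a burst of reads, a couple of writes.
    cur.execute("CREATE TABLE page_views (user_id INTEGER, page TEXT)")
    for _ in range(50):
        cur.execute("SELECT count(*) FROM page_views WHERE page = ?", ("home",))
    cur.execute("INSERT INTO page_views VALUES (?, ?)", (1, "home"))

    print(dict(stats))  # {'write': 2, 'read': 50} -- well under 30% writes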
I am thinking about that. I don't think most data is read that often in an OLTP system.
I think a very small amount of data is read very often. However, until your DB gets very large, that data is going to end up as data pages cached in memory. So that data is extremely cheap to read.
I also think a significant amount of data that is generated in an OLTP system is written and never read, but you still had to pay the cost to write it. If you have an audit log, chances are you never need to look at it for any one piece of data. But you definitely had to write all the metadata for it.
But I'm also assuming that writes are at least 10 times as expensive as reads. More so if what you're modifying has indexes, since indexes are often functionally identical to a partial copy of the entire table. Indeed, I think that 10 times mark is conservative. Most RDBMSs use transaction logging and some kind of locking on writes. There's data validation and integrity checks on inserts and updates (and deletes if you have foreign keys).
I think 1 write to 10 reads is still write-heavy.
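To make that arithmetic explicit (taking the 10x figure above as an assumption, not a measurement): at 1 write per 10 reads, writes already account for about half of the total work, and even the "hundreds of reads per page" case upthread still leaves writes as a meaningful slice.

    # Back-of-the-envelope share of total DB work consumed by writes,
    # assuming each write costs ~10x a read (assumed, not measured).
    WRITE_COST_FACTOR = 10

    def write_share(writes, reads, factor=WRITE_COST_FACTOR):
        write_work = writes * factor
        return write_work / (write_work + reads)

    print(write_share(1, 10))    # 0.5   -> half the work is writes
    print(write_share(3, 100))   # ~0.23 -> 3 writes per 100 reads per page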
I think read replicas disagree with that pretty strongly.
The write path may be heavy, but then you have many, many users who need to see that data. The question is whether the database itself, or a copy of the data pulled from it, is what serves that interest.
If you mediate all reads through a cache, then you have split the source of truth from the system of record. And then the read traffic that actually reaches the system of record is a much smaller share.
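As a concrete sketch of that split (a plain dict standing in for Redis/memcached, names invented): with cache-aside, only the misses ever reach the system of record, so the database's own read:write ratio ends up far lower than the application's.

    from collections import Counter

    db_hits = Counter()   # what the system of record actually sees
    cache = {}            # stand-in for Redis/memcached

    def db_read(key):
        db_hits["read"] += 1
        return f"row-for-{key}"       # pretend this is a SELECT

    def db_write(key, value):
        db_hits["write"] += 1         # pretend this is an INSERT/UPDATE
        cache.pop(key, None)          # invalidate on write (cache-aside)

    def get(key):
        if key not in cache:          # only misses touch the database
            cache[key] = db_read(key)
        return cache[key]

    # App-level traffic: 100 reads per write on a hot key.
    for i in range(10):
        for _ in range(100):
            get("hot-page")
        db_write("hot-page", f"v{i}")

    print(db_hits)  # ~Counter({'read': 10, 'write': 10}): the DB sees a 1:1 mix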
I disagree. I think the only people that have read-heavy needs are big data and data warehouses. AI being hot right now doesn't mean big data is the only game in town.
Most applications are used operationally or have a mix of reads and writes. Even in applications where users only consume content, there is often enough data capture just from tracking page history to make them relatively write-heavy.