vjerancrnjak 8 hours ago

Another example of row-based DBs somehow being insanely slow compared to column-based ones.

Just an endless sequence of misbehavior, and we wave it off as "rows work well for specific lookups, columns for aggregations", yet here is all the other stuff that is unreasonably slow.

tharkun__ 8 hours ago

It's an example. But not of that.

It's an example of old things being new again maybe. Or reinventing the wheel because the wheel wasn't known to them.

Yes I know, nobody wants to pay that tax or make that guy richer, but databases like Oracle have had JPPD (join predicate pushdown) for a long time. It's just something the database does, and the optimizer chooses whether or not to apply it depending on whether it's the best plan.
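For readers unfamiliar with JPPD, here is a rough sketch of the rewrite it performs, using a made-up orders/customers schema (sqlite3 just stands in as an executable playground; the second query is the kind of thing a JPPD-capable optimizer would produce automatically from the first):

```python
# Join predicate pushdown (JPPD) by hand, on a hypothetical schema.
# Naive plan: aggregate ALL customers' orders, then join and filter.
# Pushed plan: the region predicate limits the aggregation's input.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
INSERT INTO orders VALUES (1,1,10),(2,1,20),(3,2,5),(4,3,7);
CREATE TABLE customers (id INTEGER, region TEXT);
INSERT INTO customers VALUES (1,'EU'),(2,'US'),(3,'EU');
""")

# Naive: the subquery aggregates every customer's orders.
naive = conn.execute("""
SELECT c.id, t.total
FROM customers c
JOIN (SELECT customer_id, SUM(amount) AS total
      FROM orders GROUP BY customer_id) t
  ON t.customer_id = c.id
WHERE c.region = 'EU'
""").fetchall()

# Pushed down: the join predicate is pushed into the aggregation,
# so only the relevant customers' orders are aggregated at all.
pushed = conn.execute("""
SELECT c.id, t.total
FROM customers c
JOIN (SELECT customer_id, SUM(amount) AS total
      FROM orders
      WHERE customer_id IN (SELECT id FROM customers WHERE region = 'EU')
      GROUP BY customer_id) t
  ON t.customer_id = c.id
WHERE c.region = 'EU'
""").fetchall()

print(sorted(naive) == sorted(pushed))  # True
```

On a toy dataset the two are indistinguishable; the point is that on a large orders table the pushed-down form aggregates a fraction of the rows.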

  • rotis 7 hours ago

    Exactly. This is a basic optimization technique and all the dinosaur-era databases have it. But if you build a new database product you have to implement these techniques from scratch; there is no way to shortcut that. Reminds me of CockroachDB building a query optimizer[1]. They started with a rule-based one and then switched to a cost-based one, a feature that older databases already had.

    [1] https://www.cockroachlabs.com/blog/building-cost-based-sql-o...

sschnei8 8 hours ago

I feel like this is more an example of:

“We filtered first instead of reading an entire table from disk and performing a lookup”

Where both OLAP and OLTP dbms would benefit.

To your point, it’s clear certain workloads lend themselves to OLAP and columnar storage much better, but “an endless sequence of misbehavior” seems a bit harsh.

  • vjerancrnjak 4 hours ago

    It's not harsh.

    Recent example: 1GB of data in total across tables, and the query needs 20 minutes. Obvious quadratic, cubic, or even worse behavior.

    I disable nested loop joins and it's 4 seconds. Still slow, but I don't want to spend time figuring out why it's slower than reading 1GB of data and pipelining the computation, which would take about 1 second, or even less given the beefy NVMe where the files are stored (ignoring that I actually have good indices and the surface area of the query is probably 10MB, not 1GB).

    How can the strategy be slower than downloading 1GB of data and gluing it together in Python?
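    For what it's worth, the "glue it together in Python" baseline is essentially a hash join, which is linear in the input sizes, versus a nested loop's quadratic cost. A minimal sketch (table contents and key names are invented):

```python
# Minimal hash join in plain Python: build a dict on one side,
# probe with the other. O(n + m) versus a nested loop's O(n * m).
from collections import defaultdict

def hash_join(left, right, left_key, right_key):
    index = defaultdict(list)          # build phase
    for row in left:
        index[row[left_key]].append(row)
    out = []
    for row in right:                  # probe phase
        for match in index.get(row[right_key], ()):
            out.append({**match, **row})
    return out

users = [{"uid": 1, "name": "a"}, {"uid": 2, "name": "b"}]
orders = [{"uid": 1, "total": 10}, {"uid": 1, "total": 5}, {"uid": 2, "total": 7}]
joined = hash_join(users, orders, "uid", "uid")
print(len(joined))  # 3
```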

    Something is just off with the level of abstraction, with the query planner relying on weird stats. The whole system, outside of its transactional guarantees, just sucks.

    Another example: materializing a CTE reduces execution time from 2 seconds to 50ms, because that somehow hints to the query planner that the CTE's result is small.
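    Concretely (PostgreSQL 12+ syntax; the table and column names below are invented): since v12, Postgres inlines CTEs by default, and the MATERIALIZED keyword restores the old compute-once behavior, which doubles as an optimization fence:

```sql
-- Hypothetical tables, for illustration only.
-- MATERIALIZED forces the CTE to be evaluated once; the planner then
-- treats its (small) result as a unit instead of inlining it into
-- the outer query.
WITH rare_events AS MATERIALIZED (
    SELECT id FROM events WHERE kind = 'rare'
)
SELECT e.*
FROM events e
JOIN rare_events r ON r.id = e.parent_id;
```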

    So even PostgreSQL is filled with these endless riddles of misbehavior, even though PhDs boast about who knows what in the query optimizer and will belittle my criticism by repeating "1 row" vs "agg all rows", as if I were in elementary school and didn't know how to use both OLTP and OLAP systems.

    Unlike column DBs, where I know it's some nice fused group-by/map/reduce execution, where I avoid joins like the plague, and where there's no query planner, stats maintenance, indices, or other mumbo-jumbo that does nothing at all most of the time.

    Most of my workloads are extremely tiny and I am familiar with how to structure schemas for OLTP and OLAP and I just dislike how most relational databases work.

    • atombender an hour ago

      I think part of the problem is that the people working on Postgres for the most part aren't PhDs, and Postgres isn't very state of the art.

      Postgres implements the ancient Volcano model from the 1980s, but there's been a ton of query optimization research since then, especially from the database groups at TUM, the University of Washington, and Carnegie Mellon. Systems like HyPer and Umbra (both from TUM) have state-of-the-art query engines that would eat Postgres' lunch. There's been lots of work on making planners smarter about rearranging joins to be more optimal, on improving cache locality and buffer management, and so on.

      Unfortunately, changing an old Volcano planner and applying newer techniques would probably be a huge endeavor.
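      For context, the Volcano (iterator) model mentioned above boils down to every operator pulling one row at a time from its child. A toy sketch in Python, with made-up operators and data (real engines add open/close protocols, expression compilation, and batching on top of this):

```python
# Toy Volcano-style pipeline: each operator is a generator that pulls
# rows from its child one at a time; the plan is a chain of iterators.
def scan(rows):
    for r in rows:
        yield r

def filter_(pred, child):
    for r in child:
        if pred(r):
            yield r

def project(cols, child):
    for r in child:
        yield {c: r[c] for c in cols}

rows = [{"id": 1, "x": 10}, {"id": 2, "x": 3}, {"id": 3, "x": 8}]
plan = project(["id"], filter_(lambda r: r["x"] > 5, scan(rows)))
result = list(plan)
print(result)  # [{'id': 1}, {'id': 3}]
```

The per-row virtual dispatch this implies is exactly what the newer compiled/vectorized engines avoid.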

    • thr0w 2 hours ago

      > I disable nested loop join and it's 4 seconds.

      I feel your pain. I've been through all stages of grief with `enable_nestloop`. I've arrived at acceptance. Sometimes you just need to redo your query approach. Usually by the time I get the planner to behave, I've ended up with something that's expressed more simply to boot.
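      One way to limit the blast radius while experimenting: scope the override to a single transaction with SET LOCAL (standard PostgreSQL; the query itself is elided here):

```sql
BEGIN;
SET LOCAL enable_nestloop = off;  -- reverts automatically at COMMIT/ROLLBACK
-- ... the problematic query ...
COMMIT;
```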