dangoodmanUT 11 hours ago

Maybe it's not obvious at first, but in retrospect this feels like the obvious way to handle joins.

Push down filters to read the least data possible.

Or, know your data and be able to tell the query engine which join strategy you'd like (hash vs. pushdown).
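As a rough illustration of both ideas (hypothetical tables; the inline join hint shown is T-SQL syntax, where HASH between INNER and JOIN forces a hash join):

  SELECT o.id, c.name
  FROM   orders o
  INNER HASH JOIN customers c            -- join hint: force a hash join
         ON c.id = o.customer_id
  WHERE  o.created_at >= '2024-01-01';   -- pushed-down filter limits rows read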

SoftTalker 11 hours ago

Decades ago we used to provide hints in queries based on "knowing the data", but modern optimizers have much better statistics on indexes, and the need to tell the query optimizer what to do should be rare.

  • btilly 9 hours ago

    Yes, but the problem is that optimizers will sometimes change join strategies without warning in production.

    There is a real need to be able to take key queries and say, "don't change the way you run this query". Most databases offer this; unfortunately PostgreSQL doesn't. There are ways to force the join (e.g. using a series of queries with explicit temporary tables, as sketched below), but they all add overhead. The result is that a PostgreSQL website will sometimes change a good query plan to a bad one, then have problems. Just because it is Tuesday.
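    A sketch of that temporary-table workaround in PostgreSQL (hypothetical tables): materializing the filtered rows first pins that part of the plan, so the planner can no longer reorder the join around it, at the cost of an extra write.

      CREATE TEMP TABLE recent_orders AS
      SELECT id, customer_id, total
      FROM   orders
      WHERE  created_at > now() - interval '1 day';

      ANALYZE recent_orders;  -- give the planner fresh statistics on the temp table

      SELECT c.name, r.total
      FROM   recent_orders r
      JOIN   customers c ON c.id = r.customer_id;

    There is also the third-party pg_hint_plan extension for this, though it isn't part of core PostgreSQL.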

    • magicalhippo 7 hours ago

      > There is a real need to be able to take key queries and say, "don't change the way you run this query".

      We've hit this with MSSQL too. Suddenly production is down because, for whatever reason, MSSQL decided to forget its good plan and table scan instead, then kept reusing that cached table-scanning plan.

      For one specific query MSSQL likes to do this with at a certain customer, we've so far just added the minutes since the start of the year as a dummy column (sketched below) while we work on more pressing issues. Very blunt, yet it works.
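      To make the trick concrete, a minimal sketch in T-SQL (table and column names are made up; the assumed detail is that the application inlines the minute count as a literal, so the query text changes every minute and MSSQL compiles a fresh plan instead of reusing the cached table-scanning one):

        -- generated by the application; the literal changes each minute
        SELECT o.id, o.total, 57263 AS plan_buster  -- minutes since start of year
        FROM   orders o
        WHERE  o.customer_id = 42;

      The documented, less blunt alternative is to request a fresh plan explicitly:

        SELECT o.id, o.total
        FROM   orders o
        WHERE  o.customer_id = 42
        OPTION (RECOMPILE);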
