DavidWoof 6 hours ago

In OP's defense, "becoming suspicious" doesn't mean it's always wrong. I would definitely suggest an explaining comment if someone is using DISTINCT in a multi-column query.

mbb70 10 hours ago

The very next ask will be "order the zipcodes by number of customers" at which point you'll be back to aggregations, which is where you should have started

  • wvbdmp 9 hours ago

    Anti-Patterns You Should Avoid: overengineering for potential future requirements. Are there real-life cases where you should design with the future in mind? Yes. Are there real-life cases where DISTINCT is the best choice by whatever metric you prioritize at the time? Also yes.

    • RHSeeger 9 hours ago

      > Are there real-life cases where DISTINCT is the best choice by whatever metric you prioritize at the time

      Indeed, along that line, I would say that DISTINCT can be used to convey intent... and doing that in code is important.

      - I want to know the zipcodes we have customers in - DISTINCT

      - I want to know how many customers we have in each zipcode - aggregates

      Can you do the first with the second? Sure.. but the first makes it clear what your goal is.

      • dleeftink 7 hours ago

        Partly in jest, but maybe we need a NON-DISTINCT signaller to convey the inverse and return duplicate values only.

        SOMEWHAT-DISTINCT with a fuzzy threshold would also be useful.

        • RHSeeger 4 hours ago

          I hear you. It's not all _that_ uncommon for me to query for "things with more than one instance". Although, to be fair, it's more common for me to that when grep/sort/uniqing logs on the command line.

  • majormajor 8 hours ago

    Here we start to get close to analytics sql vs application sql, and I think that's a whole separate beast itself with different patterns and anti-patterns.

    • bandrami 4 hours ago

      Ah, yeah, you beat me to it. I do reporting, not applications.

  • bandrami 4 hours ago

    I do reporting, not application development. If somebody wants to know different information I'd write a different query.

  • edoceo 9 hours ago

    count(id) group by post_code order by 1