dtenwolde a day ago

Hi there, leading DuckPGQ developer here :) Thanks for the shoutout! I've been busy working on an internship at DuckDB labs so DuckPGQ has gotten less attention, but I'll get back to it soon (December most likely) and will update the extension to support DuckDB v1.4.0 and v1.4.1 this week hopefully.

adsharma 2 days ago

PGQ requires you to write using SQL and read using a graph query language. GQL is a standalone language that supports reads/writes. But much of the community is still using cypher.

More on this here:

https://adsharma.github.io/beating-the-CAP-theorem-for-graph...

  • aftbit 2 days ago

    As far as I can tell, this has nothing to do with CAP theorem or distributed systems. It's just being used as an analogy.

    > [CAP theorem] states that any distributed storage system can provide only two of these three guarantees: Consistency, Availability and Partition safety.

    > In the realm of graph databases, we observe a similar “two out three” situation. You can either have scalable systems that are not fully open source or you can have open source systems designed for small graphs. Details below.

    (the article follows)

    > This is one solution to the CAP theorem for graphs. We can store a billion scale graph using this method in parquet files and use a free, cheap and open source solution to traverse them, perform joins without storage costs that are prohibitively high.

    • adsharma 2 days ago

      That's right - it was a fun 2 out of 3 analogy.

      The real question being raised in the blog post is - should the next generation graph databases pursue a local-only embedded strategy or build on top of object storage like many non-graph and vector embedded databases are doing.

      Specifically, DuckLake (using system catalog for metadata instead of JSON/YAML) is interesting. I became aware of Apache GraphAr (incubating) after writing the blog post. But it seems to be designed for data interchange between graph databases instead of designing for primary storage.

      • aftbit 2 days ago

        I only mentioned it because I clicked it wondering if someone had found a way to "cheat" CAP for graph databases. When I saw that it was being used as an analogy and not literally, I figured I'd comment.

        I still don't quite get the analogy. What are the 2 out of 3 that you can have? The second paragraph I quoted gives a classic 1 out of 2 dilemma - either scalable _or_ open-source.

        • adsharma a day ago

          DuckDB is scalable (can handle TPC-H 1TB) and open source, but doesn't support graphs natively. It supports some graph queries on a SQL native columnar storage.

          With the proposed solution, you'll be able to query larger graphs on an open source graph native engine. Thus beating the "CAP theorem for graphs".

jabr 2 days ago

DuckPGQ is an interesting option, but unfortunately, that project hasn't been touched in a few months and does not currently work with the latest version of DuckDB.

  • dtenwolde a day ago

    Hi there, leading DuckPGQ developer here. I've been busy with other projects but will get back to it soon enough :)