xfalcox 11 hours ago

In Discourse, embeddings power:

- Related Topics, a list of topics to read next, which uses embeddings of the current topic as the key to search for similar ones

- Suggesting tags and categories when composing a new topic

- Augmented search

- RAG for uploaded files
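
To make the Related Topics idea concrete, here is a minimal sketch of a nearest-neighbour lookup keyed on the current topic's embedding. It is an illustration only, not Discourse's implementation: the `related_topics` function, the in-memory `embeddings` dict, and the choice of cosine similarity are all assumptions, and a real deployment would run the search in the database rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def related_topics(current_id, embeddings, limit=5):
    """Return ids of the topics most similar to `current_id`, best first.

    `embeddings` is a hypothetical {topic_id: vector} mapping used only
    for this sketch.
    """
    query = embeddings[current_id]
    scored = [
        (cosine_similarity(query, vec), topic_id)
        for topic_id, vec in embeddings.items()
        if topic_id != current_id  # never suggest the topic being read
    ]
    # Highest similarity first; ties broken by topic id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [topic_id for _, topic_id in scored[:limit]]


# Toy 2-dimensional vectors; real embeddings have hundreds of dimensions.
toy_embeddings = {
    1: [1.0, 0.0],
    2: [0.9, 0.1],
    3: [0.0, 1.0],
    4: [0.5, 0.5],
}
nearest = related_topics(1, toy_embeddings, limit=2)
```

With the toy vectors above, `nearest` is `[2, 4]`: topic 2 points almost the same way as topic 1, and topic 4 sits halfway between topics 1 and 3.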

  • nextaccountic 5 hours ago

    what does the rag for uploaded files do in discourse?

    also, when i run a discourse search does it really do both a regular keyword search and a vector search? how do you combine results?

    do all discourse instances have those features? for example, internals.rust-lang.org, do they use pgvector?

    • xfalcox 28 minutes ago

      > what does the rag for uploaded files do in discourse?

      You can upload files that will act as RAG files for an AI bot. The bot can also have access to forum content, plus the ability to run tools in our sandboxed JS environment, making it possible for Discourse to host AI bots.

      > also, when i run a discourse search does it really do both a regular keyword search and a vector search? how do you combine results?

      Yes, it does both. The full-page search runs keyword search first, then vector search asynchronously; the user can toggle the vector results in the UI, and they are now toggled on automatically when keyword search returns zero results. Results are combined using reciprocal rank fusion.
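
For readers unfamiliar with reciprocal rank fusion, a minimal sketch follows. Each result earns `1 / (k + rank)` from every list it appears in, and the sums decide the final order. The function name, the topic ids, and `k=60` (the constant commonly used in the RRF literature) are assumptions for illustration; the comment above does not say which constant or weighting Discourse uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    `rankings` is a list of lists, each ordered from best to worst match.
    `k` dampens the influence of top ranks; 60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical topic ids returned by each search backend.
keyword_hits = [101, 205, 317]
vector_hits = [205, 412, 101]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

Here `fused` is `[205, 101, 412, 317]`: topics found by both searches rise to the top, and only the rank positions matter, so the keyword and vector scores never need to be on the same scale.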

      In the quick header search we simply append vector search results to the keyword search results when keyword search returns fewer than 4 results.
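
The header-search fallback can be sketched as below. This is an assumption-laden illustration: `quick_header_search`, the `vector_search` callable, and the de-duplication step are hypothetical and not taken from the Discourse codebase; only the threshold of 4 and the append-after-keyword ordering come from the comment above.

```python
def quick_header_search(keyword_results, vector_search, min_results=4):
    """Return keyword results, topped up with vector results when sparse.

    `vector_search` is a zero-argument callable so the (more expensive)
    vector query only runs when keyword search comes up short.
    """
    if len(keyword_results) >= min_results:
        return list(keyword_results)

    merged = list(keyword_results)
    seen = set(keyword_results)
    for topic_id in vector_search():
        # Assumed behaviour: skip topics the keyword search already found.
        if topic_id not in seen:
            seen.add(topic_id)
            merged.append(topic_id)
    return merged


# Hypothetical topic ids; the lambda stands in for a real vector query.
sparse = quick_header_search([7, 9], lambda: [9, 12, 15])
plenty = quick_header_search([1, 2, 3, 4], lambda: [99])
```

In this sketch `sparse` becomes `[7, 9, 12, 15]`, while `plenty` stays `[1, 2, 3, 4]` because the keyword search already returned 4 results.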

      > do all discourse instances have those features? for example, internals.rust-lang.org, do they use pgvector?

      Yes, all use pgvector. On our hosting, all instances have the vector features enabled by default, and we run embeddings using https://github.com/huggingface/text-embeddings-inference

  • dpflan 8 hours ago

    Thanks for the details. Also, always appreciated Discord's engineering blog posts. Lots of interesting stories, and nice to see a company discuss using Elixir at scale.