Comment by xfalcox 11 hours ago

In Discourse, embeddings power:

- Related Topics, a list of topics to read next, which uses embeddings of the current topic as the key to search for similar ones

- Suggesting tags and categories when composing a new topic

- Augmented search

- RAG for uploaded files
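
To illustrate the Related Topics idea, here is a minimal sketch of using one topic's embedding as the query to rank the others. It is not Discourse's code: the names `cosine_similarity` and `related_topics` are hypothetical, the in-memory dict stands in for a vector index, and cosine similarity is an assumed metric, since the comment does not say which distance is used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def related_topics(current_id, embeddings, limit=5):
    """Rank every other topic by similarity to the current topic's embedding.

    `embeddings` is a hypothetical dict mapping topic id -> vector.
    """
    query = embeddings[current_id]
    scored = [
        (cosine_similarity(query, vector), topic_id)
        for topic_id, vector in embeddings.items()
        if topic_id != current_id  # a topic is not related to itself
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [topic_id for _, topic_id in scored[:limit]]


# Toy 2-dimensional embeddings; real ones have hundreds of dimensions.
toy_embeddings = {
    1: [1.0, 0.0],
    2: [0.9, 0.1],
    3: [0.0, 1.0],
    4: [-1.0, 0.0],
}
print(related_topics(1, toy_embeddings, limit=2))
```

A linear scan like this only works for small collections; a vector index does the same ranking without comparing against every row.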

nextaccountic 5 hours ago

what does the rag for uploaded files do in discourse?

also, when i run a discourse search does it really do both a regular keyword search and a vector search? how do you combine results?

does all discourse instances have those features? for example, internals.rust-lang.org, do they use pgvector?

  • xfalcox 30 minutes ago

    > what does the rag for uploaded files do in discourse?

    You can upload files that will act as RAG files for an AI bot. The bot can also have access to forum content, plus the ability to run tools in our sandboxed JS environment, making it possible for Discourse to host AI bots.

    > also, when i run a discourse search does it really do both a regular keyword search and a vector search? how do you combine results?

    Yes, it does both. In full-page search it runs the keyword search first, then the vector search asynchronously. The user can toggle the vector results in the UI, and they are now toggled on automatically when the keyword search returns zero results. Results are combined using reciprocal rank fusion.
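
For readers unfamiliar with reciprocal rank fusion, a minimal sketch follows. It is not taken from Discourse: the function name is hypothetical, and `k=60` is the constant commonly used with RRF, not a value stated in this thread. Each result scores `1 / (k + rank)` in every list it appears in, and the scores are summed.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids into one list.

    `rankings` is a list of lists, each ordered best-first.
    `k` dampens the influence of top ranks (60 is an assumed default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order (sorted is stable).
    return sorted(scores, key=lambda doc_id: -scores[doc_id])


keyword_hits = ["a", "b", "c"]  # hypothetical keyword search results
vector_hits = ["b", "d", "a"]   # hypothetical vector search results
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

RRF only needs ranks, so it avoids having to normalize keyword relevance scores against vector distances, which live on different scales.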

    In the quick header search, we simply append vector search results to the keyword search results when the keyword search returns fewer than 4 results.
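
The header-search fallback could be sketched as below. Everything here is an assumption for illustration: the function and parameter names are hypothetical, the vector search is passed as a callable so it only runs when needed, and skipping ids the keyword search already returned is an added guess, since the comment only says the results are appended.

```python
def quick_header_results(keyword_results, vector_search, min_keyword_results=4):
    """Return keyword results, topped up with vector results when there are few.

    `vector_search` is a hypothetical zero-argument callable returning ids.
    """
    results = list(keyword_results)
    if len(results) < min_keyword_results:
        seen = set(results)
        for topic_id in vector_search():
            if topic_id not in seen:  # assumed: skip ids already listed
                results.append(topic_id)
                seen.add(topic_id)
    return results


# Two keyword hits is below the threshold, so vector hits are appended.
print(quick_header_results([10, 11], lambda: [11, 12, 13]))
```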

    > does all discourse instances have those features? for example, internals.rust-lang.org, do they use pgvector?

    Yes, all use pgvector. In our hosting, all instances have the vector features enabled by default, and we run embeddings using https://github.com/huggingface/text-embeddings-inference

dpflan 8 hours ago

Thanks for the details. Also, always appreciated Discord's engineering blog posts. Lots of interesting stories, and nice to see a company discuss using Elixir at scale.