Comment by DHRicoF 2 days ago

You need to provide more information.

Is your data organized, or is it just a dump of unrelated content?

- If you have a bag of files without any metadata, the best option is to build something like a RAG pipeline, with an OCR pre-processing step for image files (or even a multimodal model call).

- If the content is well organized with a logical structure, an agent could extract the information just by looking around a little.
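To make the "bag of files" case concrete, here is a rough sketch of just the retrieval half, using only the Python standard library. It swaps the embeddings a real RAG setup would use for plain keyword scoring (term frequency weighted by a smoothed IDF), and `ocr_image` is a hypothetical placeholder for the OCR or multimodal step, not a real library call. The suffix lists and function names are my own assumptions too.

```python
import math
import re
from collections import Counter
from pathlib import Path

# Assumption: only these plain-text types are read directly.
TEXT_SUFFIXES = {".txt", ".md", ".csv"}
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def ocr_image(path: Path) -> str:
    """Hypothetical placeholder for the OCR / multimodal pre-step."""
    return ""  # plug a real OCR engine in here


def extract_text(path: Path) -> str:
    """Return the searchable text of a file, or '' if the type is unsupported."""
    suffix = path.suffix.lower()
    if suffix in TEXT_SUFFIXES:
        return path.read_text(encoding="utf-8", errors="ignore")
    if suffix in IMAGE_SUFFIXES:
        return ocr_image(path)
    return ""


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(root: Path) -> dict[str, Counter]:
    """Map each file path under root to its token counts (empty files skipped)."""
    index: dict[str, Counter] = {}
    for path in root.rglob("*"):
        if path.is_file():
            tokens = tokenize(extract_text(path))
            if tokens:
                index[str(path)] = Counter(tokens)
    return index


def search(index: dict[str, Counter], query: str, top_k: int = 3) -> list[str]:
    """Rank files by summed term frequency times smoothed IDF; best match first."""
    n_docs = len(index)
    scores: dict[str, float] = {}
    for term in tokenize(query):
        df = sum(1 for counts in index.values() if term in counts)
        if df == 0:
            continue
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        for doc, counts in index.items():
            if term in counts:
                scores[doc] = scores.get(doc, 0.0) + counts[term] * idf
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Something like `search(build_index(Path.home() / "Documents"), "invoice 2019")` would hand back the top candidate files, which you then pass to the LLM as context. PDFs and spreadsheets would need their own extractors.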

Is it static, or does it change day by day?

- If it is static, you could index everything at once; if not, an agent that picks what to reindex would be a better call.
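For the changing case, the "pick what to reindex" part does not have to be an agent at first. A minimal sketch, assuming a modification-time-plus-size signature is good enough to detect changes (a content hash would be stricter but slower): take a snapshot of the tree on each run and diff it against the previous one. The function names are made up for illustration.

```python
from pathlib import Path

Signature = tuple[int, int]  # (modification time in ns, size in bytes)


def snapshot(root: Path) -> dict[str, Signature]:
    """Record a cheap change signature for every file under root."""
    snap: dict[str, Signature] = {}
    for path in root.rglob("*"):
        if path.is_file():
            st = path.stat()
            snap[str(path)] = (st.st_mtime_ns, st.st_size)
    return snap


def plan_reindex(
    old: dict[str, Signature], new: dict[str, Signature]
) -> tuple[list[str], list[str]]:
    """Return (files to index or reindex, files to drop from the index)."""
    changed = sorted(p for p, sig in new.items() if old.get(p) != sig)
    removed = sorted(p for p in old if p not in new)
    return changed, removed
```

A background job would persist the last snapshot (as JSON, say), call `snapshot` again on a timer, and only send the `changed` list through the expensive extraction and indexing steps.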

I'm not aware of an existing solution like this, but it seems doable as an MCP server. The cost will scale quickly, though.

oblio a day ago

I want the LLM to search my hard drives, including for file contents.

I have tons of old invoices, spreadsheets created to quickly figure something out, etc.

I'd also want the tool to run in the background to update the index.

I've found something potentially interesting:

https://anythingllm.com/