Comment by Patrick_Devine

Comment by Patrick_Devine 10 months ago

If you don't want to make direct API calls, there are actual official Ollama python bindings[1]. Cool project though!

[1] https://github.com/ollama/ollama-python

punnerud 10 months ago

Nice, thanks for the feedback. I have a prototype that also uses the embeddings to categorize the steps with "tags/labels". I almost take it as a challenge to reason better with a smaller model than those >70B ones that you cannot run on your own laptop.


Patrick_Devine 10 months ago

I actually built something similar to this a couple days ago for finding duplicate bugs in our gh repo. Some differences:
* I used JSON to store the blobs in sqlite instead of converting them to byte form (I think they're roughly equivalent in the end?)
* For the distance calculations I use `numpy.linalg.norm(a-b)` to subtract the two vectors and then take the norm
* `ollama.embed()` and `ollama.generate()` will cut down on the requests code
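
A minimal sketch of the first two points, assuming toy three-dimensional vectors in place of real embeddings (in practice they would come from an embedding call such as `ollama.embed()`). The table name, column names, bug titles, and helper functions are all hypothetical and only illustrate storing vectors as JSON text in sqlite and comparing them with `numpy.linalg.norm(a-b)`:

```python
import json
import sqlite3

import numpy as np

# Hypothetical toy embeddings; real ones would come from an embedding model.
bugs = {
    "crash on startup": [0.9, 0.1, 0.0],
    "app crashes when launched": [0.85, 0.15, 0.05],
    "typo in README": [0.0, 0.2, 0.95],
}

# Store each vector as JSON text rather than as raw bytes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (title TEXT PRIMARY KEY, vector TEXT)")
conn.executemany(
    "INSERT INTO embeddings VALUES (?, ?)",
    [(title, json.dumps(vec)) for title, vec in bugs.items()],
)


def load_vectors(conn):
    """Read every row back and decode the JSON text into a NumPy array."""
    rows = conn.execute("SELECT title, vector FROM embeddings").fetchall()
    return {title: np.array(json.loads(vec)) for title, vec in rows}


def closest(title, vectors):
    """Return the other title with the smallest Euclidean distance."""
    query = vectors[title]
    distances = (
        (np.linalg.norm(query - vec), other)
        for other, vec in vectors.items()
        if other != title
    )
    return min(distances)[1]


vectors = load_vectors(conn)
nearest = closest("crash on startup", vectors)
print(nearest)
```

JSON text is larger on disk than a packed byte blob, but it stays readable when inspecting the database by hand, which may be why the two approaches feel roughly equivalent at this scale.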

homarp 10 months ago

Can you use https://github.com/abetlen/llama-cpp-python, or do you need something that ollama provides?
Speaking of embeddings, have you seen https://jina.ai/news/jina-embeddings-v3-a-frontier-multiling... ?

- punnerud 10 months ago
  
  Switching to a low-level integration will probably not improve the speed; the waiting is primarily on the llama text generation.
  Should be easy to switch embeddings.
  Already playing with adding different tags to previous answers using embeddings, then using that to improve the reasoning.
  