Comment by Patrick_Devine a day ago

I actually built something similar to this a couple days ago for finding duplicate bugs in our gh repo. Some differences:

* I used JSON to store the embeddings in SQLite instead of converting them to byte form (I think they're roughly equivalent in the end?)
* For the distance calculations I use `numpy.linalg.norm(a-b)`, which subtracts the two vectors and takes the norm of the difference (i.e. the Euclidean distance)
* `ollama.embed()` and `ollama.generate()` will cut down on the requests code
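
For anyone curious, here is a rough sketch of the first two points, not my actual code. The table name, column names, and helper functions are made up for illustration, and the vectors are toy values rather than real embeddings. It stores each vector as JSON text in SQLite, reads it back, and computes the distance with `numpy.linalg.norm`. The last lines do the same round trip through raw float32 bytes for comparison.

```python
import json
import sqlite3

import numpy as np


def store_embedding(conn, bug_id, vector):
    """Store a vector as JSON text (hypothetical schema, for illustration)."""
    conn.execute(
        "INSERT OR REPLACE INTO embeddings (bug_id, vector) VALUES (?, ?)",
        (bug_id, json.dumps(list(vector))),
    )


def load_embedding(conn, bug_id):
    """Read the JSON text back and turn it into a NumPy array."""
    row = conn.execute(
        "SELECT vector FROM embeddings WHERE bug_id = ?", (bug_id,)
    ).fetchone()
    return np.array(json.loads(row[0]), dtype=np.float64)


def euclidean_distance(a, b):
    """Norm of the difference between two vectors."""
    return float(np.linalg.norm(a - b))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (bug_id INTEGER PRIMARY KEY, vector TEXT)")

# Toy vectors standing in for real embeddings.
store_embedding(conn, 1, [0.0, 0.0, 0.0])
store_embedding(conn, 2, [3.0, 4.0, 0.0])

a = load_embedding(conn, 1)
b = load_embedding(conn, 2)
distance = euclidean_distance(a, b)

# The byte-form alternative: pack as float32 and unpack again.
blob = np.asarray(b, dtype=np.float32).tobytes()
from_blob = np.frombuffer(blob, dtype=np.float32)
same_vector = bool(np.allclose(from_blob, b))
```

The JSON column is easier to inspect by hand, while the float32 blob is more compact (4 bytes per value) at the cost of some precision. For a nearest-duplicate lookup either should work.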