Comment by punnerud a day ago

Switching to a low-level integration will probably not improve the speed; most of the waiting is on llama's text generation.

Should be easy to switch embeddings.

I'm already playing with adding different tags to previous answers using embeddings, then using those tags to improve the reasoning.
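A rough sketch of what embedding-based tagging could look like, not the actual implementation. Everything here is an assumption for illustration: `toy_embed` is a hypothetical stand-in for a real embedding model, and `tag_answer`, `tag_examples` and the `threshold` value are made-up names. The embedder is passed in as a parameter, which is one way to keep switching embeddings easy.

```python
import math
from typing import Callable, Dict, List

# Any function mapping text to a vector can be plugged in here.
Embedder = Callable[[str], List[float]]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real embedding model: letter counts a-z."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def tag_answer(
    answer: str,
    tag_examples: Dict[str, str],
    embed: Embedder = toy_embed,
    threshold: float = 0.5,  # arbitrary cutoff, assumed for this sketch
) -> List[str]:
    """Return the tags whose example text is similar enough to the answer,
    ordered from most to least similar."""
    vec = embed(answer)
    scored = [(cosine(vec, embed(example)), tag)
              for tag, example in tag_examples.items()]
    return [tag for score, tag in sorted(scored, reverse=True)
            if score >= threshold]
```

The tags for earlier answers could then be fed back into the prompt for the next step. Swapping in a real model would only mean passing a different `embed` function.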