Comment by Eisenstein 2 days ago

Download koboldcpp and the Llama 3.1 GGUF weights, and run it with the Llama 3 completions adapter.

Edit the 'background.js' file in the extension and replace the OpenAI endpoint with

'http://your.local.ip.addr:5001/v1/chat/completions'

Set anything you want as an API key. Now you have a truly local version.
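The exact code in background.js will vary by extension, but the patched request would look roughly like the sketch below. The chatCompletion name, the constants, and the payload shape are illustrative, not the extension's actual code; what's grounded is that koboldcpp exposes an OpenAI-compatible /v1/chat/completions endpoint on port 5001 by default and doesn't validate the API key.

    // background.js -- point the extension at the local koboldcpp server
    // instead of api.openai.com (names and payload shape are illustrative).
    const API_URL = "http://your.local.ip.addr:5001/v1/chat/completions";
    const API_KEY = "anything-works-locally"; // koboldcpp accepts any value

    async function chatCompletion(messages) {
      const response = await fetch(API_URL, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Authorization": `Bearer ${API_KEY}`,
        },
        body: JSON.stringify({
          // koboldcpp serves whichever model it loaded, so the name here
          // doesn't matter much
          model: "local-model",
          messages, // e.g. [{ role: "user", content: "Hello" }]
        }),
      });
      const data = await response.json();
      // Same response shape as the OpenAI chat completions API
      return data.choices[0].message.content;
    }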

* https://github.com/LostRuins/koboldcpp/releases

* https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-...

* https://github.com/LostRuins/koboldcpp/blob/concedo/kcpp_ada...