Comment by simonw a day ago


Anyone seen a low-friction way to run prompts through this yet, either via a hosted API or chat UI or a convenient GGML or MLX build that runs in Ollama or llama.cpp or LM Studio?

throwaway18875 21 hours ago

I'm currently running it in LM Studio, which can download it from Hugging Face. It generates incoherent text, though:

===

You:

I pray you, who is this Master Newton?

timecapsulellm-v2-1800-1875-mlx:

TI offer to pay you the very same fee as you did before. It was not in the power of your master to deliver the letter to your master. He did. I will be with you as soon as I can keep my word. It is not at all clear, whether the letter has been sent or not. It is not at all clear: but it is clear also that it was written by the person who gave it. "No," I said, "I cannot give it to you." There, the letter was sent to me. "The letter is yours, I believe," I said. "But, I hope, you will not refuse to give it to me?

t1amat a day ago

Not a direct answer, but it looks like v0.5 uses a nanoGPT architecture and v1 a Phi 1.5 architecture, which should be well supported by quantization utilities for any engine. They're small, too, and should run on a potato.

alansaber a day ago

I too have completely forgotten how the adapters library works and would have appreciated a simple inference script.
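For what it's worth, if the v1 checkpoint really is a standard Phi 1.5-style causal LM on Hugging Face, a bare-bones inference script with `transformers` might look like the sketch below. The repo id is a placeholder (I don't know the actual one), and this assumes a plain completion-style model rather than a chat-tuned one.

```python
MODEL_ID = "your-namespace/timecapsulellm-v1"  # placeholder repo id, not the real one


def build_prompt(question: str) -> str:
    # The model is presumably not chat-tuned, so hand it raw text to continue.
    return question.strip() + "\n"


def main() -> None:
    # Heavy imports kept local so the helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(
        build_prompt("I pray you, who is this Master Newton?"),
        return_tensors="pt",
    )
    output = model.generate(
        **inputs,
        max_new_tokens=120,
        do_sample=True,
        temperature=0.8,
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

Nothing here is specific to this model beyond the placeholder id; the same pattern works for any Hugging Face causal LM small enough to run on CPU.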