Comment by Reubend a day ago

Let me offer some feedback, since almost all of the comments here are negative. The latency is very good, almost too good, since it often interrupts me. So I think that's a great achievement for an open-source model.

However, people here have been spoiled by incredibly good LLMs lately. And the responses that this model gives are nowhere near the quality of today's SOTA models in terms of content. It reminds me more of the LLMs we saw back in 2019.

So I think you've done a "good enough" job on the audio side of things, and further focus should be entirely on the quality of the responses instead.

08d319d7 21 hours ago

Wholeheartedly agree. Latency is good, nice tech (Rust! Running at the edge on a consumer-grade laptop!). I guess a natural question is: are there options to transplant a “better LLM” into Moshi without degrading the experience?

  • aversis_ 18 hours ago

    But tbh "better" is subjective here. Would a new LLM improve user interactions significantly? Seems like people get obsessed with shiny new models without asking if they're actually adding value.

  • Kerbonut 14 hours ago

    With Flux, they have been able to separate out the UNet. I wonder if something similar could be done here so parts of it can be swapped.
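
A rough sketch of what that kind of swap could look like, assuming a cascaded design where the text backbone sits behind a narrow interface. All names here are hypothetical, not Moshi's actual API:

    from abc import ABC, abstractmethod

    class TextBackbone(ABC):
        """Anything that turns a text prompt into a text reply."""
        @abstractmethod
        def generate(self, prompt: str) -> str: ...

    class EchoBackbone(TextBackbone):
        """Placeholder backbone, used only to exercise the pipeline."""
        def generate(self, prompt: str) -> str:
            return f"(echo) {prompt}"

    class SpeechPipeline:
        """Audio in -> transcribe -> backbone -> synthesize -> audio out.
        The ASR and TTS stages are stubbed with byte/str conversions."""
        def __init__(self, backbone: TextBackbone):
            self.backbone = backbone

        def transcribe(self, audio: bytes) -> str:
            return audio.decode("utf-8")        # stand-in for streaming ASR

        def synthesize(self, text: str) -> bytes:
            return text.encode("utf-8")         # stand-in for streaming TTS

        def respond(self, audio: bytes) -> bytes:
            prompt = self.transcribe(audio)
            reply = self.backbone.generate(prompt)
            return self.synthesize(reply)

    pipeline = SpeechPipeline(EchoBackbone())   # any TextBackbone can be dropped in here
    print(pipeline.respond(b"hello there"))

The catch, as far as I understand Moshi's design, is that the text and audio tokens come out of the same transformer rather than from separate stages, so there is no seam this clean to cut along; the sketch only illustrates what "swappable parts" would mean in a cascaded setup.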