Comment by pzo a day ago

Kokoro gives great results, especially when speaking English. The model is small enough to run even on a smartphone, roughly 3x faster than real time.

miki123211 11 hours ago

Kokoro just proves my point: it's "one guy in a garage", 1,000 hours of distilled audio (I think), and ~100M params.

With a budget one tenth that of Stable Diffusion and fewer ethical qualms, you could easily 10x or 100x this.

bavell a day ago

Another +1 to Kokoro from me, great quality with good speed.