Comment by swores 4 days ago

7 replies

Can anyone recommend an open-source option for training a TTS model on a custom voice? It would be my own voice, so I can record as many snippets as training needs, and I want to generate speech locally without the voice ever leaving my machine.

Edit: I'll wait to see whether any recommendations come in here. If not, I might give this one a go: https://github.com/coqui-ai/TTS

hm64 4 days ago

Coqui is great, but in practice I found Piper easier to set up, train, and deploy as an ONNX file. Big thanks to the Sherpa development team for their helpful resources (https://k2-fsa.github.io/sherpa/onnx/tts/piper.html) and to the Rhasspy team for their training guide (https://github.com/rhasspy/piper/blob/master/TRAINING.md).

I also found Demucs + Whisper + pydub to be a super helpful combo for creating quality datasets.
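For anyone wondering what that combo might look like in practice, here is a rough sketch rather than my exact pipeline. It assumes Demucs has already been run to isolate the vocals, then uses pydub to cut the recording on silences and Whisper to transcribe each clip into an LJSpeech-style `id|text` metadata file. The function names, the `wav/` and `metadata.csv` layout, the silence thresholds, and the 22050 Hz mono output are all assumptions for illustration; check them against the training guide for your chosen model.

```python
from pathlib import Path


def clean_transcript(text: str) -> str:
    """Collapse whitespace and drop '|' so text is safe in a pipe-delimited file."""
    return " ".join(text.replace("|", " ").split())


def write_metadata(rows, out_path) -> int:
    """Write (clip_id, transcript) pairs as 'id|text' lines.

    Clips whose transcript is empty after cleaning are skipped.
    Returns the number of lines written.
    """
    lines = []
    for clip_id, text in rows:
        cleaned = clean_transcript(text)
        if cleaned:
            lines.append(f"{clip_id}|{cleaned}")
    Path(out_path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)


def build_dataset(recording, out_dir, model_name="base") -> int:
    """Split one vocals-only recording into clips and transcribe each one.

    Hypothetical helper: assumes pydub (with ffmpeg) and openai-whisper
    are installed. Heavy imports are kept local so the helpers above
    work without them.
    """
    import whisper
    from pydub import AudioSegment
    from pydub.silence import split_on_silence

    out_dir = Path(out_dir)
    wav_dir = out_dir / "wav"
    wav_dir.mkdir(parents=True, exist_ok=True)

    audio = AudioSegment.from_file(str(recording))
    # Thresholds are guesses; tune them for your microphone and room.
    chunks = split_on_silence(
        audio,
        min_silence_len=500,              # ms of silence that ends a clip
        silence_thresh=audio.dBFS - 16,   # relative to the average loudness
        keep_silence=200,                 # ms of padding kept on each side
    )

    model = whisper.load_model(model_name)
    rows = []
    for i, chunk in enumerate(chunks):
        clip_id = f"clip_{i:04d}"
        clip_path = wav_dir / f"{clip_id}.wav"
        # Mono 22050 Hz is an assumed target format, not a requirement.
        chunk.set_channels(1).set_frame_rate(22050).export(str(clip_path), format="wav")
        result = model.transcribe(str(clip_path))
        rows.append((clip_id, result["text"]))

    return write_metadata(rows, out_dir / "metadata.csv")
```

Something like `build_dataset("vocals.wav", "my_voice_dataset")` would then leave you with a folder of clips plus a transcript file. It is still worth proofreading the transcripts by hand, since Whisper's mistakes end up in the training data.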

numpad0 4 days ago

I think you can probably generate the TTS audio by classical means, then run that audio through a voice-to-voice converter such as RVC or Beatrice V2. I haven't looked into it in a while, but Beatrice is apparently super fast and CPU-only.

esskay 4 days ago

If I recall correctly, Coqui is very much a dead project. Just something to be aware of.