Comment by samat 19 hours ago

How hard is it to make TTS out of this? A few independent journalists from Belarus asked for TTS in their language, but I am no expert; I was thinking about re-using Mozilla's work. What's the easiest way to get working TTS for a language?

woodson 17 hours ago

EDIT: My bad, please disregard. As akreal pointed out, the MMS TTS models don't use the SSL models.

Original post:

You can use the OmniASR SSL models instead of their older MMS models to create TTS models: https://github.com/ylacombe/finetune-hf-vits

akreal 16 hours ago

As far as I understand, the MMS TTS models are trained from scratch (section 7.1 of [1]); they do not employ any SSL models. So the OmniASR SSL models are not useful here.
What might be interesting is the newly released OmniASR data, because the MMS data, which was used for the MMS TTS models, was never released.
Also, OmniASR can be used to transcribe untranscribed speech, which can then be used to train a TTS model.
[1] MMS paper: https://arxiv.org/pdf/2305.13516
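
A minimal sketch of that last idea (transcribe untranscribed audio, then use the result as TTS training data). Everything here is an assumption for illustration: `build_manifest` and the `transcribe` callable are hypothetical names, `transcribe` stands in for whatever ASR model you wrap (the actual OmniASR API is not shown), and the `id|text` layout is the LJSpeech-style convention that many TTS training recipes accept.

```python
from pathlib import Path
from typing import Callable, Iterable


def build_manifest(
    wav_paths: Iterable[Path],
    transcribe: Callable[[Path], str],  # hypothetical wrapper around your ASR model
    out_path: Path,
    min_chars: int = 3,
) -> int:
    """Write an `id|text` manifest and return the number of clips kept."""
    kept = 0
    with out_path.open("w", encoding="utf-8") as f:
        for wav in sorted(wav_paths):
            # Collapse runs of whitespace and drop the field separator from the text.
            text = " ".join(transcribe(wav).replace("|", " ").split())
            if len(text) < min_chars:
                continue  # skip clips with empty or near-empty transcripts
            f.write(f"{wav.stem}|{text}\n")
            kept += 1
    return kept
```

You would call it with something like `build_manifest(Path("clips").glob("*.wav"), my_asr, Path("metadata.csv"))`, where `my_asr` is your own function. ASR output contains errors, so the transcripts are worth reviewing by hand before training on them.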

- woodson 15 hours ago
  
  You’re completely right, I misremembered. I edited my post.
  
willwade 16 hours ago

Meta cheated with the MMS models: they didn't use a phonemizer step. This means they just won't work or will sound very strange. ASR data is usually not quite right for TTS. Anyhow, this doesn't really answer your question, but many of these languages are already covered by MMS. Try them: https://huggingface.co/spaces/willwade/sherpa-onnx-tts

kulahan 18 hours ago

TFA says it's extremely easy to add new languages with just a few examples. I didn't see specifics on how "few" it really is, though.

nl 17 hours ago

This is ASR, not TTS, though.
