Comment by Lienetic

Comment by Lienetic 6 days ago

4 replies

Very cool, congrats on the launch! What's your plan for when one of the larger players like ElevenLabs or Google adds support for these languages? I would guess the reason they haven't is that they don't see a large opportunity. How are you thinking about it?

muhammadbsabir 6 days ago

Thanks! You’re right, the big players mostly ignore these languages. The additional challenge is the lack of online data, so we spend a lot of effort on data collection and labeling on the ground.

Also, companies like ElevenLabs and Deepgram have done well by focusing on specific use cases, even though the big labs are amazing at English.

Right now these languages are underserved, so there’s a window to build the best models for them.

hammadmlk 6 days ago

I think the voice models market will be like eCommerce. There will be no global winner; instead, there will be a few regional winners, each of them really big.

We plan to be one of those winners.

  • chirau 6 days ago

    What does it take to build such a model? As in, the key steps. And how expensive does it get? I might be interested in being a regional player and winner as well, lol. In my own corner of the world in Africa.

    • hammadmlk 6 days ago

      Not much... Just the willingness to work hard on this problem instead of other problems where large revenue is perhaps quicker :)

      Ingredients: decent audio scraping skills, great voice actors hired for each language, algos to gather text/audio with diverse phonetics, and decent ML skills (enough to merge the best features of a few different papers). Lots and lots of data labels (and your own tools to get the data labeled efficiently). And finally, GPUs!!!!

      None of this is technically hard... the hardest thing is working with Voice Actors (oh man!!!)
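
      For anyone wondering what "algos to gather text/audio with diverse phonetics" could look like, here is a rough sketch. It is not the method described above, just one common approach: greedily pick the sentences that add the most unseen sound units. The function names are made up for illustration, and character bigrams are used as a crude stand-in for real phonetic units, so a proper phonemizer for the target language would likely replace `units`.

```python
import unicodedata
from typing import Iterable


def units(sentence: str, n: int = 2) -> set[str]:
    """Character n-grams per word, a crude proxy for phonetic units (assumption)."""
    grams: set[str] = set()
    for word in sentence.lower().split():
        # Keep letters and combining marks so non-Latin scripts survive intact.
        cleaned = "".join(ch for ch in word if unicodedata.category(ch)[0] in "LM")
        if not cleaned:
            continue
        padded = f"_{cleaned}_"  # mark word boundaries
        for i in range(len(padded) - n + 1):
            grams.add(padded[i:i + n])
    return grams


def select_diverse(sentences: Iterable[str], budget: int) -> list[str]:
    """Greedily choose up to `budget` sentences, each adding the most new units."""
    pool = [(s, units(s)) for s in sentences]
    covered: set[str] = set()
    chosen: list[str] = []
    while pool and len(chosen) < budget:
        best = max(pool, key=lambda item: len(item[1] - covered))
        if not best[1] - covered:
            break  # nothing left adds new coverage
        chosen.append(best[0])
        covered |= best[1]
        pool.remove(best)
    return chosen
```

      Calling `select_diverse(candidate_sentences, budget=500)` would give a recording script that covers many distinct units with few sentences, which matters when every sentence costs voice-actor time.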