Comment by jonsoft 5 days ago

I asked the Spanish tutor whether it was familiar with the terms seseo[0] and ceceo[1], and it said it wasn't, which surprised me. Ideally it would be possible to choose which Spanish dialect to practise, as mainland Spain pronunciation is very different to Latin America. In general, it didn't convince me that it was really hearing how I was pronouncing words, which is an important part of learning a language. I would say the tutor is useful for intermediate and advanced speakers but not for beginners, because of this and the speed at which it speaks.

At one point, subtitles written in pseudo-Chinese characters were shown; I can send a screenshot if that would be useful.

The latency was slightly distracting and, as others have commented, the NVIDIA Personaplex demos [2] are very impressive in this regard.

In general, a very positive experience, thank you.

[0] https://en.wikipedia.org/wiki/Phonological_history_of_Spanis...

[1] https://en.wikipedia.org/wiki/Phonological_history_of_Spanis...

[2] https://research.nvidia.com/labs/adlr/personaplex/

andrew-w 5 days ago

Thanks for the feedback. The current avatars use an STT-LLM-TTS pipeline (rather than true speech-to-speech), which limits nuanced understanding of pronunciation: the LLM only sees the transcribed text, not the audio. Speech-to-speech models should solve this problem. (Counterintuitively, the ones we've tried so far have not been fast enough.)
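
To illustrate why a cascaded pipeline struggles here, below is a toy sketch, not the actual implementation. Every name in it (Utterance, speech_to_text, language_model, text_to_speech, cascaded_pipeline) is hypothetical, and each stage is a stub. It only shows that once the STT stage reduces audio to words, a seseo and a non-seseo pronunciation of the same word look identical to everything downstream.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Toy stand-in for learner audio (hypothetical representation)."""
    words: str     # what the speaker intended to say
    phonemes: str  # how it was actually pronounced, IPA-like


def speech_to_text(utt: Utterance) -> str:
    # Hypothetical STT stage: returns only the recognized words.
    # The pronunciation detail in utt.phonemes is dropped here.
    return utt.words


def language_model(transcript: str) -> str:
    # Hypothetical LLM stage: it sees text only, never audio.
    return f"You said: {transcript}"


def text_to_speech(reply: str) -> bytes:
    # Hypothetical TTS stage: placeholder that encodes the reply as bytes.
    return reply.encode("utf-8")


def cascaded_pipeline(utt: Utterance) -> bytes:
    # STT -> LLM -> TTS, each stage consuming only the previous stage's output.
    return text_to_speech(language_model(speech_to_text(utt)))


# "gracias" pronounced with distinción (θ) and with seseo (s).
distincion = Utterance(words="gracias", phonemes="ˈɡɾaθjas")
seseo = Utterance(words="gracias", phonemes="ˈɡɾasjas")

# Both pronunciations produce the same response, because the
# difference was discarded at the STT stage.
print(cascaded_pipeline(distincion) == cascaded_pipeline(seseo))  # True
```

A speech-to-speech model would, in principle, keep the acoustic information available to the part of the system that generates the reply, which is why it is expected to help with pronunciation feedback.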

sid-the-kid 5 days ago

ooof. You saw the Chinese text. Yup, that's super annoying. We are trying to squash that hallucination.

Thanks for the feedback! That's helpful!

  • Terretta 4 days ago

    the chinese text happened last night in your main chat agent widget, the cartoon woman professing to be in a town in brazil with a lemon tree on her cupboard. she claimed it was a test of subtitling then admitted it wasn't.

    btw, she gives helpful instructions like "/imagine" whatever but the instructions only seem to work about 50% of the time. meaning, try the same command or variants a few times, and it works about half of them. she never did shift out of aussie accent though.

    she came up with a remarkably fanciful explanation for why, as a brazilian, she sounded aussie, and for why imagining a native accent didn't work even though she said it would...

    i was shocked when /imagine face left turn to the side did actually work, the agent was in side profile and precisely as natural as the original front facing avatar

    all in all, by far the best agent experience i've played with!

    • andrew-w 4 days ago

      So glad you enjoyed it! We've been able to significantly reduce those text hallucinations with a few tricks, but it seems they haven't been fully squashed. The /imagine command only works with the image at the moment, but we'll think about ways to tie that into the personality and voice. Thanks for the feedback!