Comment by Versipelle 2 days ago

13 replies

This is really impressive; we're getting close to a dream of mine: the ability to generate proper audiobooks from EPUBs. Not just a robotic single voice for everything, but different, consistent voices for each protagonist, with the LLM analyzing the text to guess which voice to use and add an appropriate tone, much like a voice actor would do.

I've tried "EPUB to audiobook" tools, but they are miles behind what a real narrator accomplishes, and they make the audiobook impossible to engage with.

mclau157 2 days ago

Realistic voice acting for audio books, realistic images for each page, realistic videos for each page, oh wait I just created a movie, maybe I can change the plot? Oh wait I just created a video game

  • hleszek 2 days ago

    Now do it in VR and make it fully interactive.

azinman2 2 days ago

Wouldn’t it be more desirable to hear an actual human on an audiobook? Ideally the author?

  • Versipelle 2 days ago

    > Wouldn’t it be more desirable to hear an actual human on an audiobook? Ideally the author?

    Of course, but it's not always available.

    For example, I would love an audiobook for Stanisław Lem's "The Invincible," as I just finished its video game adaptation, yet it simply doesn't exist in my native language.

    The author seldom narrates the audiobooks I listen to, and sometimes the narrator does a horrible job, butchering the characters with exaggerated tones.

  • satvikpendem 2 days ago

    Why a human? There are many cases where I like a book but dislike the audiobook speaker, so I essentially can't listen to that book anymore. With a machine, I can tweak the voice to my heart's content.

    • iamsaitam 2 days ago

      And get a completely wrong/bland but custom read of the book. Reading is much more than simply transforming text to audio.

      • satvikpendem a day ago

        Sometimes I don't care if it's bland; I just want to listen to the text. There are a lot of Asian light novels, for example, that never get English audiobooks. I've listened to many of them with basic TTS, not even an AI-model TTS like these more recent ones, and I still thoroughly enjoyed them.

  • fennecfoxy 5 hours ago

    It'd be nice if there were mainstream releases on GBC/GBA/PSP again too! But apparently if there's no money in something then people don't really wanna do it.

  • ks2048 2 days ago

    With 1M+ new books every year, that’s not possible for all but the few most popular.

  • cchance 2 days ago

    You really think people writing these papers actually have good speaking voices? LOL, there's a reason not everyone could be an audiobook maker or podcaster: a lot of people's voices suck for audiobooks.

  • senordevnyc 2 days ago

    Honestly, I’d say that’s true only for the author. Anyone else is just going to be interpreting the words to understand how to best convey the character / emotion / situation / etc., just like an AI will have to do. If an AI can do that more effectively than a human, why not?

    The author could be better, because they at least have other info beyond the text to rely on, they can go off-script or add little details, etc.

    • DrSiemer 2 days ago

      As somebody who has listened to hundreds of audiobooks, I can tell you authors are generally not the best choice to voice their own work. They may know every intent, but they are writers, not actors.

      The most skilled readers will make you want to read books _just because they narrated them_. They add a unique quality to the story that you do not get from reading it yourself or from watching a video adaptation.

      Currently I'm in The Age of Madness, read by Steven Pacey. He's fantastic. The late Roy Dotrice is worth a mention as well, for voicing Game of Thrones and claiming the Guinness world record for most distinct voices (224) in one series.

      It will be awesome if we can create readings automatically, but it will be a while before TTS can compete with the best readers out there.

      • azinman2 2 days ago

        I’d suggest that even if the TTS sounded good, I’d still rather have a human because:

        1. It’s a job that seems worthwhile to support, especially as it’s “practice” that only adds to a lifetime of work and improves their central skill set

        2. A voice actor will bring their own flair, just like any actor does to their job

        3. They (should) prepare for the book, understanding what it’s about in its entirety, and bring that context to the reading