Comment by andai 10 months ago

6 replies

Interesting, I haven't used Whisper. Is it cost-effective? It seems to come to about 36 cents per hour-long video. How long does processing take?

kajecounterhack 10 months ago

You can run it locally, and it's really fast. But since YouTube transcription is really good, I don't see why you'd use Whisper and get a worse transcription (unless maybe it's on videos that Google did not transcribe for whatever reason).

  • gs17 10 months ago

    > But since YouTube transcription is really good

    Are you sure you're looking at automatic transcripts? YouTube transcripts are bizarrely low quality if they're not provided by the creators (I've actually used my Google Pixel's live transcription to make better captions occasionally).

    I just checked a video my girlfriend uploaded a week ago and the auto-transcript was still pretty messy. I've used Whisper for the same task and it's significantly better.

    • jokethrowaway 10 months ago

      That's crazy. Months ago I compared Whisper v2 transcripts with the YouTube transcripts generated for my video and found them to be identical, down to the timestamps.

      I know people who upload a video to YouTube as unlisted just to get transcript generation for free, and then delete the video.

    • ofou 10 months ago

      Agreed. However, you can get great YT transcriptions using GPT-4o mini to clean them up.

HPsquared 10 months ago

36 cents an hour is how much it costs to hire an entire GPU like an A4000. I can assure you Whisper runs much, much faster than 1x!
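
To make the arithmetic concrete, here is a rough sketch of how the cost per hour of audio scales with transcription speed. The 36-cent GPU price is the figure quoted above; the speed factors in the loop are hypothetical placeholders, not benchmarks, and `cost_per_audio_hour` is just an illustrative helper name.

```python
def cost_per_audio_hour(gpu_price_per_hour: float, speed_factor: float) -> float:
    """Return the dollar cost of transcribing one hour of audio.

    speed_factor is hours of audio processed per hour of GPU time
    (1.0 means real time, 10.0 means ten times faster than real time).
    """
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return gpu_price_per_hour / speed_factor


GPU_PRICE = 0.36  # USD per GPU-hour, the figure quoted in the comment

# Hypothetical speed factors chosen for illustration, not measurements.
for speed in (1, 10, 30):
    cost = cost_per_audio_hour(GPU_PRICE, speed)
    print(f"{speed:>2}x real time: ${cost:.3f} per hour of audio")
```

So 36 cents per hour-long video only holds if transcription runs at exactly real time; anything faster divides the cost by the speed factor.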

  • jokethrowaway 10 months ago

    A few derivative projects are faster than 1x, insanely-fast-whisper being the fastest I've tried.

    Whisper v3 large on release day ran at around 1x on a 4090.