Comment by jnmandal
Looks like a really cool project. Do you have any opinions on which transcription models are the best, from a quality perspective? I have heard a lot of mixed opinions on this. Curious what you've found in your development process?
I'm a huge fan of using Whisper hosted on Groq since the transcription is near instantaneous. ElevenLabs' Scribe model is also particularly great with accuracy, and I use it for high-quality transcriptions or manually upload files to their API to get diarization and timestamps (https://elevenlabs.io/app/speech-to-text). That being said, I'm not the biggest expert on models. In my day-to-day workflow, I usually swap between Whisper C++ for local transcription or Groq if I want the best balance of speed/performance, unless I'm working on something particularly sensitive.