Comment by ianbicking
Comment by ianbicking a day ago
What's been your experience with the Realtime API? I've been doing LLM with voice, but haven't really given it a try – the price is so high, and it feels like it's much harder to control. Specifically that you just get one system prompt and then the model takes over entirely. (Though looking at the API, I see you can inject text and do some other things to play around with the session.)
I agree, it's still pricy. The cost works out better with `gpt-4o-mini-realtime-preview-2024-12-17`.
Yep its constrained to the system prompt but I pass in conversation history with each new session to keep it relevant. It also supports tool calling which is clutch.
Have you tried Hume AI? They've got a neat suite of APIs that give you more control on each session.