Comment by andai 3 days ago
Very nice. I made a thing in Python which summarizes a YouTube transcript in bullet points. Never thought about asking it questions, that's a great idea!
I just run yt-dlp to fetch the transcript and shove it into the GPT prompt. (I think I also have a few lines to remove the timestamps, although arguably those would be useful to keep.)
My prompt is "{transcript} Please summarize the above in bullet points"
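For anyone curious, the cleanup and prompt steps could look roughly like the sketch below. It assumes the transcript was already saved as WebVTT text (the subtitle format yt-dlp commonly produces for YouTube); the helper names `vtt_to_text` and `build_prompt` are made up for illustration and are not from my actual script.

```python
import re

# WebVTT cue timing line, e.g. "00:00:01.000 --> 00:00:04.000 align:start"
CUE_TIMING = re.compile(
    r"^\d{2}:\d{2}(?::\d{2})?\.\d{3}\s+-->\s+\d{2}:\d{2}(?::\d{2})?\.\d{3}"
)
# Inline markup such as <c>...</c> or word-level timestamps like <00:00:02.000>
INLINE_TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Strip headers, cue timings, and inline tags from WebVTT text (hypothetical helper)."""
    lines = []
    for raw in vtt.splitlines():
        line = INLINE_TAG.sub("", raw).strip()
        if not line or CUE_TIMING.match(line):
            continue
        # Skip the file header and metadata lines often found in caption files.
        if line.startswith(("WEBVTT", "Kind:", "Language:", "NOTE")):
            continue
        # Auto-generated captions tend to repeat the previous line; drop repeats.
        if lines and lines[-1] == line:
            continue
        lines.append(line)
    return " ".join(lines)


def build_prompt(transcript: str) -> str:
    """Fill the prompt template quoted above."""
    return f"{transcript} Please summarize the above in bullet points"
```

If you would rather keep the timestamps, drop the `CUE_TIMING` check and the model can then cite roughly where in the video each point came from.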
The trick was splitting it up into overlapping chunks so it fits in the context size. (And then summarizing your summary because it ends up too long cause you had so many chunks!)
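A minimal sketch of that chunk-then-summarize-the-summaries idea, with loud assumptions: sizes are counted in characters rather than tokens, and `summarize` is a hypothetical callable standing in for whatever sends a prompt to the model and returns its reply.

```python
from typing import Callable, List


def overlapping_chunks(text: str, size: int, overlap: int) -> List[str]:
    """Split text into windows of `size` characters, each sharing `overlap` characters with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def summarize_long(
    text: str,
    summarize: Callable[[str], str],  # hypothetical: wraps the actual model call
    size: int = 8000,
    overlap: int = 400,
) -> str:
    """Summarize each chunk, then summarize the joined summaries until they fit in one chunk."""
    if len(text) <= size:
        return summarize(text)
    partials = [summarize(chunk) for chunk in overlapping_chunks(text, size, overlap)]
    combined = "\n".join(partials)
    if len(combined) >= len(text):
        # The summaries did not shrink the text; stop here instead of looping forever.
        return combined
    return summarize_long(combined, summarize, size, overlap)
```

The overlap is there so a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks. With a real tokenizer you would measure `size` in tokens instead.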
These days that's not so important; usually you can shove an entire book in! (Unless you're using a local model. Those still have small context sizes, but they work pretty well for summarization.)
I also built something similar using yt-dlp and the llm CLI and wrote a post about it: https://shekhargulati.com/2024/07/30/building-a-youtube-vide.... Script here: https://github.com/shekhargulati/llm-tools/blob/main/yt-summ...