Comment by simonw 6 months ago

I use ffmpeg multiple times a week thanks to LLMs. It's my top use-case for my "llm cmd" tool:

  uv tool install llm
  llm install llm-cmd

  llm cmd use ffmpeg to extract audio from myfile.mov and save that as mp3

https://github.com/simonw/llm-cmd
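
For readers who have not seen one, here is a rough sketch of the kind of command such a prompt tends to produce. This is an assumption about typical output, not what llm-cmd actually returned; `audio_extract_cmd` is a hypothetical helper that only prints the command, so nothing runs until you paste it yourself.

```shell
# Hypothetical helper (not part of llm-cmd): print an ffmpeg command that
# extracts the audio track to MP3, without running anything.
audio_extract_cmd() {
  # -vn drops the video stream; libmp3lame is ffmpeg's MP3 encoder;
  # -q:a 2 selects a high-quality variable bitrate.
  printf 'ffmpeg -i %s -vn -c:a libmp3lame -q:a 2 %s\n' "$1" "$2"
}

audio_extract_cmd myfile.mov myfile.mp3
```

The helper does not quote file names, so paths containing spaces would need quoting before the printed command is run.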

resonious 6 months ago

I tried this (though with a different tool called aichat) for extremely simple stuff like just "convert this mov to mp4" and it generated overly complex commands that failed due to missing libraries. When I removed the "crap" from the commands, they worked.

So much like code assistance, they still need a fair amount of babysitting. A good boost for experienced operators but might suck for beginners.


cm2187 6 months ago

Plus you need to know the format of your source file to design the command correctly. How many audio tracks, is the first video track a thumbnail or the video, are the subtitle tracks forced, etc.
And in some situations ffmpeg has some warts you have to work around. Like they recently introduced a moronic change of behaviour where the first subtitle track becomes forced/default irrespective of the original forced/default flag of the source. You need to add "-default_mode infer_no_subs" to counter that.
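
A minimal sketch of those two steps, assuming a Matroska (.mkv) output, since `-default_mode` is an option of ffmpeg's Matroska muxer. The `run` wrapper and the two function names are hypothetical, and `DRY_RUN=1` makes the script print the commands rather than execute them.

```shell
# Hypothetical wrapper: with DRY_RUN=1, print the command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# List every stream (codec, type, disposition flags) as JSON.
inspect_streams() {
  run ffprobe -v error -show_streams -of json "$1"
}

# Copy all streams unchanged; the -default_mode value is the one quoted in
# the comment above and applies to Matroska output.
remux_keep_sub_flags() {
  run ffmpeg -i "$1" -map 0 -c copy -default_mode infer_no_subs "$2"
}

DRY_RUN=1
inspect_streams input.mkv
remux_keep_sub_flags input.mkv output.mkv
```

Checking the `ffprobe` output first shows how many audio and subtitle tracks exist and which ones carry the default or forced flags before any command is designed around them.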

- Philpax 6 months ago
  
  I usually just paste the output of `ffprobe` into Claude when it's ambiguous. Works a treat.
  
Over2Chars 6 months ago

My feelings exactly, but I think that's OK!
It's another tool and one that might actually improve with time. I don't see GNU's man pages getting any better spontaneously.
Whoa, what if they started to use AI to auto-generate man pages...

- BlaDeKke 6 months ago
  
  > Whoa, what if they started to use AI to auto-generate man pages...
  That’s the time to start my career in woodworking.
  
  
  johnisgood 6 months ago
  
  I already generate man pages (and POD) with Claude for my new projects. :D
  It works really well.
  
- michaelcampbell 6 months ago
  
  > what if they started to use AI to auto-generate man pages...
  Then they'd be wrong about 20% of the time, and still no one would read them. ;-)
  (NB: I'm of the age that I do read them, but I think I'm in the minority.)
  
BiteCode_dev 6 months ago

Reading this feels like seeing a guy getting his first car in 1920 and complaining he still has to drive it himself.

- imiric 6 months ago
  
  To me it's more like a guy getting his first car and complaining that the car is driving him in a direction that may or may not be correct, despite his best efforts to steer it where he wants to go. And the only way to know whether he ends up in the right place is to get out of the car, look around, and maybe ask more experienced drivers. Failing that, his only option is to get back in and hope to be luckier in the next trip.
  Or he can just ditch the car and walk. Sure, it's slower and requires more effort, but he knows exactly how to do that and where it will take him.
  
- LeoWattenberg 6 months ago
  
  The beer brewers in my home town used to have a self-driving horse and cart which knew the daily delivery route going by all pubs and didn't really need a human to steer it or indeed be conscious during the trip. Expectedly, the delivery guy would get drunk first thing in the morning and just get carted about collecting the money.
  
- pbhjpbhj 6 months ago
  
  Pony & trap could be largely self-driving, after an initial training period. That would have been a distinct negative to "upgrading" for some, I'd imagine.
  
  
  nine_k 6 months ago
  
  It's speed and load capacity vs self-driving.
  If we could imagine wiring a pony to control a car, its brain, while good at navigation, would likely be inadequate at the speed that a car attains.
  
- atoav 6 months ago
  
  Well, that guy probably got carried home by his horse after drinking half a bottle of whiskey, so maybe he had a point.
  
- shriek 6 months ago
  
  Or maybe calling a cab and telling the cab driver each direction to get to the destination instead of the cab driver just taking you there.
  
assimpleaspossi 6 months ago

My experience exactly.
I no longer check with these AI tools after a number of attempts. Unrelated, a friend thought there was an NFL football game last Saturday at noon. Checking with Google's Gemini, it said "no", but there was one between two teams whose season had ended two weeks before at 1:00 Eastern Time and 2:00 Central. (The times are backwards.)

- bityard 6 months ago
  
  Do LLMs have knowledge of current events?
  
  
  hulitu 6 months ago
  
  > Do LLMs have knowledge of current events?
  I don't think the notion of "current" has been explained to them. They just define it out of context.
  
  
  assimpleaspossi 6 months ago
  
  Meta.ai got it right. The free ChatGPT only has data up till 2021 or something like that.
  
  
  johnisgood 6 months ago
  
  I mean, some are capable of searching the web.
  Ask them about the fire in LA in January 2025.
  
sdesol 6 months ago

> "convert this mov to mp4"
Did any of the commands look like the ones in the left window:
https://beta.gitsense.com/?chats=12850fe4-ffb1-4618-9215-c13...
The left window contains a summary of all the LLMs asked, including all commands. The right window contains the individual LLM responses.
I asked about gotchas with missing libraries as well, and Sonnet 3.5 said there were. Were these the same libraries that were missing for you?

- resonious 6 months ago
  
  Looking at this, I am pretty sure I also received a "libx264" clause. Removing it made the command work for me.
  
  
  jack_pp 6 months ago
  
  libx264 is the best encoder for h264 ffmpeg has to offer so it's pretty important you bundle it in your ffmpeg install. Those commands are perfectly standard, I've been using something like that for 10+ years
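
  A quick way to tell whether a given ffmpeg build includes libx264 is to search its encoder list. The sketch below is only an illustration: `has_encoder` is a hypothetical helper that reads the output of `ffmpeg -encoders` on stdin, and the live check is skipped when ffmpeg is not installed.

```shell
# Hypothetical helper: reads `ffmpeg -encoders` output on stdin and succeeds
# if the named encoder appears as a whole word.
has_encoder() {
  grep -qw -- "$1"
}

# Only query ffmpeg if it is actually on the PATH.
if command -v ffmpeg >/dev/null 2>&1; then
  if ffmpeg -hide_banner -encoders 2>/dev/null | has_encoder libx264; then
    echo "libx264 is available"
  else
    echo "libx264 is missing: drop '-c:v libx264' or install a fuller ffmpeg build"
  fi
fi
```

  When the encoder is missing, removing the `libx264` clause lets ffmpeg fall back to whatever default encoder the build has for the output format, which matches what resonious saw.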
  
  
  sdesol 6 months ago
  
  I don't disagree that we need to be cautious with LLMs, but I've personally stopped asking GPT-4/GPT-4 mini for technical answers. Sonnet 3.5 and DeepSeek V3 (which is much cheaper but still not as good as Sonnet) are your best bet for technical questions.
  Where I find GPT to perform better than Sonnet is with text processing. GPT seems to better understand what I want when it comes to processing documents.
  I'm convinced that no LLM provider has created or will create a moat, and that we will always need to shop around for an answer.
  
keeganpoppen 6 months ago

what exactly do you want the llm to do here? if the ask was so unambiguous and simple that it could be reliably generated, then the interface wouldn't be so complicated to use in the first place! LLMs are not in any way best suited for one-shot prompt => perfect output, and expectations to that effect are extremely unreasonable. the reason why LLMs are still hard for beginners to use is because the software is hard to use correctly. as with LLM output goes life itself: the results you get from using a tool can only ever be as good as the (mental) model used to choose that tool & the inputs to begin with. if all the information required to generate the output were contained by the initial prompt, then there would be absolutely no need to use the LLM at all in the first place.

Philpax 6 months ago

Hate to be that guy, but which LLM was doing the generation? GPT-4 Turbo / Claude 3.x have not really let me down in generating ffmpeg commands - especially for basic requests - with most of their failures resulting from domain-specific vagaries that an expert would need to weigh in on.

- resonious 6 months ago
  
  GPT-4
  
  
  Philpax 6 months ago
  
  Fair enough. If you remember what you were testing with, I'd love to try it again to see if things are better now.
  
- th0ma5 6 months ago
  
  Hate to be that guy, but which model works without fail for any task that ffmpeg can do?
  
  
  iameli 6 months ago
  
  "Writing working commands first try for every single ffmpeg feature that exists" is the highest bar I've ever heard of, I love it. I'm gonna start listing it as a requirement on job postings. Like an ffmpeg speedrun.
  
  
  Philpax 6 months ago
  
  I don't think there's a single human on or outside of this planet that can meet that requirement, but Claude has been pretty good to me. It's certainly a much better starting point than poring over docs and SO posts.
  
  
  th0ma5 6 months ago
  
  In my experience you still get a lot of stuff that used to work or stuff that it just makes up.
  
  
  AuryGlenz 6 months ago
  
  I know I struggled on getting a good command to “simply” make the videos from my Z8 smaller (in file size).
  Usually the color was wrong, and I don’t care enough to learn about colorspaces to figure out how to fix it. It’s utterly insane how difficult it is, even with LLMs.
  Just reencode it as is but a little more lossy. Is that so hard?
  
  
  latexr 6 months ago
  
  Handbrake may be a better option for you. I find that for some tasks it’s not only simpler but straight up works better than FFmpeg.
  https://handbrake.fr/docs/en/latest/cli/cli-options.html
  
  
  bloqs 6 months ago
  
  This doesn't exist in reality, so in one sense you could challenge the relevance.
  
  
  th0ma5 6 months ago
  
  I think in the non LLM world though you at least have the trail of documentation you can unwind once you're in a bind. I don't care for prompt-a-mole fighting.
  

pmarreck 6 months ago

A while back I simply wrote my own bash function for this called `please`

as in

    bash> please "use ffmpeg to extract audio from myfile.mov and save it as mp3"

It will then courteously show you the command it wants to run before you agree to do it.

Here is the whole thing, with its two dependent functions, so that people stop writing their own versions of this lol. All it needs is an OPENAI_API_KEY, feel free to modify for other LLMs

EDIT: Moved to a gist: https://gist.github.com/pmarreck/9ce17f7996347dd532f3e20a2a3...

Suggestions welcome. For example, I want to add a feature that either just copies it (for further modification) or prepopulates the command line with it somehow (possibly for further modification, or even for skipping the approval step).
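
The show-then-confirm step described above can be sketched roughly as follows. This is not the code from the gist: `confirm_and_run` is a hypothetical helper, it makes no LLM call, and it takes the command as separate arguments rather than as one generated string, so no `eval` is needed.

```shell
# Hypothetical helper: show a command, then run it only on an explicit "y".
confirm_and_run() {
  printf 'About to run: %s\n' "$*" >&2
  printf 'Proceed? [y/N] ' >&2
  # Treat EOF or an unreadable answer as "no".
  read -r answer || answer=""
  case "$answer" in
    y|Y) "$@" ;;
    *) echo "Cancelled." >&2; return 1 ;;
  esac
}

# Non-interactive demo: the answer is piped in instead of typed.
printf 'y\n' | confirm_and_run echo "would run ffmpeg here"
```

A tool that receives the command as a single string from a model would have to hand it to a shell (for example `sh -c`), which is exactly why displaying it first matters.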


smusamashah 6 months ago

please is such an appropriate name. Will rename my ChatGPT alias to please.


atoav 6 months ago

Did you just invent the LLM-equivalent of curl-piping unread shell scripts into sh?

I am sure that will never cause any problems.


bspammer 6 months ago

It displays the generated command to you, there's an additional step to confirm.

- atoav 6 months ago
  
  Ah good to clarify, thanks
  
ykonstant 6 months ago

> Did you just invent the LLM-equivalent of curl-piping unread shell scripts into sh?
Many such cases.


dekhn 6 months ago

"The future is already here. It's just not very well distributed"

(honestly, the work you share is very inspiring)


zahlman 6 months ago

>This will then be displayed in your terminal ready for you to edit it, or hit <enter> to execute the prompt. If the command doesn't look right, hit Ctrl+C to cancel.

I appreciate the UI choice here. I have yet to do anything with AI (consciously and deliberately, anyway) but this sort of thing is exactly what I imagine as a proper use case.


hnuser123456 6 months ago

Just like all other code. There will be user-respecting open source code and tools, and there's user-disrespecting profitable closed code that makes too many decisions for you.


mvonballmo 6 months ago

Hypertalk <https://en.wikipedia.org/wiki/HyperTalk> lives.


Beijinger 6 months ago

uv?


phrotoma 6 months ago

like pip but written in rust


th0ma5 6 months ago

You should figure out what went wrong for the other commenter and fix your tool.


mrweasel 6 months ago

While I love that that works, I still feel like just maybe ffmpeg needs a better interface. Not necessarily a GUI, just a better designed command line.


Waterluvian 6 months ago

I think I’m finally sold on actually attempting to add some LLM to my toolbelt.

As a helper and not a replacement, this sounds grand. Like the most epic autocomplete. Because I hate how much time I waste trying to figure out the command line incantation when I already know precisely what I want to do. It’s the weakest part of the command line experience.


Over2Chars 6 months ago

But possibly the most rewarding. The struggle is its own reward that pays off later many times over.

- hbn 6 months ago
  
  There are times I feel minor guilt for using an LLM to relieve brainwork, like figuring out an algorithm. That's probably a skill I should continue practicing for my own sake.
  ffmpeg commands though? It's really not a practical skill outside of using ffmpeg. There's nothing really rewarding to me about memorizing awkwardly designed CLI incantations. It's all arbitrary.
  
  
  Over2Chars 6 months ago
  
  memorization is not what I'm talking about.
  I'm talking about
  1. you have a problem, you try something and it doesn't work.
  2. you find an LLM and it "gives" you the answer with one or two tries
  3. problem solved! what have you learned? How to have answers given to you when you ask.
  or
  2. you look for an answer in a dizzying haze of man pages, quacks, and website Q&As.
  3. you try and try again and eventually problem solved. you have learned not only how to solve a particular problem, but your overall ability to solve similar problems has gone up. You've learned how to fish, not just ask an LLM for a fish.
  
  
  chriscappuccio 6 months ago
  
  after well more than three decades of ice fishing, i am absolutely delighted to ask an LLM for an ffmpeg fish
  and yet, i too know that it is perhaps not quite as good for me as finding the fish on my own lake
  at least i can keep these fish in my own warm pond, and go back to them whenever i like
  after decades of ice fishing, i think the LLM fish are quite good... and even when they are no good, that fishing experience makes it so easy to go back to the LLM and get the exact right fish for my pond
  it will continue to be helpful for everyone in the future if we keep publishing the contents of our ponds, whether it's a web site like this, or in a repository
  
- Waterluvian 6 months ago
  
  Not for me. It’s a tool I don’t care to use any more than I have to. I’m much more interested in what I’m using the tool to accomplish.
  
  
  Over2Chars 6 months ago
  
  I am not talking about the tool per se. I am talking about the skill of persistence and creativity in the face of a problem.
  Learning a tool is useful, even invaluable, but if you don't have the persistence to use it, it's useless.
  And many tools are just partially useful under some conditions. So creativity in using them is also useful.
  So it's not about the tools, it's about not giving up and trying different things, which makes all tools more effective, and problem-solving more likely.
  