Comment by sarchertech 2 days ago
The average mechanic won’t do something completely different to your car because you added some extra filler words to your request though.
The average user may not care exactly what the mechanic does to fix your car, but they do expect things to be repeatable. If car repair LLMs function anything like coding LLMs, one request could result in an oil change, while a similar request could end up with an engine replacement.
I think we're making similar points, but I phrased mine weirdly. I agree that current LLMs are sensitive to phrasing and are highly unpredictable, and therefore aren't useful as the basis for AI-driven backends. The point I'm making is that these issues are potentially solvable with better AI and don't philosophically invalidate the idea of a non-programmatic backend.
One could imagine a hypothetical AI model that does a pretty good job of understanding vague requests, properly refuses irrelevant ones (if you ask a mechanic to bake you a cake, he'll likely tell you to go away), and behaves more or less consistently.

It is acceptable for an AI-based backend to have a non-zero failure rate. If a mechanic were distracted, or misheard you, or was just feeling spiteful, it's not inconceivable that he would replace your engine instead of changing your oil. The critical point is that this happens very, very rarely: 99.99% of the time he changes your oil correctly. Current LLMs fail far too often to be useful, but having a failure rate at all is not a non-starter.
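For concreteness, here's a minimal sketch (Python, with entirely hypothetical names and a stubbed-out model call) of the guardrails such a backend would need: out-of-scope requests get refused rather than guessed at, low-confidence readings are treated as failures, and expensive actions require explicit approval, so a misheard oil change can't silently become an engine swap.

```python
from dataclasses import dataclass

# Actions the shop actually offers; anything else is refused outright.
ALLOWED_ACTIONS = {"oil_change", "tire_rotation", "brake_inspection"}

# Expensive actions a model should never trigger on its own, no matter
# how it read the request.
CONFIRMATION_REQUIRED = {"engine_replacement"}

@dataclass
class Proposal:
    action: str
    confidence: float

def call_model(request: str) -> Proposal:
    """Stand-in for a real LLM call; returns the model's proposed action."""
    # Hypothetical: a real implementation would prompt a model here.
    return Proposal(action="oil_change", confidence=0.97)

def handle(request: str) -> str:
    proposal = call_model(request)
    # Refuse irrelevant requests ("bake me a cake") instead of guessing.
    if proposal.action not in ALLOWED_ACTIONS | CONFIRMATION_REQUIRED:
        return "refused: not a service we offer"
    # A low-confidence reading is treated as a failure, not acted on.
    if proposal.confidence < 0.9:
        return "refused: please rephrase the request"
    # An oil-change request can never silently become an engine swap.
    if proposal.action in CONFIRMATION_REQUIRED:
        return "pending: requires explicit customer approval"
    return f"scheduled: {proposal.action}"

print(handle("can you change my oil sometime this week?"))
```

The specific checks are stand-ins; the design point is that the failure rate lives in the model, while the worst case a failure can produce is bounded by ordinary code.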