chowells 2 months ago

That's the wrong distinction, and bringing it up causes pointless arguments like the ones in the replies.

The right distinction is that assemblers and compilers have semantics and an idea of correctness. If your input doesn't lead to a correct program, you can find the problem: you can examine the input and determine whether it is correct. If the input is wrong, it's theoretically possible to find the problem and fix it without ever running the assembler/compiler.
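A toy illustration of that point in Python: you can check a source fragment for correctness (here, just syntactic correctness via the standard-library `ast` module) without ever executing it.

```python
import ast

# A source fragment with a syntax error (missing closing parenthesis).
source = "print('hello'"

try:
    ast.parse(source)
    print("source parses cleanly")
except SyntaxError as e:
    # The problem is found by examining the input alone -- nothing ran.
    print(f"problem found without running anything: line {e.lineno}")
```

The same idea extends to type checkers and formal verification: correctness is a property of the input you can inspect, not something you can only discover by executing it.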

Can you examine a prompt for an LLM and determine whether it's right or wrong without running it through the model? The idea is ludicrous. Prompts cannot be source code. LLMs are fundamentally different from programs that convert source code into machine code.

This is something like "deterministic" in the colloquial sense, but not at all in the technical sense. And that's where these arguments come from. I think it's better to sidestep them and focus on the important part: compilers and assemblers are intended to be predictable in terms of semantics of code. And when they aren't, it's a compiler bug that needs to be fixed, not an input that you should try rephrasing. LLMs are not intended to be predictable at all.

So focus on predictability, not determinism. It might forestall some of these arguments that get lost in the weeds and miss the point entirely.

traverseda 2 months ago

LLMs are deterministic. So far every vendor injects random noise in addition to your prompt, though. They don't have free will or a soul or anything; feed them exactly the same tokens and exactly the same tokens will come out.
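A minimal sketch of that claim: with greedy (argmax) decoding there is no sampling step, so the output is a pure function of the input tokens. The scoring function here is a made-up arithmetic hash standing in for a real model's logits.

```python
def score(context, token):
    # Hypothetical deterministic "logit": a fixed arithmetic hash,
    # NOT a real model -- just something repeatable.
    text = "".join(context) + token
    return sum(ord(c) for c in text) * 2654435761 % 1000

def next_token(context):
    candidates = ["the", "cat", "sat", "."]
    return max(candidates, key=lambda t: score(context, t))  # argmax: no RNG

def generate(prompt, steps=5):
    out = list(prompt)
    for _ in range(steps):
        out.append(next_token(out))
    return out

print(generate(["hello"]) == generate(["hello"]))  # True: same tokens in, same tokens out
```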

  • mmoskal 2 months ago

    If you change one letter in the prompt, however insignificant you may think it is, it will change the results in unpredictable ways, even with temperature 0, etc. The same is not true of renaming a variable in a programming language, most refactorings, etc.
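To make the contrast concrete: a rename refactoring is semantics-preserving. These two functions are character-for-character different but compute exactly the same thing.

```python
def mean_v1(xs):
    total = 0
    for x in xs:
        total += x
    return total / len(xs)

def mean_v2(values):       # same code, every name changed
    accumulator = 0
    for v in values:
        accumulator += v
    return accumulator / len(values)

data = [1, 2, 3, 4]
print(mean_v1(data) == mean_v2(data))  # True: the rename changed nothing
```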

  • jnwatson 2 months ago

    Only if you set temperature to 0 or have some way to set the random seed.
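A sketch of the seed half of that, using Python's stdlib RNG as a stand-in for a sampler: with a fixed seed, even non-greedy (weighted) sampling is repeatable.

```python
import random

def sample_tokens(seed, weights, n=5):
    # Seeding the RNG makes the whole sequence of draws repeatable.
    rng = random.Random(seed)
    vocab = ["a", "b", "c", "d"]
    return [rng.choices(vocab, weights=weights)[0] for _ in range(n)]

w = [0.4, 0.3, 0.2, 0.1]
print(sample_tokens(42, w) == sample_tokens(42, w))  # True: same seed, same draws
print(sample_tokens(42, w) == sample_tokens(43, w))  # almost certainly False
```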

    • vlovich123 2 months ago

      Locally that’s possible, but for multi-tenant services I think there are other challenges related to batch processing (not necessarily the random seed, but other sources of non-determinism).
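One concrete such source: floating-point addition is not associative, so a server that batches requests differently (and therefore reduces sums in a different order) can produce slightly different logits even with a fixed seed.

```python
# Floating-point addition is not associative: the grouping (i.e. the
# reduction order, which batching can change) affects the result.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)  # False
print(left, right)    # 0.6000000000000001 vs 0.6
```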

  • codr7 2 months ago

    That's not how they are being used though, is it?

pjmlp 2 months ago

Did you miss this part?

> Most likely we will still need some kind of formalisation tools to tame natural language uncertainties, however most certainly they won't be Python/Rust like

  • Wowfunhappy 2 months ago

    No, I didn't miss it. I think the fact that LLMs are non-deterministic means we'll need a lot more than "some kind of formalization tools", we'll need real programming languages for some applications!

    • pjmlp 2 months ago

      How deterministic are C compilers at -O3, while compiling exactly the same code across various kinds of vector instructions, and GPUs?

      We are already taking baby steps down that path:

      https://code.visualstudio.com/docs/copilot/copilot-customiza...

      • spookie 2 months ago

        Take a look at the following: https://reproduce.debian.net/

        Granted, there are lots of different compilers and arguments depending on the package. But you'd need to match this reproducibility in a fancy-pants 7GL.

      • almostgotcaught 2 months ago

        You moved the goalposts and declared victory - that's not what deterministic means. It means same source, same flags, same output. Under that definition, the actual definition, they're 99.9% deterministic (we strive for 100%, but bugs do happen).
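That definition is exactly what a reproducibility check tests. A sketch of the check, with a hypothetical pure function standing in for a real compiler invocation (a real check would hash the compiler's actual output files, as reproduce.debian.net does):

```python
import hashlib

def fake_compile(source: str, flags: tuple) -> bytes:
    # Hypothetical stand-in for a compiler: a deterministic
    # transform of (source, flags) into "output bytes".
    return hashlib.sha256(repr((source, flags)).encode()).digest()

src = "int main(void) { return 0; }"
flags = ("-O3", "-march=x86-64")

# Same source, same flags -> same output: compile twice and compare.
build1 = fake_compile(src, flags)
build2 = fake_compile(src, flags)
print(build1 == build2)  # True: deterministic under this definition
```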