Comment by almostgotcaught 18 hours ago

5 replies

You moved the goal posts and declared victory - that's not what deterministic means. It means same source, same flags, same output. Under that definition, the actual definition, they're 99.9% deterministic (we strive for 100% but bugs do happen).

pjmlp 18 hours ago

Nope, the goal stayed in the same position: people argue for deterministic results while using tools that by definition aren't deterministic unless a big chunk of work is done to ensure that they are.

"It means same source, same flags, same output": it suffices to change the CPU, and the assembly's behaviour might not be the same.

  • sitkack 15 hours ago

    You keep being you, but you also have to admit that not only do you move goal posts, most of your arguments are on dollies, performing elaborate choreographies that would make Merce Cunningham blush.

    • fulafel 14 hours ago

      pjmlp did originally say "compiling exactly the same code across various kinds of vector instructions, and GPUs".

  • ModernMech 14 hours ago

    You have a point, but in making it I think you're undermining your argument.

    Yes, it's true that computer systems are nondeterministic if you deconstruct them enough. Because writing code for a nondeterministic machine is fraught with peril, as an industry we've gone to great lengths to move this nondeterminism as far away from programmers as possible, so they can at least pretend their code is executing in a deterministic manner.

    Formal languages are a big part of this, because even though different machines may execute the program differently, at least you and I can agree on the meaning of the program in the context of the language semantics. Then we can at least agree there's a bug and try to fix it.

    But LLMs bring nondeterminism right to the programmer's face. They make writing programs so difficult that people are inventing new formalisms, "prompt engineering", to deal with them. These are kind of like a mix between a protocol and a contract that's not even enforced. People are writing full-on specs to shape the output of LLMs, taking something that's nondeterministic and turning it into something more akin to a function, which is deterministic and therefore useful. (As an aside, this also harks back to language design, where languages have recently been moving toward immutable variables and idempotent functions -- two features that, combined, help deal with nondeterministic output in programs, thereby making them easier to debug.)

    I think what's going to happen is the following:

    - A lot of people will try to reduce nondeterminism in LLMs through natural language constrained by formalisms (prompt engineering)

    - Those formalisms will prove insufficient and people will move to LLMs constrained with formal languages that work with LLMs. Something like SQL queries that can talk to a database.

    - Those formal languages will work nicely enough to do simple things like collecting data and building views on it, but they will prove insufficient to build systems with. That's when programming languages and LLMs come back together, full circle.

    Ultimately, my feeling is that the idea we can program without programming languages misunderstands what programming languages are: they are not for communicating with a computer, they are for communicating ideas in an unambiguous way, whether to a computer, a human, or an LLM. This is important whether or not a machine exists to execute those programs. After all, programming languages are languages.

    And so LLMs cannot and will not replace programming languages, because even if no computers are executing them, programs still need to be written in a programming language. How else are we to communicate what the program does? We can't use English and we know why. And we can't describe the program to the LLM in English for the same reason. The way to describe the program to the LLM is a programming language, so we're stuck building and using them.

  • almostgotcaught 17 hours ago

    Do you like have any idea what you're talking about? Or are you just making it up for internet points? The target is part of the input.

    Lemme ELI5

    https://github.com/llvm/llvm-project/tree/main/llvm/test/Cod...

    You see how this folder has folders for each target? Then within each target folder there are tests (thousands of tests)? Each of those tests is verified deterministically on each commit.

    Edit: there's an even more practical way to understand how you're wrong: if what you were saying were true, ccache wouldn't work.
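
    To make the ccache point concrete, here is a minimal sketch of a compiler-cache key. It is not ccache's actual hashing scheme (the real tool hashes more, such as the compiler's identity); the `cache_key` helper and the example target triples are assumptions chosen only for illustration. The idea it shows: if the output is a function of (source, flags, target), then a hash of those inputs is a safe cache key, and changing the target simply produces a different key.

```python
import hashlib


def cache_key(source: str, flags: list[str], target: str) -> str:
    """Hypothetical helper: hash every input that determines the output."""
    h = hashlib.sha256()
    for part in (source, "\0".join(flags), target):
        data = part.encode("utf-8")
        # Length-prefix each field so adjacent fields cannot run together.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()


source = "int add(int a, int b) { return a + b; }"
flags = ["-O2", "-c"]

key_x86 = cache_key(source, flags, "x86_64-unknown-linux-gnu")
key_x86_again = cache_key(source, flags, "x86_64-unknown-linux-gnu")
key_arm = cache_key(source, flags, "aarch64-unknown-linux-gnu")

print(key_x86 == key_x86_again)  # True: same source, flags, target
print(key_x86 == key_arm)        # False: the target is part of the input
```

    A cache built on such a key is only correct if the compiler really does produce the same output for the same inputs, which is why a working compiler cache is evidence for determinism in this sense.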