Comment by larodi 10 months ago

I'm waiting for the people of AI to discover syllogism and inference in their original Prolog sense, which this CoT abomination basically tries to achieve. Interestingly, if all logical content were translated to rules, and only those rules were fed into the LLM training set, what would the result be? Could the probabilistic magic be made to actually follow reason, without all the dice?

trescenzi 10 months ago

Right, we’ve now reached the stage of this AI cycle where we start using the new tool to solve problems old tools could already solve. Saying a transformer can solve any formally decidable problem if given enough tape isn’t saying much. It’s a cool proof, I don’t mean to deny that, but it doesn’t mean much practically, since we already have more efficient tools that can do the same.

  • marcosdumay 10 months ago

What I don't get is... didn't people prove that in the 90s for any multi-layer neural network? And didn't the transformer paper prove that transformers are equivalent?

    • Nevermark 10 months ago

      Yes, they did. A two-layer network with enough units in the hidden layer can form any mapping to any desired accuracy.

      And a two-layer network with single-delay feedback from the hidden units to themselves can capture any dynamic behavior (again, to any desired accuracy).

      Adding layers and more structured architectures creates opportunities for more efficient training and inference, but doesn't enable any new potential behavior. (Except in the sense that reducing resource requirements can make impractical problems practical.)
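
      A minimal sketch of that mapping claim, assuming nothing beyond numpy: one hidden tanh layer trained by plain gradient descent to approximate sin(x). The width, learning rate, and step count are arbitrary illustration choices; the fit tightens as the hidden layer grows.

      ```python
      import numpy as np

      # Toy demo of the universal-approximation claim: a two-layer network
      # (one hidden tanh layer, one linear readout) fit to sin(x) on [-pi, pi].
      rng = np.random.default_rng(0)
      X = np.linspace(-np.pi, np.pi, 256).reshape(-1, 1)
      y = np.sin(X)

      H = 64                                 # "enough units" in the hidden layer
      W1 = rng.normal(0, 1.0, (1, H)); b1 = np.zeros(H)
      W2 = rng.normal(0, 0.1, (H, 1)); b2 = np.zeros(1)

      lr = 0.05
      for step in range(20000):
          h = np.tanh(X @ W1 + b1)           # hidden layer
          pred = h @ W2 + b2                 # linear readout
          err = pred - y

          # Plain backprop through both layers.
          gW2 = h.T @ err / len(X);  gb2 = err.mean(0)
          dh = (err @ W2.T) * (1 - h ** 2)   # derivative of tanh
          gW1 = X.T @ dh / len(X);   gb1 = dh.mean(0)

          W2 -= lr * gW2; b2 -= lr * gb2
          W1 -= lr * gW1; b1 -= lr * gb1

      print("max |error|:", float(np.abs(pred - y).max()))  # shrinks as H grows
      ```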

      • larodi 10 months ago

        Putting a 50-bucks bet that some very smart kid in the near future will come up with some entropy-meets-graphical-structures theorem that estimates how the loss of information is affected by the size and type of the underlying structure holding that information.

        It took a while for people to actually start talking about LZW as a grammar algorithm rather than a "dictionary"-based one, an idea later treated in a more general sense by https://en.wikipedia.org/wiki/Sequitur_algorithm.
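
        To make the grammar view concrete, here is a toy sketch: repeatedly replace the most frequent adjacent pair with a fresh nonterminal. This is closer to Re-Pair than to full Sequitur (which does the same bookkeeping online with digram-uniqueness checks), but the idea of compressing a string into a grammar is the same.

        ```python
        from collections import Counter

        def grammar_compress(seq):
            """Toy grammar-based compression: repeatedly replace the most
            frequent adjacent pair with a fresh nonterminal rule."""
            seq, rules, next_id = list(seq), {}, 0
            while True:
                pairs = Counter(zip(seq, seq[1:]))
                if not pairs:
                    return seq, rules
                pair, count = pairs.most_common(1)[0]
                if count < 2:
                    return seq, rules          # nothing repeats; grammar is done
                nt = f"R{next_id}"; next_id += 1
                rules[nt] = pair
                out, i = [], 0
                while i < len(seq):            # rewrite, left to right
                    if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                        out.append(nt); i += 2
                    else:
                        out.append(seq[i]); i += 1
                seq = out

        print(grammar_compress("abcabcabcabc"))
        # (['R2', 'R2'], {'R0': ('a', 'b'), 'R1': ('R0', 'c'), 'R2': ('R1', 'R1')})
        ```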

        This is not to say that LLMs are not cool; we put them to use every day. But the reasoning part is never going to be trustworthy without a 100% discrete system, one that can infer the syllogistic chain with zero doubt and 100% traceable origin.

sunir 10 months ago

I was thinking about the GraphRAG paper and Prolog. I’d like to extract predicates. The source material will be inconsistent, contradictory, and incomplete.

Using the clustering (community) model, an LLM can summarize the opinions as a set of predicates, which don’t have to agree, along with some general weight of how much people agree or disagree with each.

The predicates won’t be suitable for symbolic logic because the language will be loose. However, an embedding model may be able to connect the different symbols together, as in the sketch below.
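
A minimal sketch of that linking step. The character-trigram `embed` is a toy stand-in for a real embedding model, and the 0.8 threshold is made up for illustration; both are assumptions, not recommendations.

```python
import numpy as np

def embed(text):
    # Toy stand-in for a real sentence-embedding model: character
    # trigrams hashed into a fixed-size vector. Swap in any real model.
    v = np.zeros(256)
    for i in range(len(text) - 2):
        v[hash(text[i:i + 3]) % 256] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

def link_symbols(symbols, threshold=0.8):
    """Map each loosely-worded predicate symbol to a canonical representative."""
    vecs = {s: embed(s) for s in symbols}
    canon = {}
    for s in symbols:
        for rep in set(canon.values()):
            if float(vecs[s] @ vecs[rep]) >= threshold:
                canon[s] = rep            # alias s to an earlier symbol
                break
        else:
            canon[s] = s                  # s starts its own cluster
    return canon

# Ideally the first two end up aliased to one canonical predicate:
print(link_symbols(["employed_by", "employed_by_org", "lives_in"]))
```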

Then you could attempt multiple runs through the database of predicates because there will be different opinions.

Then one could attempt to reason using these loosely stitched predicates. I don’t know how good the outcome would be.

I imagine this would work better in an interactive decision-making tool where a human evaluates the suggestions for the next step.

This could be better for planning than problem solving.

  • larodi 10 months ago

    Hm... a RAG over a DB of logical rules actually may be interesting. But loosely stitched predicates you can easily put to work with some random dice when deciding the inference.

    Chris Coyne of OkCupid and Keybase (https://chriscoyne.com/) produced Context Free (https://www.contextfreeart.org/) before all that. It is grammar-based inference with a probabilistic choice of the next rule to apply. Very, very inspiring, and not only because of the aesthetic side of the results. Digging further, you find ProbLog, which allows probabilities on rules (https://dtai.cs.kuleuven.be/problog/tutorial/basic/08_rule_p...)
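
    A toy version of that idea, with a made-up grammar: each nonterminal expands via a weighted dice roll over its rules, which is the whole "probabilistic chance for the next rule" trick in miniature.

    ```python
    import random

    # Each nonterminal maps to a list of (weight, body) rules.
    # Grammar and weights are invented purely for illustration.
    GRAMMAR = {
        "SCENE": [(0.7, ["SHAPE", "SCENE"]), (0.3, ["SHAPE"])],
        "SHAPE": [(0.5, ["circle"]), (0.3, ["square"]), (0.2, ["SCENE", "rotated"])],
    }

    def derive(symbol, depth=0, max_depth=8):
        if symbol not in GRAMMAR or depth > max_depth:
            return [symbol]                    # terminal (or depth cut-off)
        weights, bodies = zip(*GRAMMAR[symbol])
        body = random.choices(bodies, weights=weights)[0]   # the dice roll
        out = []
        for s in body:
            out.extend(derive(s, depth + 1, max_depth))
        return out

    random.seed(1)
    print(" ".join(derive("SCENE")))   # a different derivation every seed
    ```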

    So how about we start thinking of AI as a combination of the graphical-probabilistic whatever, which compresses the information from the training set in a very lossy manner, hooked up, internally or externally, to a discrete logical core whenever CoT is needed. This construct could then benefit from both worlds.
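
    For the discrete half, a minimal sketch of a forward-chaining core. The Socrates rules are placeholders; in the proposed construct, the candidate facts and rules would come from the probabilistic side, and every conclusion keeps a traceable origin.

    ```python
    # Horn-style rules: (body set, head). All placeholders for illustration.
    RULES = [
        ({"man(socrates)"}, "mortal(socrates)"),
        ({"mortal(socrates)"}, "has_end(socrates)"),
    ]

    def forward_chain(facts, rules):
        """Derive everything derivable, recording why each fact was derived."""
        facts = set(facts)
        trace = {}
        changed = True
        while changed:
            changed = False
            for body, head in rules:
                if body <= facts and head not in facts:
                    facts.add(head)
                    trace[head] = body     # zero-doubt provenance for this step
                    changed = True
        return facts, trace

    facts, trace = forward_chain({"man(socrates)"}, RULES)
    print(facts)
    print(trace)   # each derived fact points back to the premises that fired
    ```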

pkoird 10 months ago

I've said this before and I'll say it again: Any sufficiently advanced LLM is indistinguishable from Prolog.

detourdog 10 months ago

I’m surprised that understanding how a thought unfolds is being considered not relevant to the answer. I have done a lot of problem solving, in groups and alone, and how thoughts develop seems fundamental to understanding the solutions.

The story about banning terms that can be used with a reasoning system is a big red flag to me.

This sort of knee-jerk reaction displays immature management and an immature technology product.

  • larodi 10 months ago

    A little late to reply, but perhaps you'll see this. Does it not make an impression on you that many of these published AI articles are very childish? Not in the math sense, but in the reasoning sense. Besides, most of them are anything but interdisciplinary. I've almost never encountered prompt engineers who actually tried to delve into what GPTs do, and these CoT guys don't know a thing about predicate logic, yet try to invent it anew.

    On your comment regarding banning tokens/terms, we are on the same page. We can agree all of this is very immature, and so are many of the people, including this lot of Chinese kids who seem to put out one paper per hour. You see, the original seq2seq paper is 8 pages, topic included. Can you imagine? But Sutskever was not a child back then; he was already deep into all this. We can easily assume the LLM business is in its infancy, and it may easily stay there for a century until everyone levels up.