COAGULOPATH 4 days ago

Came here hoping to find this.

You will not unlock "o1-like" reasoning just by telling a model to think step by step. This is an old trick that people were already using on GPT-3 in 2020. If it were that simple, it wouldn't have taken OpenAI so long to release o1.
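
For reference, the "trick" amounts to nothing more than a system prompt like the one below; a rough sketch using the OpenAI Python client, where the model name and the wording are just placeholders (not the prompt under discussion):

    # The pre-o1 "think step by step" trick: just an instruction in the prompt.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Think step by step and write out your reasoning "
                        "before giving a final answer."},
            {"role": "user",
             "content": "A bat and a ball cost $1.10 together. The bat costs "
                        "$1.00 more than the ball. What does the ball cost?"},
        ],
    )
    print(response.choices[0].message.content)

You get longer, more structured answers out of this, but that is not the same thing as whatever o1 was trained to do.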

Additionally, some of the prompt seems counterproductive:

>Be aware of your limitations as an llm and what you can and cannot do.

The LLM doesn't have a good idea of its limitations (any more than humans do). I expect this will create false refusals, as the model becomes overcautious.

anshumankmr 4 days ago

>The LLM doesn't have a good idea of its limitations (any more than humans do). I expect this will create false refusals, as the model becomes overcautious.

Can it not be trained to do so? From my anecdotal observations, the knowledge cutoff is one limitation that LLMs are trained to know about and handle well. Why can a model not also be trained to know that it is frequently bad at math, that it sometimes produces inaccurate code, and so on?

The same goes for humans: some people know that certain things are just not their cup of tea. Sure, people sometimes have half-baked knowledge, but one can usually tell which things one is good at and which one is not.

  • fudged71 4 days ago

    It's a chicken-and-egg situation: you don't know a model's capabilities until it is trained, and once you change the training to account for what you've learned, the capabilities change again.

  • regularfry 4 days ago

    Apart from anything else, there will be a lot of text about the nature of LLMs and their inherent limitations in its training set. It might only be necessary to make salient the fact that it is one in order to produce the required effect.

whimsicalism 4 days ago

You're wrong, and you're stating things confidently without the evidence to back them up.

Alignment is a tough problem, and aligning long reasoning sequences to correct answers is also a tough problem. Collecting high-quality CoT from experts is yet another. They started this project in October; it's more than plausible it could take this long.

Meganet 4 days ago

You actually don't know that.

An LLM has ingested a huge amount of data. It can create character profiles, audiences, personas, etc.

Why wouldn't it potentially have learned to 'understand' what 'being aware of your limitations' means?

Right now, this 'change of reasoning' feels to me a little like querying the existing meta space through the reasoning process to adjust the weights. Basically priming the model.

I would also not just call it a 'trick'. It may look simple or weird, but I do believe this is part of research into AI thinking processes.

It's a good question, though: what did they train? A new architecture? More parameters? Is the training a mix of experiments they did? Some auto-optimization mechanism?

  • Hugsun 4 days ago

    It might understand the concept of having limitations, but AFAIK it can't reliably recognize when it does or doesn't know something, or when it has run into a limitation.

    • Meganet 4 days ago

      It's the same as with humans, that's right. It doesn't do logical reasoning, but even the best humans stop at some level.

      But if you have read all of humanity's knowledge, where does your reasoning start? Probably at a very high level.

      If you look at how human brains work, we conduct experiments, right? As software developers, we write tests. ChatGPT can already run Python code, and it can write unit tests.

      We do not use proofs when we develop; an AI actually could. But in the end it's more a question of who does it better, faster, and cheaper, eh?
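
      To make that concrete, the kind of check I mean is as simple as this; the function is just a stand-in for model-generated code, and the tests are the 'experiment':

          import unittest

          def median(xs):
              """Stand-in for model-generated code."""
              s = sorted(xs)
              n = len(s)
              mid = n // 2
              return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

          class TestMedian(unittest.TestCase):
              def test_odd_length(self):
                  self.assertEqual(median([3, 1, 2]), 2)

              def test_even_length(self):
                  self.assertEqual(median([4, 1, 3, 2]), 2.5)

          if __name__ == "__main__":
              unittest.main()

      No proofs involved, just the same run-and-check loop we use ourselves.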

      • Hugsun 6 hours ago

        There is an important difference between humans and LLMs in this context.

        Humans do in most cases have some knowledge about why they know the things they know. They can recall the topics they learned at school, and can deduce that they probably heard a given story from a friend who likes to discuss similar topics, etc.

        LLMs have no access to the information they were trained on. They may know that everything they know was learned during training, but they have no way of determining what they did and didn't learn about.

    • stevenhuang 4 days ago

      If you think about it, those criticisms extend to human thinking too. We aren't infallible in all situations either.

      It's only when we can interact with the environment to test our hypotheses that we refine what we know and update our priors appropriately.

      If we let LLMs do that as well, by allowing them to run code, interact with documentation and the internet, and double-check things they're not sure of, it's not out of the question that they will eventually be able to understand their limitations more reliably.
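
      A sketch of the loop I have in mind, with the model call stubbed out (ask_model is hypothetical, not any particular API): generate code, run it, feed the error back, try again.

          import subprocess
          import sys

          def ask_model(prompt):
              # Hypothetical stand-in for a real LLM call.
              return 'print("hello from the model")'

          def solve_with_feedback(task, max_rounds=3):
              prompt = task
              for _ in range(max_rounds):
                  code = ask_model(prompt)
                  result = subprocess.run([sys.executable, "-c", code],
                                          capture_output=True, text=True,
                                          timeout=10)
                  if result.returncode == 0:
                      return code, result.stdout
                  # Show the model its own error so it can revise the answer.
                  prompt = (f"{task}\n\nYour previous attempt failed with:\n"
                            f"{result.stderr}\nPlease fix it.")
              return code, result.stderr

          code, output = solve_with_feedback("Print a greeting.")
          print(output)

      Double-checking against documentation or the web is the same pattern, just with a search call instead of an interpreter.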

      • Hugsun 5 hours ago

        As they are currently constructed, I would say that it is out of the question.

        Humans usually know (at least roughly) the source of anything they know, as there will be a memory or a known event associated with that knowledge.

        LLMs have no analogous way to determine the source of their knowledge. They might know that all of it comes from their training, but they have no way of knowing what was included in the training data and what wasn't.

        This could maybe be achieved with fancier RAG systems, or with online training abilities. I think an essential piece is the ability to know the source of a piece of information. When LLMs can reliably do that, and apply it, they'll be much more useful. Hopefully somebody can achieve this.
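
        Roughly what I mean by that: retrieval that carries source metadata along with the text, so whatever the model asserts can be traced back to where it came from. A toy keyword-match version, just to show the shape (the documents and source names are made up):

            # Toy retrieval with source attribution: every snippet shown to the
            # model carries a record of where it came from.
            DOCS = [
                {"source": "python-docs/statistics",
                 "text": "statistics.median returns the median of numeric data."},
                {"source": "team-wiki/style-guide",
                 "text": "All public functions must have type hints."},
            ]

            def retrieve(query, docs=DOCS):
                words = set(query.lower().split())
                return [d for d in docs if words & set(d["text"].lower().split())]

            def build_prompt(question):
                context = "\n".join(f"[{d['source']}] {d['text']}"
                                    for d in retrieve(question))
                return ("Answer using only the sources below and cite them. "
                        "If they don't cover the question, say you don't know.\n"
                        f"{context}\n\nQ: {question}")

            print(build_prompt("What does statistics.median return?"))

        It's crude, but the point is that "where do I know this from?" becomes an answerable question, which is exactly what the base model can't do today.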