Comment by sanderjd 2 days ago

On the tool-invocation point: something that seems true to me is that LLMs are actually too smart to be good tool-invokers. It may be possible to convince them to invoke a purpose-specific tool rather than trying to do the work themselves, but it feels harder than it should be, and it seems weird to be deliberately limiting their capability.

My thought is: Could the tool-routing layer be a much simpler "old school" NLP model? Then it would never try to do math and end up doing it poorly, because it just doesn't know how to do that. But you could give it a calculator tool and teach it how to pass queries along to that tool. And you could also give it a "send this to a people LLM tool" for anything that doesn't have another more targeted tool registered.

Is anyone doing it this way?
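
For concreteness, here is a minimal sketch of what that routing layer could look like, assuming a tiny scikit-learn text classifier; the tools, training examples, and stub implementations below are all hypothetical:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Stub tools; real ones would do real work. The router itself can't
    # do math or chat -- it only learns which tool a query belongs to.
    def calculator_tool(q): return f"(calculator handles: {q})"
    def llm_tool(q):        return f"(people LLM handles: {q})"

    TRAINING = [
        ("what is 17 * 24", "calculator"),
        ("add 12 and 99", "calculator"),
        ("compute 3.5 percent of 2000", "calculator"),
        ("summarize this article", "llm"),
        ("write a haiku about routers", "llm"),
        ("explain why the sky is blue", "llm"),
    ]
    texts, labels = zip(*TRAINING)

    # "Old school" NLP: bag-of-ngrams features plus a linear classifier.
    router = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                           LogisticRegression())
    router.fit(texts, labels)

    def dispatch(query: str) -> str:
        tool = router.predict([query])[0]
        return calculator_tool(query) if tool == "calculator" else llm_tool(query)

    print(dispatch("what is 41 plus 7"))           # routed to the calculator
    print(dispatch("summarize this paper for me"))  # routed to the people LLM

The point being: this layer literally cannot try to do the math itself. The worst it can do is misroute, which is a much easier failure mode to measure and fix.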

dvt 2 days ago

> Is anyone doing it this way?

I'm working on a way of invoking tools mid-tokenizer-stream, which is kind of cool. For example (simplified), the LLM emits "(lots of thinking)... 1+2=" and a parser (maybe regex, maybe LR, maybe LL(1), etc.) recognizes this as a "math-y thing" and automagically routes it to the CALC tool, which computes "3" and sticks it in the stream. The current head is now "(lots of thinking)... 1+2=3 " and the LLM can continue with its thought process.
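
A minimal sketch of that loop, assuming a regex trigger and a toy calculator tool (the names MATH_TAIL, calc_tool, and stream_with_tools are illustrative, not from an actual implementation):

    import re

    # Watch the decoded text as tokens arrive; when the tail looks like an
    # arithmetic expression ending in "=", call a calculator tool and
    # splice its answer back into the stream so decoding can continue.
    MATH_TAIL = re.compile(r"(\d+(?:\s*[+\-*/]\s*\d+)+)\s*=$")

    def calc_tool(expr: str) -> str:
        # Toy calculator; input is pre-filtered to digits and operators,
        # so eval() here is restricted to plain arithmetic.
        if not re.fullmatch(r"[\d+\-*/\s.]+", expr):
            raise ValueError(f"not arithmetic: {expr!r}")
        return str(eval(expr))

    def stream_with_tools(token_stream):
        # Yield tokens as-is, injecting a tool result whenever the
        # trigger pattern appears at the head of the stream.
        text = ""
        for tok in token_stream:
            text += tok
            yield tok
            m = MATH_TAIL.search(text)
            if m:
                result = calc_tool(m.group(1)) + " "
                text += result
                yield result  # spliced in; the LLM continues from here

    # Stand-in for a real decoder loop:
    tokens = ["(lots of thinking)... ", "1", "+", "2", "="]
    print("".join(stream_with_tools(tokens)))
    # -> (lots of thinking)... 1+2=3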

  • namaria 2 days ago

    Cold winds are blowing when people look at LLMs and think "maybe an expert system on top of that?".

    • sanderjd 2 days ago

      I don't think it's "on top"? I think it's an expert system where (at least) one of the experts is an LLM, but it doesn't have to be LLMs from bottom to top.

      • namaria 2 days ago

        On the side, under, wherever. The point is, this is just re-inventing past failed attempts at AI.

  • sanderjd 2 days ago

    Definitely an interesting thought to do this at the tokenizer level!