Comment by puttycat
Seems very incremental and very far from the pompous 'superintelligence' goal.
If you can collapse "retrieve this complex chunk when it is needed" into a single token, what else can you put into a token?
"Send this through the math coprocessor." "Validate against the checklist." "Call out to an agent for X." "Recheck against input stream Y." And so on.
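To make the idea concrete, here's a minimal sketch of what token-triggered dispatch could look like. Everything here is hypothetical: the token names, the handler registry, and the handlers themselves are invented for illustration, not taken from the paper.

```python
# Hypothetical sketch: special "action tokens" emitted by a model get
# routed to external handlers, the same way a retrieval token would
# trigger a lookup. All names here are made up for illustration.

def math_coprocessor(payload):
    # Stand-in for an external math engine: evaluate a simple expression.
    return eval(payload, {"__builtins__": {}}, {})

def retrieve(payload):
    # Stand-in for a retrieval call against some document store.
    return f"<retrieved docs for '{payload}'>"

HANDLERS = {
    "<|math|>": math_coprocessor,
    "<|retrieve|>": retrieve,
}

def dispatch(token, payload):
    """Route an action token to its handler; plain tokens pass through."""
    handler = HANDLERS.get(token)
    return handler(payload) if handler else payload

print(dispatch("<|math|>", "2 + 3"))        # 5
print(dispatch("<|retrieve|>", "topic X"))  # <retrieved docs for 'topic X'>
```

The point is that once "do this external thing" is collapsed into a single token, adding a new capability is just registering another handler.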
Retrieval augmentation is only one of many possible uses for this. If it ends up enabling tighter integration with agents, the whole may well be more than the sum of its parts.
It’s unlikely that the existing LLM architecture will evolve into anything that resembles superintelligence any more than it does already.
Which means that modifications to the architecture, and combining it with other components and approaches, are the likely next step. This paper fits that pattern.