Comment by alyxya 19 hours ago

Personally, I think the direction AI will go is toward an AI brain with something like an LLM at its core, augmented with various abilities like spatial intelligence, rather than models designed with spatial reasoning at their core. Human language and reasoning seem flexible enough to form some kind of spatial understanding, but I'm not so sure about the converse of having spatial intelligence derive human reasoning. Similar to how image generation models have struggled to generate the right number of fingers on hands, I would expect a world model designed to model physical space not to generalize to an understanding of simple human ideas.

gf000 18 hours ago

> Human language and reasoning seem flexible enough to form some kind of spatial understanding, but I'm not so sure about the converse of having spatial intelligence derive human reasoning

I believe the null hypothesis would be that a model natively understanding both would work best/come closest to human intelligence (and possibly other modalities are needed as well).

Also, as a complete layman, the fact that our language has several interconnections with spatial concepts would also point toward a multi-modal intelligence (topic: place; subject: lying under or near; respect/prospect: looking back/ahead; etc.). In my understanding these connections only secondarily make their way into an LLM's representations.

  • alyxya 18 hours ago

    There's a difference between what a model is trained on and the inductive biases it uses to generalize. It isn't as simple as just training natively on everything. All existing models generalize better on some things than others because of their architecture, and the architectures of the world models I've seen don't seem as capable of generalizing universally as LLMs.