Comment by nharada 8 hours ago

I think the assumption is valid. Most of the reasoning components of next-gen (and some current-gen) robotics systems will use VLMs to some extent. Deciding whether a temporary construction sign is valid seems to fall under this use case.

theamk 4 hours ago

But unless you are using a single, end-to-end model for the entire driving stack, that "proceed" command will never influence the accelerator pedal.

Sure, there will be a VLM for reading the signs, but the worst it would be able to output is something like "there is a 'detour' sign at (123, 456) pointing to road #987", and some other, likely non-LLM, mechanism will ensure that following that road is actually safe.
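
To make that separation concrete, here is a rough Python sketch. Everything in it is hypothetical: `SignObservation`, `is_road_safe`, `plan_route`, and the road sets are illustrative names, not any real driving stack's API. The point is only that the VLM's output is a structured observation, and a separate deterministic check decides whether to act on it.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class SignObservation:
    """Structured output of the (hypothetical) sign-reading VLM."""
    kind: str                  # e.g. "detour"
    position: Tuple[int, int]  # image coordinates of the sign
    target_road: int           # ID of the road the sign points to


def is_road_safe(road_id: int, drivable: Set[int], blocked: Set[int]) -> bool:
    """Non-LLM check: the road must be known as drivable and not blocked."""
    return road_id in drivable and road_id not in blocked


def plan_route(obs: SignObservation, current_road: int,
               drivable: Set[int], blocked: Set[int]) -> int:
    """Return the road to follow; the observation is only a suggestion."""
    if obs.kind == "detour" and is_road_safe(obs.target_road, drivable, blocked):
        return obs.target_road
    # Unknown sign type or unsafe target: ignore the sign and stay on course.
    return current_road
```

With `obs = SignObservation("detour", (123, 456), 987)`, the planner only switches to road 987 if the independent check approves it; a spoofed sign pointing at a blocked or unknown road is simply ignored.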