Invited talk in Workshop: Representational Alignment
Mon, Apr 27, 2026 • 6:30 AM – 7:00 AM PDT

Robusto-2: Benchmarking Humans and VLMs for Autonomous Driving in Lima & NYC

Arturo Deza

Abstract

As self-driving cars continue to be deployed in cities around the world, how well will these systems generalize when exposed to new geographies? Moreover, how well will current multimodal VLMs (Vision Language Models) be able to cognitively understand and act when faced with bizarre edge-case scenarios? In this talk I will aim to answer these questions through a Visual Question Answering (VQA) framework, in which we show humans and VLMs a series of our own recorded dashcam footage from Lima and New York City and test for divergence and convergence between systems. We further test for these similarities and divergences in a factorial analysis with three groups (humans from NYC, humans from Lima, and VLMs) and two first-person dashcam datasets, one recorded in Lima and one in New York City.

Speaker

Arturo Deza

CEO of Artificio
