

Poster in Workshop: ICLR 2025 Workshop on Bidirectional Human-AI Alignment

Vision Language Models Know Law of Conservation without Understanding More-or-Less

Dezhi Luo · Haiyun Lyu · Qingying Gao · Haoran Sun · Yijiang Li · Hokin Deng


Abstract:

Conservation is a critical milestone of cognitive development, considered to be supported by both the understanding of quantitative concepts and the reversibility of operations. To assess whether this critical component of human intelligence has emerged in Vision Language Models, we curated ConserveBench, a battery of 365 cognitive experiments across four dimensions of physical quantities: volume, solid quantity, length, and number. The first two involve transformational tasks, which require an understanding of reversibility. The last two involve non-transformational tasks, which assess the understanding of quantity. Surprisingly, we find that while Vision Language Models are generally good at transformational tasks, they tend to fail at non-transformational tasks. This indicates a dissociation between understanding the reversibility of operations and understanding quantity, both of which are believed to be cornerstones of the understanding of the law of conservation in humans.
