Poster in Workshop: How Far Are We From AGI
Towards Human-like Machine Vision: Representing Part-Whole Relationship with hierarchically correlated neuronal activation in neural networks
Hao Zheng · Hui Lin · Rong Zhao · Huanyu Qu
Keywords: [ neuroAI ] [ neural symbolic ] [ attractor dynamics ] [ object-centric representation ] [ active perception ] [ binding problem ] [ neural syntax ] [ Part-whole hierarchy ]
Representing hierarchical structure is a key problem that characterizes the gap between current neural networks and human-like intelligence. While the human brain flexibly extracts part-whole hierarchies from unstructured sensory input, how a neural network with fixed connection weights can flexibly capture such compositional structure is still an open question. Most efforts in the machine learning field focus on slot-based methods to temporally tackle the problem. In this paper, we provide new insights into this challenge without resorting to the “slot” idea. From an interdisciplinary viewpoint that combines neuroscientific hypotheses and machine learning models, we propose the Composer, which dynamically “correlates” its distributed neural activation into an emergent, implicit hierarchical structure to represent the part-whole hierarchy of objects. The observed representation is consistent with the widely discussed “neural syntax” in neuroscience. We therefore hope the Composer sheds light on a new paradigm for developing human-like vision and building compositional structure without “slots”. We also introduce quantitative measures to evaluate parsing quality, which show that the Composer can parse a range of synthetic scenes of different complexities. By incorporating advanced machine learning models such as LLMs or diffusion models into the paradigm, the Composer could potentially be scaled to real-world datasets in the future. Taken together, we believe the Composer can inspire and inform future innovations and developments towards artificial general intelligence (AGI).