Project Details
Emergence of Abstract Representations in Contextualized Multimodal Models
Applicant
Professor Dr. Gemma Roig
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Human Cognitive and Systems Neuroscience
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 459426179
Abstract representations, in AI models and in the brain, can be defined at different levels. At a low level, representations abstract away from some perceptual variable (such as the viewpoint of a visual object). At higher levels, representations characterize meaning independently of input modality (e.g., seeing or hearing). Such representations may also be shaped by contextual information perceived alongside the isolated concept: a sound and a visual object can refer to the same concept, and the context in which it is perceived can guide its abstract representation. Hearing the sound of a barking dog or seeing the actual dog are both connected to the “dog” concept; similarly, hearing someone talk about a dog or seeing a dog house can evoke the idea of a dog. In this project, we will focus on this higher level of abstraction, which is independent of input modality and draws on contextual information. We will build on deep neural networks (DNNs), hierarchical models originally inspired by the visual cortex that are state of the art for several AI applications, such as object classification in images and natural language processing (NLP). We will develop new multimodal DNNs that leverage co-occurring input modalities, first tackling the computational questions: How do AI models, particularly DNN-based models, learn abstract semantic concepts independently of input modality? What is the role of contextual information? What are the computational advantages (data efficiency during learning, robustness to input changes and noise) of multimodal models that learn abstract representations, compared to unimodal models? We will then adapt and employ the newly developed models to explain the human data collected in the other projects of the ARENA research unit, relating the abstract representations in the models to the human data and thus characterizing brain representations through the models. Overall, the ARENA research unit will enable the study of abstract representations in AI models and in the brain through tight, synergistic collaboration among all projects, bridging knowledge in AI, neuroscience and cognitive neuroscience.
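As an illustration of how model representations can be related to human data, the sketch below shows a representational similarity analysis (RSA) in Python, one common approach for such model-brain comparisons. It is a minimal example under assumed array shapes, with random placeholder data and hypothetical variable names; it is not the project's actual analysis pipeline.

```python
# Minimal, hypothetical RSA sketch: compare a DNN layer's representational
# geometry with brain responses to the same stimuli. Shapes and names are
# illustrative assumptions only.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr


def rdm(activations: np.ndarray) -> np.ndarray:
    """Representational dissimilarity matrix (condensed form):
    pairwise correlation distance between stimulus-wise patterns."""
    return pdist(activations, metric="correlation")


def rsa_score(model_acts: np.ndarray, brain_responses: np.ndarray) -> float:
    """Spearman correlation between the model RDM and the brain RDM."""
    rho, _ = spearmanr(rdm(model_acts), rdm(brain_responses))
    return rho


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_stimuli = 50
    # Hypothetical activations: one row per stimulus (e.g., an image or a sound).
    multimodal_acts = rng.normal(size=(n_stimuli, 512))  # multimodal DNN layer features
    unimodal_acts = rng.normal(size=(n_stimuli, 512))    # unimodal baseline features
    brain = rng.normal(size=(n_stimuli, 200))            # e.g., fMRI voxel responses

    print("multimodal vs. brain:", rsa_score(multimodal_acts, brain))
    print("unimodal   vs. brain:", rsa_score(unimodal_acts, brain))
```

With real data, a higher RSA score for the multimodal model than for a unimodal baseline would be one way to quantify how well its abstract representations capture the structure of the measured brain responses.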
DFG Programme
Research Units