Cross-modal representation learning, with applications to search in radiology reports and auto-filling of report templates (B05)

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing; Nuclear Medicine, Radiotherapy, Radiobiology
Term since 2023
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 499552394
 

Project Description

We will examine the working principles behind the pre-training of multi-modal models on image and text data, and test whether these principles can also be studied with much smaller data subsets. In particular, we will investigate whether using ontologies and knowledge graphs to construct a training dataset allows that dataset to be much smaller while generalizing better to unseen data. Moreover, we will probe the limitations of learned image-text models by explicitly testing how they represent grammatical structures, the corresponding visual patterns, and their recombination.
DFG Programme Collaborative Research Centres
Subproject of SFB 1597: Small Data
Applicant Institution Albert-Ludwigs-Universität Freiburg
Project Heads Professor Dr. Hannah Bast; Professor Dr.-Ing. Thomas Brox; Professor Dr. Elmar Kotter