Project Details
Automatic Quality Assessment: NLP approach for the semantic mapping of texts in the life science (AQUAS)
Applicant
Professor Dr. Konrad Förstner
Subject Area
Data Management, Data-Intensive Systems, Computer Science Methods in Business Informatics
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 509313233
The growing incidence of deliberately spread misinformation poses a major challenge to our democratic society. Such misinformation is increasingly being spread by political interest groups in order to steer public discourse, and recipients sometimes fail to recognize it as such. Since disinformation can also be found in scientific information, this development also affects scientists. In the medical applications of the life sciences, it can have health-threatening effects.

In the AQUAS project presented here, the first German-language dataset on disinformation in the life sciences will be created. On this basis, modern machine learning (ML) methods will be used to build a model that can classify the semantic proximity of unknown texts to three classes: scientific texts, popular science texts, and disinforming texts. In addition, complementary information on the publications' adherence to good scientific practice will be provided. By enriching and publishing this information (as a basic set and an extended set of features, respectively), AQUAS aims to help readers make an informed assessment of the literature. AQUAS does not aim to give a final reading recommendation or to censor content.

Based on the enrichment method developed, a service will be implemented within AQUAS that can be accessed via an application programming interface (API). As a first central application, we will use this service in the ZB MED discovery system LIVIVO to make the described classification of literature available to ZB MED users. Thus, life scientists, practitioners in the health professions, and students will initially benefit from the improved knowledge infrastructure that AQUAS brings to LIVIVO. The dataset, the model, the training workflow, and the software for operating the service will be made openly available wherever possible, so that they can also be used in other subject areas.
DFG Programme
Research data and software (Scientific Library Services and Information Systems)