Project Details
Generating and Answering Ontological Queries over Semi-structured Medical Data
Applicant
Professor Dr.-Ing. Franz Baader
Subject Area
Theoretical Computer Science
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
from 2015 to 2020
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 284232554
More and more information on individuals (e.g., persons, events, biological objects) is available electronically in structured or semi-structured form. Manually selecting individuals that satisfy certain constraints based on such data is a complex, error-prone effort that consumes considerable time and personnel. For this reason, tools that can answer questions over the available data automatically or semi-automatically need to be developed. While simple questions can be expressed and answered directly using keywords in natural language, complex questions that refer to type and relational information increase the precision of the retrieved results and thus reduce the effort needed to verify the results manually afterwards. One example of this situation is the setting where electronic patient records are used to find patients satisfying non-trivial combinations of certain properties, such as eligibility criteria for clinical trials. Another example, which will also be considered as a use case in this project, is the setting where a student asks the examination office questions about study and examination regulations. In both cases, the original question is formulated in natural language.
In the GoAsq project, we will investigate, compare, and finally combine two different approaches for answering questions formulated in natural language over textual, semi-structured, and structured data. One approach uses the French partners' expertise in text-based question answering to answer natural language questions directly using natural language processing and information retrieval techniques. The other tries to translate the natural language questions into formal, database-like queries and then answer these formal queries w.r.t. a domain-dependent ontology using database techniques. The automatic translation is required since it would be quite hard for the people asking the questions (medical doctors, students) to formulate them as formal queries.
The ontology makes it possible to overcome the potential semantic mismatch between the person producing the source data (e.g., the GPs writing the clinical notes) and the person formulating the question (e.g., the researcher formulating the trial criteria). GoAsq can thus leverage recent advances in the ontology community on accessing data through ontologies, known as ontology-based query answering (OBQA). More precisely, in Task 1 of the project we will investigate the two use cases mentioned above (eligibility criteria; study regulations). In Task 2 we will introduce and analyze the extensions to existing formal query languages that these use cases require. Task 3 will develop techniques for extracting formal queries from textual queries, and Task 4 will evaluate the resulting approach, compare it with approaches for text-based question answering, and develop a hybrid approach that combines the advantages of both.
DFG Programme
Research Grants
International Connection
France
Partner Organisation
Agence Nationale de la Recherche / The French National Research Agency
Cooperation Partners
Professorin Dr. Brigitte Grau; Dr. Yue Ma