Generating and Answering Ontological Queries over Semi-structured Medical Data
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Final Report Abstract
More and more information on individuals (e.g., persons, events, biological objects) is available electronically in a structured or semi-structured form. However, manually selecting individuals that satisfy certain constraints based on such data is a complex, error-prone, and time- and personnel-intensive task. For this reason, tools that can automatically or semi-automatically answer questions based on the available data need to be developed. While simple questions can be directly expressed and answered using keywords in natural language, complex questions that can refer to type and relational information increase the precision of the retrieved results, and thus reduce the effort for subsequent manual verification of the results. One example of this situation is the setting where electronic patient records are used to find patients satisfying non-trivial combinations of certain properties, such as eligibility criteria for clinical trials. In the GoAsq project, we have addressed this problem by translating the natural language questions into formal, database-like queries and then answering these formal queries w.r.t. a domain-dependent ontology using database techniques. The automatic translation is required since it would be quite hard for the people asking the questions (e.g., medical doctors) to formulate them as formal queries. The ontology makes it possible to overcome the possible semantic mismatch between the person producing the source data (e.g., the GPs writing the clinical notes) and the person formulating the question (e.g., the researcher formulating the trial criteria). To realize this approach and apply it to the use case of finding patients satisfying eligibility criteria for clinical trials, the existing approaches developed in the ontology community for accessing data through ontologies, called ontology-based query answering (OBQA), had to be extended in several directions.
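The role of the ontology in bridging the vocabulary of the data and the vocabulary of the question can be illustrated with a minimal sketch. It is not the GoAsq implementation: the concept names, the patient identifiers, and the helper functions `superconcepts`, `entailed_concepts`, and `answer` are all invented for illustration, and the "ontology" is reduced to a plain subclass hierarchy.

```python
# Toy ontology: each concept maps to its direct, more general concepts.
# All concept names are hypothetical and chosen only for illustration.
SUBCLASS_OF = {
    "Type2Diabetes": {"Diabetes"},
    "Diabetes": {"MetabolicDisease"},
    "Hypertension": {"CardiovascularDisease"},
}

# Toy patient data: the concepts explicitly recorded for each patient.
RECORDS = {
    "p1": {"Type2Diabetes"},
    "p2": {"Hypertension"},
    "p3": {"Diabetes", "Hypertension"},
}


def superconcepts(concept):
    """Return the concept together with every concept that subsumes it."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def entailed_concepts(patient):
    """Concepts that hold for a patient once the hierarchy is applied."""
    result = set()
    for concept in RECORDS[patient]:
        result |= superconcepts(concept)
    return result


def answer(required):
    """Patients for whom every required concept is entailed."""
    wanted = set(required)
    return sorted(p for p in RECORDS if wanted <= entailed_concepts(p))


if __name__ == "__main__":
    print(answer(["Diabetes"]))
    print(answer(["MetabolicDisease", "CardiovascularDisease"]))
```

In this toy setting, a literal match on the recorded data would miss patient `p1` for the question "Diabetes", because the record only says `Type2Diabetes`; taking the hierarchy into account recovers that answer.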
The goal of these extensions was to develop ontology and query languages that are expressive enough to express eligibility criteria for clinical trials in a semantically adequate way. On the theoretical side, we investigated extensions by fuzzy logic, probabilistic logic and databases, concrete (e.g., numerical) domains to express measurements, metric temporal logics that can express time spans and declare symbols to have a fixed interpretation during a certain time interval, and a novel non-classical negation operator that can deal with the fact that patient data usually do not contain negative information. On the practical side, we have implemented an approach for automatically translating eligibility criteria for clinical trials into a query language that uses metric temporal logic and the developed non-classical negation. In addition, we implemented a query answering system for such queries. The implementations were evaluated on real-world clinical studies collected from clinicaltrials.gov and anonymized patient data from the MIMIC-III patient database and from the 2018 N2C2 cohort selection challenge.
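To give a feeling for why time spans and negation matter for eligibility criteria, the following sketch evaluates a criterion of the form "diagnosed with diabetes within the last 365 days and no history of cancer" over timestamped records. It is a deliberately simplified assumption, not the semantics developed in the project: negation is read as plain absence of a record (a naive closed-world reading), and the facts, dates, and function names (`holds_within`, `never_recorded`, `eligible`) are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical timestamped observations: (patient, concept, date).
FACTS = [
    ("p1", "Diabetes", date(2018, 3, 1)),
    ("p2", "Diabetes", date(2015, 6, 10)),
    ("p3", "Diabetes", date(2018, 1, 20)),
    ("p3", "Cancer", date(2017, 11, 5)),
]


def holds_within(patient, concept, end, days):
    """True if the concept was recorded in the window [end - days, end]."""
    start = end - timedelta(days=days)
    return any(
        p == patient and c == concept and start <= d <= end
        for p, c, d in FACTS
    )


def never_recorded(patient, concept, end):
    """Naive closed-world reading: no record up to `end` counts as 'absent'."""
    return not any(
        p == patient and c == concept and d <= end for p, c, d in FACTS
    )


def eligible(reference):
    """Patients with recent Diabetes and no recorded Cancer."""
    patients = sorted({p for p, _, _ in FACTS})
    return [
        p
        for p in patients
        if holds_within(p, "Diabetes", reference, 365)
        and never_recorded(p, "Cancer", reference)
    ]


if __name__ == "__main__":
    print(eligible(date(2018, 6, 1)))
```

Under a purely open-world reading, no patient could be shown to satisfy the "no history of cancer" part, since the records never state the absence of a disease; this is the gap that a dedicated negation operator is meant to close.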
Publications
- “Finding New Diamonds: Temporal Minimal-World Query Answering over Sparse ABoxes”. In: Proc. of the 3rd International Joint Conference on Rules and Reasoning (RuleML+RR’19). Edited by Paul Fodor, Marco Montali, Diego Calvanese, and Dumitru Roman. Volume 11784. Lecture Notes in Computer Science. Springer, 2019, pages 3–18
Stefan Borgwardt, Walter Forkel, and Alisa Kovtunova
(See online at https://doi.org/10.1007/978-3-030-31095-0_1)
- “Query Rewriting for DL-Lite with n-ary Concrete Domains”. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI’17). Edited by Carles Sierra. 2017, pages 786–792
Franz Baader, Stefan Borgwardt, and Marcel Lippmann
(See online at https://doi.org/10.24963/ijcai.2017/109)
- “Patient Selection for Clinical Trials Using Temporalized Ontology-Mediated Query Answering”. In: Proc. of the 1st Int. Workshop on Hybrid Question Answering with Structured and Unstructured Knowledge (HQA’18), Companion of the The Web Conference 2018. ACM, 2018, pages 1069–1074
Franz Baader, Stefan Borgwardt, and Walter Forkel
(See online at https://doi.org/10.1145/3184558.3191538)
- “Automatic Translation of Clinical Trial Eligibility Criteria into Formal Queries”. In: Proc. of the 9th Workshop on Ontologies and Data in Life Sciences (ODLS’19), part of The Joint Ontology Workshops (JOWO’19). Edited by Martin Boeker, Ludger Jansen, Frank Loebe, and Stefan Schulz. Volume 2518. CEUR Workshop Proceedings. 2019
Chao Xu, Walter Forkel, Stefan Borgwardt, Franz Baader, and Beihai Zhou
- “Closed-World Semantics for Conjunctive Queries with Negation over ELH⊥ Ontologies”. In: Proc. of the 16th European Conf. on Logics in Artificial Intelligence (JELIA’19). Edited by Francesco Calimeri, Nicola Leone, and Marco Manna. Volume 11468. Lecture Notes in Artificial Intelligence. Springer, 2019, pages 371–386
Stefan Borgwardt and Walter Forkel
(See online at https://doi.org/10.1007/978-3-030-19570-0_24)
- “Metric Temporal Description Logics with Interval-Rigid Names”. In: ACM Transactions on Computational Logic 21.4 (2020), 30:1–30:46
Franz Baader, Stefan Borgwardt, Patrick Koopmann, Ana Ozaki, and Veronika Thost
(See online at https://doi.org/10.1145/3399443)