Project Details

Generating and Answering Ontological Queries over Semi-structured Medical Data

Subject Area Theoretical Computer Science
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term from 2015 to 2020
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 284232554
 
Final Report Year 2021

Final Report Abstract

More and more information on individuals (e.g., persons, events, biological objects) is available electronically in structured or semi-structured form. However, manually selecting individuals that satisfy certain constraints based on such data is a complex, error-prone, and time- and personnel-intensive effort. For this reason, tools that can automatically or semi-automatically answer questions based on the available data need to be developed. While simple questions can be expressed and answered directly using natural-language keywords, complex questions that refer to type and relational information increase the precision of the retrieved results and thus reduce the effort needed for subsequent manual verification. One example of this situation is the setting where electronic patient records are used to find patients satisfying non-trivial combinations of properties, such as eligibility criteria for clinical trials. In the GoAsq project, we addressed this problem by translating natural-language questions into formal, database-like queries and then answering these formal queries w.r.t. a domain-dependent ontology using database techniques. The automatic translation is required because it would be quite hard for the people asking the questions (e.g., medical doctors) to formulate them as formal queries. The ontology makes it possible to overcome a possible semantic mismatch between the person producing the source data (e.g., the GPs writing the clinical notes) and the person formulating the question (e.g., the researcher formulating the trial criteria). To realize this approach and apply it to the use case of finding patients who satisfy eligibility criteria for clinical trials, the existing approaches developed in the ontology community for accessing data through ontologies, called ontology-based query answering (OBQA), had to be extended in several directions.
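To make the role of the ontology concrete, the following is a minimal sketch of ontology-based query answering on a toy example. It is not the GoAsq implementation: the class names, the patient records, and the helpers `superclasses` and `answer` are all hypothetical, and the ontology is reduced to a plain subclass hierarchy. It only shows how a query phrased in the researcher's vocabulary ("heart disease") can match data recorded in the clinician's more specific vocabulary ("myocardial infarction").

```python
# Toy illustration only: the class names, patient records, and helper
# functions below are hypothetical and not taken from the GoAsq project.

# Ontology: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "MyocardialInfarction": {"HeartDisease"},
    "HeartFailure": {"HeartDisease"},
    "HeartDisease": {"CardiovascularDisease"},
}

# Data: diagnoses as they might be recorded in patient notes.
DIAGNOSES = {
    "patient_1": {"MyocardialInfarction"},
    "patient_2": {"Asthma"},
    "patient_3": {"HeartFailure"},
}


def superclasses(cls):
    """Return cls together with every class that subsumes it."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def answer(query_class):
    """Return patients with a recorded diagnosis subsumed by query_class."""
    return sorted(
        patient
        for patient, recorded in DIAGNOSES.items()
        if any(query_class in superclasses(d) for d in recorded)
    )


print(answer("HeartDisease"))  # ['patient_1', 'patient_3']
```

A plain keyword lookup for "HeartDisease" would return no patients here, because no record uses that term; the subclass axioms bridge the two vocabularies.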
The goal of these extensions was to develop ontology and query languages that are expressive enough to capture eligibility criteria for clinical trials in a semantically adequate way. On the theoretical side, we investigated extensions by fuzzy logic; by probabilistic logic and databases; by concrete (e.g., numerical) domains to express measurements; by metric temporal logics that can express time spans and declare symbols to have a fixed interpretation during a certain time interval; and by a novel non-classical negation operator that can deal with the fact that patient data usually do not contain negative information. On the practical side, we implemented an approach for automatically translating eligibility criteria for clinical trials into a query language that uses metric temporal logic and the newly developed non-classical negation. In addition, we implemented a query answering system for such queries. The implementations were evaluated on real-world clinical studies collected from clinicaltrials.gov and on anonymized patient data from the MIMIC-III patient database and from the 2018 N2C2 cohort selection challenge.
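As a rough illustration of why time spans and negation matter for eligibility criteria, the sketch below checks a made-up criterion: "diabetes diagnosis within the last five years and no insulin therapy". Everything here is an assumption for illustration: the records, the event names, and the helpers `has_event` and `eligible` are hypothetical. In particular, the sketch simply treats the absence of a record as a negative answer, which is a simplification and not the non-classical negation operator developed in the project; likewise, the date-window check only hints at what a metric temporal query language can express.

```python
from datetime import date, timedelta

# Hypothetical records: (patient, event, day on which it was recorded).
RECORDS = [
    ("patient_1", "DiabetesDiagnosis", date(2017, 3, 1)),
    ("patient_1", "InsulinTherapy", date(2017, 9, 15)),
    ("patient_2", "DiabetesDiagnosis", date(2016, 6, 10)),
    ("patient_3", "DiabetesDiagnosis", date(2009, 1, 20)),
]


def has_event(patient, event, start, end):
    """True if the patient has the event recorded within [start, end]."""
    return any(
        p == patient and e == event and start <= day <= end
        for p, e, day in RECORDS
    )


def eligible(patient, reference_day):
    """Check the made-up criterion relative to reference_day."""
    # Time span: approximate "the last five years" as 5 * 365 days.
    window_start = reference_day - timedelta(days=5 * 365)
    diagnosed = has_event(
        patient, "DiabetesDiagnosis", window_start, reference_day
    )
    # Simplifying assumption: no recorded insulin therapy counts as "none".
    no_insulin = not has_event(
        patient, "InsulinTherapy", date.min, reference_day
    )
    return diagnosed and no_insulin


today = date(2018, 1, 1)
for patient in ("patient_1", "patient_2", "patient_3"):
    print(patient, eligible(patient, today))
```

With the reference day used above, only `patient_2` satisfies the criterion: `patient_1` has a recorded insulin therapy, and the diagnosis of `patient_3` lies outside the five-year window.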
