Project Details

Data Integration and Access by Merging Ontologies and Databases (DIAMOND)

Subject Area Theoretical Computer Science
Term from 2013 to 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 239756895
 
Final Report Year 2022

Final Report Abstract

The DFG Emmy Noether Project DIAMOND (Data Integration and Access by Merging Ontologies and Databases) has conducted foundational and applied research in the general areas of symbolic artificial intelligence and data management. The main goal was to advance the principle of ontology-based data access, which allows users to obtain more complete and more adequate answers to questions by taking background knowledge about the meaning of the underlying data into account. This underlying data can be represented as a traditional database or as a knowledge graph, which puts more emphasis on the relationships and connections between elements of the domain. The background knowledge in turn is represented as an ontology: a formal conceptual model that can be evaluated automatically to infer additional knowledge. Applying this idea to real-world tasks leads to many challenges, chiefly covered by two questions:

1. How should background knowledge be modelled (capturing all relevant information)?
2. How can query answers be computed (with the performance needed in real applications)?

Project DIAMOND has addressed these questions through a variety of foundational and engineering-oriented research contributions. On the foundational side, the main topics have been the design of suitable ontology languages, the development of methods for reasoning with them, and the study of the theoretical properties of these languages. Notable outcomes of this work have been rule-based reasoning approaches for description logic ontologies, a novel type of attributed logics that support rich knowledge graphs, and insights into the use of existential rules to simulate set-valued data. Additional theoretical contributions have been made to the theory of sub-regular languages and related automata models.

On the practical side, the project has contributed to the development of the rule engine VLog, which supports reasoning in the rule-based ontology languages investigated in the project. Its specific strength and innovation is the use of memory-efficient column-based data structures, which enable VLog to handle large datasets even on common hardware. The project has made further contributions to the design of algorithms and data structures for large-scale distributed graph databases.

Results of DIAMOND have influenced important applications, most notably Wikidata, the successful sister project of Wikipedia that has become the most important free knowledge graph. DIAMOND’s contributions include the first approach to encoding Wikidata in the graph database format RDF, which has become the basis for the powerful Wikidata Query Service that answers millions of queries each day. In cooperation with the Wikimedia Foundation, DIAMOND researchers have made contributions to the design of this service and extracted a large research data set of over 570 million real-world queries, which can help to understand the practical use of large knowledge bases.

The research output of the project comprises over 70 reviewed publications, many of which have appeared at leading international venues and which have already attracted thousands of citations. The quality and visibility of the research is further recognised by two best paper awards, an honourable mention at the largest AI conference, and three best paper nominations, all at top-ranked international conferences. Principal investigator Krötzsch received the DFG Heinz Maier-Leibnitz-Preis for his contributions.
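
The principle of ontology-based data access described in the abstract can be illustrated with a minimal sketch. It is not taken from the project and does not reflect how VLog works internally; the facts, class names, rules, and the helper functions `materialise` and `answer` are all hypothetical and chosen only for illustration. The sketch applies simple rules to a small set of facts until nothing new can be derived, and then shows that a query returns more complete answers once the background knowledge is taken into account.

```python
# Toy data: facts are (predicate, subject) or (predicate, subject, object) tuples.
# All names are hypothetical and serve only as an illustration.
facts = {
    ("Professor", "alice"),
    ("Student", "bob"),
    ("supervises", "alice", "bob"),
}

# Toy ontology, part 1: "every member of the first class is a member of the second".
subclass_rules = [
    ("Professor", "Researcher"),
    ("Researcher", "Person"),
    ("Student", "Person"),
]

# Toy ontology, part 2: "whoever stands in this relation belongs to this class".
domain_rules = [
    ("supervises", "Researcher"),
]


def materialise(facts, subclass_rules, domain_rules):
    """Apply all rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for fact in list(derived):
            new = set()
            if len(fact) == 2:
                # Unary fact: check the subclass rules.
                for sub, sup in subclass_rules:
                    if fact[0] == sub:
                        new.add((sup, fact[1]))
            else:
                # Binary fact: check the domain rules.
                for prop, cls in domain_rules:
                    if fact[0] == prop:
                        new.add((cls, fact[1]))
            if not new <= derived:
                derived |= new
                changed = True
    return derived


def answer(facts, cls):
    """Return all individuals known to belong to the class `cls`."""
    return sorted(f[1] for f in facts if len(f) == 2 and f[0] == cls)


# Without the ontology, the data contains no explicit "Person" facts.
plain_answers = answer(facts, "Person")

# With the ontology, both individuals are inferred to be persons.
enriched_answers = answer(materialise(facts, subclass_rules, domain_rules), "Person")

print(plain_answers)
print(enriched_answers)
```

Computing all consequences up front, as done here, is only one possible design; it trades memory for simple query evaluation, which is one reason why compact data structures matter for rule engines that work on large datasets.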
