Project Details

Declarative Performance for Datalog

Subject Area Software Engineering and Programming Languages
Data Management, Data-Intensive Systems, Computer Science Methods in Business Informatics
Term since 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 508316729
 
The query language Datalog has been gaining increasing attention as a powerful and flexible tool in various domains, such as program analysis and distributed computing. Because Datalog is a declarative language, a user only has to specify what a computation produces, not how the computation is supposed to produce that result. Finding an efficient execution plan is left to the Datalog engine. Unfortunately, state-of-the-art engines perform little to no automatic optimization while generating an execution plan. At best, they rely on manually annotated hints, which require the user to have a deep understanding of the processing pipeline. Such expert knowledge is not only unattainable for most users but also breaks the declarative nature of Datalog. In practice, this lack of automatic query optimization causes severe performance problems.
In this project, we thus aim to integrate true declarative performance into Datalog engines. Without any manual user intervention, the Datalog engine should be able to translate the declarative program into a highly optimized execution plan. To approach declarative performance, we propose to combine (1) analytical optimizations and (2) empirical optimizations of Datalog programs. On the analytical side, we will develop whole-program data-flow analyses, which allow us to perform non-local optimizations, such as constraint elimination or subquery specialization. On the empirical side, we will develop strategies to profile and to physically (re-)optimize the execution plan against representative input data in a feedback loop. This will allow the engine to incrementally identify the appropriate operators, physical layouts, and data structures. Further, we will tackle the highly performance-critical join order problem and identify materialization strategies by combining information from both the analytical and empirical approaches.
In the course of this project, we aim to integrate the research results into the widely used state-of-the-art Datalog engine Soufflé to make them directly usable for the community.
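The join order problem mentioned above can be illustrated with a minimal sketch. It is not taken from the project or from Soufflé: the rule q(a, d) :- r(a, b), s(b, c), t(c, d), the relation contents, and the helper names natural_join and evaluate are all assumptions made for illustration. The sketch evaluates the same rule body in two different orders and counts how many intermediate tuples each order materializes.

```python
from collections import defaultdict


def natural_join(left, right):
    """Hash join of two lists of variable-binding dicts on their shared variables."""
    if not left or not right:
        return []
    shared = sorted(set(left[0]) & set(right[0]))
    # Build a hash index on the right-hand relation, keyed by the shared variables.
    index = defaultdict(list)
    for row in right:
        index[tuple(row[v] for v in shared)].append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[v] for v in shared), []):
            joined.append({**row, **match})
    return joined


def evaluate(order, relations):
    """Join the relations left to right in the given order.

    Returns the derived q(a, d) facts and the total number of tuples
    materialized across all join steps (a crude cost measure).
    """
    current = relations[order[0]]
    cost = 0
    for name in order[1:]:
        current = natural_join(current, relations[name])
        cost += len(current)
    return {(row["a"], row["d"]) for row in current}, cost


# Hypothetical input data for q(a, d) :- r(a, b), s(b, c), t(c, d).
relations = {
    "r": [{"a": a, "b": 0} for a in range(100)],  # 100 facts, all with b = 0
    "s": [{"b": 0, "c": c} for c in range(100)],  # 100 facts, all with b = 0
    "t": [{"c": 5, "d": 7}],                      # a single, highly selective fact
}

result_slow, cost_slow = evaluate(["r", "s", "t"], relations)
result_fast, cost_fast = evaluate(["t", "s", "r"], relations)

print("same result:", result_slow == result_fast)
print("tuples materialized, order r, s, t:", cost_slow)
print("tuples materialized, order t, s, r:", cost_fast)
```

Both orders derive the same 100 facts, but starting with the selective relation t avoids materializing the full join of r and s. Which order is cheaper depends on the input data, which is why the sketch treats the order as a parameter rather than fixing it in the rule.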
DFG Programme Research Grants
 
 
