A Graph Query Processor for Queries of Class CRPQagg
Final Report Abstract
Large graph-structured data sets are becoming increasingly important in data science. Analyzing social networks, querying the web of data, managing knowledge networks, studying the interaction of proteins, planning transportation grids, and routing network traffic are all problems that work with graph data. To address the challenges to data management and processing that arise from this development, this project’s objective was to design and develop a graph query processor. In particular, we focused on languages that fall into the class of conjunctive regular path query languages with aggregation (CRPQagg ), i.e., languages that support subgraph pattern matching but also, for example, shortest path queries. Our project followed a two-pronged approach. On the one hand, we addressed the problem “top-down” by starting from existing query languages that fall into class CRPQagg . To formally describe these query languages, we defined a graph data model as well as a graph query algebra and algebraic equivalences. Together with a statistical model to estimate the cardinality of graph query results, these contributions form the logical level of a graph query processor. On the other hand, a second line of research was conducted bottom-up by starting from existing graph algorithms in order to build the physical operators of a graph query processor. In the scope of this second line of research, the project contributed to understanding whether and how indexing and processing techniques that have been designed for specific types of graphs can be applied to general graphs. Even though this project made significant steps towards designing and developing a general graph query processor, some challenges remain open and we are currently working to address them. At the same time, new research questions have appeared during the course of the project. In particular, the rise of machine learning also presents opportunities for graph query processing. We are currently working on a new approach to result cardinality estimation for graph queries based on machine learning.
Publications
-
An Algebra and Equivalences to Transform Graph Patterns in Neo4j. In Proc. EDBT/ICDE Workshop on Querying Graph Structured Data (GraphQ), 2016
Jürgen Hölsch and Michael Grossniklaus
-
On the Performance of Analytical and Pattern Matching Graph Queries in Neo4j and a Relational Database. In Proc. EDBT/ICDE Workshop on Querying Graph Structured Data (GraphQ), 2017
Jürgen Hölsch, Tobias Schmidt, and Michael Grossniklaus
-
Experiences with Implementing Landmark Embedding in Neo4j. In Proc. SIGMOD/PODS Workshop on Graph Data Management Experiences & Systems (GRADES), pages 7:1-7:9, 2019
Manuel Hotz, Theodoros Chondrogiannis, Leonard Worteler, and Michael Grossniklaus