Project Details
Graph Partitioning for Graph Neural Networks
Applicant
Professor Dr. Ruben Mayer
Subject Area
Security and Dependability, Operating-, Communication- and Distributed Systems
Theoretical Computer Science
Term
since 2020
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 438107855
Graph partitioning is an important preprocessing step for distributed graph processing. The input graph is cut into a set of equally-sized components while the cut size is minimized. This problem has been investigated in great depth in recent years and is now well understood with respect to classical graph processing algorithms such as PageRank, Label Propagation, Shortest Paths, and others. More recently, however, distributed graph processing has been revisited and extended to train deep neural networks, which is termed Graph Neural Network (GNN) processing. While GNN processing has many similarities to traditional graph processing, such as asynchronous message passing and synchronous aggregation steps, there are also significant differences: first and foremost, larger vertex states, deep neural network operations, and mini-batch training in combination with neighbor sampling. In light of these differences, in this project, we will review graph partitioning algorithms and adapt or extend them to be better suited for GNN processing. Current GNN systems do not fully exploit the opportunities of graph partitioning and merely perform slight adaptations of existing graph partitioning algorithms to the GNN processing scenario. To this end, we tackle the following research challenges: (I) Systematically study the effect of graph partitioning on the performance of GNN processing under various mini-batch sampling strategies. (II) Develop new graph partitioning strategies tailored to the needs of GNN processing. (III) Implement and integrate the new graph partitioning strategies into existing GNN processing frameworks to optimize the end-to-end pipeline. (IV) Extend the concepts by taking into account heterogeneous and dynamic graphs. Our project will contribute to the emerging field of GNN systems.
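The two objectives mentioned above (equally-sized components, minimal cut size) can be made concrete with a small illustrative sketch. The following code is not part of the project or of any named framework; it merely computes, under the usual edge-cut formulation, the cut size and the balance of a given vertex-to-partition assignment:

```python
# Illustrative sketch only (not the project's implementation): quality metrics
# for an edge-cut partitioning of an undirected graph.
from collections import Counter

def partition_quality(edges, assignment, k):
    """edges: list of (u, v) pairs; assignment: dict vertex -> partition id in 0..k-1.

    Returns (cut, balance): cut is the number of edges whose endpoints lie in
    different partitions; balance is the largest partition size divided by the
    ideal equal size (1.0 means perfectly balanced).
    """
    cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
    sizes = Counter(assignment.values())
    ideal = len(assignment) / k
    balance = max(sizes.values()) / ideal
    return cut, balance

# Toy example: a 4-vertex path graph split into two equal halves.
edges = [(0, 1), (1, 2), (2, 3)]
assignment = {0: 0, 1: 0, 2: 1, 3: 1}
print(partition_quality(edges, assignment, 2))  # (1, 1.0)
```

A partitioner searches over such assignments, trading off a smaller cut against a balance constraint; the project investigates how this trade-off changes once GNN-specific costs such as neighbor sampling and large vertex feature states enter the picture.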
The methods and techniques developed in this project are likely to find wide adoption in both academia and industry, as well as in open-source graph learning systems such as PyTorch Geometric and Deep Graph Library (DGL). In particular, since GNN training is often performed on specialized hardware such as GPUs, which are expensive to purchase or to lease in the cloud, optimizing the end-to-end runtime of graph partitioning and GNN training has the potential to provide substantial cost savings.
DFG Programme
Research Grants