Compiling and Optimizing Iterative Data Analysis Programs with Shared State on Evolving Datasets
Applicant
Professor Dr. Volker Markl
Subject Area
Security and Dependability, Operating-, Communication- and Distributed Systems
Funding
Funded from 2013 to 2017
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 132320961
The goal of Project A within the Stratosphere II research unit is to research, design, and develop a data programming language, an associated optimizing compiler, an intermediate data- and control-flow representation, and an optimizer that determines efficient execution strategies for workloads of data analysis programs (DAPs) with iterations and stateful operators, both over static data and over infinite, evolving datasets. The project will research language abstractions for specifying iterations and state, novel optimizations for iterative and stateful programs that target performance, novel fault tolerance schemes for massively parallel iterative algorithms, and the optimization of workloads of programs, including work sharing between programs. The project will also demonstrate the overall effectiveness of Stratosphere II by integrating the results of all projects into a coherent system, identifying a relevant use-case workload, and evaluating and benchmarking the system's performance. In particular, this project aims to answer the following questions:
1. What language and system primitives are necessary to abstract parallelization and state, and to expose mutable state management to the programmer of DAPs, without compromising scalability, performance, and fault tolerance?
2. Which optimizing program transformations apply to a DAP with state and iterations and yield a more efficient program?
3. To what extent can an optimizing language compiler optimize advanced data analytics applications with state and iterations?
4. To what extent can we support the fault-tolerant and efficient execution of DAPs with state via language-level features that expose algorithmic aspects of programs?
5. How can we build an optimizer for workloads of DAPs that optimizes state management across DAPs?
DFG Programme
Research Units
Subproject of
FOR 1306:
Stratosphere - Information Management on the Cloud