Project Details

Coordination

Subject Area Security and Dependability, Operating-, Communication- and Distributed Systems
Term from 2010 to 2019
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 132320961
 
The overall scientific goal of Stratosphere is to research and prototypically develop a novel, database-inspired information management system for the Cloud, for analyzing, aggregating, and querying very large data collections made up of either textual or (semi-)structured data. To achieve this goal, the proposed research unit comprises world-class experts in the areas of distributed systems, query processing, query optimization, information extraction, information integration and cleansing, security, and privacy. Through the interdisciplinary orientation of many of its members, the research unit has close ties to potential users and can therefore validate the practical relevance of the invented technologies for several application domains, e.g., life sciences, scientific applications, and linked data. The geographical proximity of all research group members greatly facilitates collaboration and enables the establishment of regular joint seminars. Several members will augment the research unit with additional resources by bringing in their own research staff or equipment. In addition, the research unit will be able to rely on a Cloud test-bed provided by TU Berlin and Humboldt-Universität zu Berlin. For large-scale demonstrations, the research unit will also have access to a full Cloud system setup at KIT with the support of Yahoo, HP, and Intel. The research unit will be able to use Hadoop, the open-source implementation of a data processor for a Cloud platform, as well as the open-source implementation of the JAQL query language, as a starting point for its research. Information management infrastructure researchers at IBM, Cloud providers at KIT, and users at the Potsdam Institut für Klimafolgenforschung (PIK) have already expressed interest in collaborating with the research unit, if it is established, in order to assess the potential of Cloud computing for information management.
We envision that the Stratosphere Cloud information management system research will serve as a basis for subsequent research activities in advanced areas of information management, such as scientific computing, web analysis, or next-generation business intelligence. Stratosphere will leverage and extend open-source efforts like Hadoop and JAQL. We aim for lasting impact of our research by donating the code we develop to the open-source community. Our effort will enable other researchers and practitioners to use our implementations and to build on our work. After successful completion of the research unit, the principal investigators plan to continue and extend the research through a collaborative research center (SFB), which could build on our research results to provide near real-time processing of streaming data sets in a parallel environment, in conjunction with considerations of business models, privacy, programming paradigms, and applications. We envision that our research will also influence the curriculum of database and distributed systems classes. Overall, we expect that the Stratosphere research activities will lay the foundation for a Berlin-Brandenburg Information Management Group with long-term research collaboration, aligned curricula, and shared lectures among the participating universities.
DFG Programme Research Units
 
 
