Project Details
Scaling Beyond DRAM with PMem without Compromising Performance
Applicant
Professor Alfons Kemper, Ph.D.
Subject Area
Security and Dependability, Operating-, Communication- and Distributed Systems
Term
from 2017 to 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 361477420
Over the last decade, our group has focused on developing in-memory database technology, most notably the HyPer database. The growth in main memory capacities over the last two decades has made in-memory data processing feasible while offering unprecedented performance. Even though most transactional data fits into the largest scale-up servers, which offer several TB of DRAM capacity, the new opportunities of persistent memory (PMem) deserve consideration for two reasons: (1) its cost/performance ratio appears beneficial in comparison to pure DRAM systems and (2) new applications in the Big Data era require even higher data volumes. Other developments try to tackle the latter with scale-out solutions. However, these incur a much higher communication overhead, even when using new and costly communication technologies such as InfiniBand and RDMA. As an alternative, we want to scale beyond DRAM capacity by exploiting emerging PMem capabilities in scale-up servers without compromising pure in-memory processing performance (for those working sets that fit into the combined capacity of DRAM and PMem). Working sets beyond DRAM plus PMem capacity should induce only a graceful degradation by relying on fast SSD storage. The research work proposed here shall be integrated into the Umbra database system. Umbra is the "spiritual" successor and evolution of our successful pure in-memory system HyPer. It completely removes HyPer's restriction that all data must fit into main memory. Umbra is a long-term research project with many Big Data applications. In this three-year project, we concentrate on how modern storage technologies such as persistent memory can be integrated as first-class citizens of Umbra. This requires redesigning the storage and index structures for data as well as key functional components such as buffering, logging, and recovery.
Our overall goal is to scale beyond costly DRAM capacity without slowing down as long as the working set fits into DRAM plus PMem, and to degrade gracefully once it grows beyond that. The full-fledged database system prototype Umbra serves as an integration platform for researching the spectrum of possibilities for incorporating persistent memory into a database system. Umbra thereby constitutes a realistic (and invaluable) test bed for an end-to-end investigation that establishes and demonstrates the practical relevance of the foundational research. The (almost unique) broad expertise of the PIs' groups, covering database as well as OS experience (after Prof. Jana Giceva joined the TUM group), will help exploit modern storage devices most effectively, either within the DBMS or "deeper" and more generically at the OS level.
DFG Programme
Priority Programmes
Subproject of
SPP 2037: Scalable Data Management on Future Hardware
Co-Investigators
Professor Dr. Jana Giceva; Professor Dr. Thomas Neumann