Project Details

Flashy-DB: Impact of Flash Solid State Disks on Performance and Architecture of Data-Intensive Software Systems

Subject Area: Software Engineering and Programming Languages
Term: 2010 to 2015
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 165012865
 
Final Report Year: 2017

Final Report Abstract

SSDs have become a viable alternative to magnetic disks. However, the intrinsic properties of Flash memory require rethinking the central components of a DBMS. These properties include asymmetric read/write behavior, fast random access, erase-before-write (which results in out-of-place updates), and limited erase/write cycles, which affect the lifetime of devices. Storage structures, buffer management, concurrency control, indexing, and query optimization must be systematically re-evaluated and re-designed.

In the first phase we used an industrial-strength SSD RAID system with a block interface that masks the properties of Flash memory and lets the SSD appear like a conventional magnetic disk. This facilitates substitution but incurs a high overhead. The block interface is realized through the Flash Translation Layer (FTL) running on the controllers. First evaluations showed that the hardware controllers rapidly become the bottleneck and preclude parallelism. Therefore, we used software controllers running on the CPU. In the second phase we avoided the FTL altogether and concentrated on developing strategies for a coherent integration of Flash management in the DBMS (NoFTL approach).

For write-intensive OLTP workloads we developed a new multi-version concurrency control scheme that takes into consideration the influence of the DBMS storage layer, the version organization, and the impact on buffer management. We developed new version-aware indices and investigated the impact of page structures and storage optimizations. We introduced a novel version organization and invalidation model that avoids short writes during timestamp invalidation. Results are bundled in a system prototype called SIAS (Snapshot Isolation with Append Storage), built on top of the PostgreSQL Open Source DBMS. Compared to PostgreSQL using SSDs, SIAS achieved 30% higher transactional throughput on the TPC-C benchmark, 13x lower transaction response time at the TPC-C saturation point, and a remarkable reduction of write amplification (3x under eager settings of the PostgreSQL background writer process and 33x on checkpoint intervals).

The NoFTL work of the second phase builds on the intuition that DBMSs have traditionally assumed direct control over the hardware and the I/O stack to increase performance. NoFTL is a novel approach that exploits native Flash storage and access, and investigates the coherent integration of Flash management in the DBMS. An integration of Flash management in Shore-MT demonstrated the viability of the native Flash interface and led to a 2.2x performance improvement compared to Shore-MT on raw block-device Flash storage under various TPC workloads.

The use of native Flash brings performance but is impractical for the DB administrator. We introduced NoFTL Regions, a physical storage structure that abstracts from low-level Flash structures. NoFTL Regions can be coupled to logical database structures such as tablespaces, thus eliminating the administration overhead on native Flash. A careful region definition and a multi-region placement of DB objects according to their characteristics result in improved performance due to better use of Flash parallelism and less garbage collection overhead. This yields up to 60% higher transaction throughput compared to base NoFTL and up to 2x fewer erase operations, improving Flash longevity.

We also introduced In-Page Appends (IPA) for write-intensive workloads. IPA transforms small in-place updates into small update deltas that are appended to the original page. IPA exploits the commonly ignored fact that modern Flash memories (SLC, MLC, 3D NAND) can handle appends to already programmed physical pages by using low-level techniques such as ISPP, thereby avoiding expensive erases and page migrations. We extended the traditional NSM page layout with a delta-record area that can absorb those small updates. IPA has been implemented in Shore-MT and evaluated on real Flash hardware (OpenSSD). Under TPC-C, TPC-B, and LinkBench, IPA resulted in up to 45% higher transactional throughput, up to 74% fewer erase operations, up to 60% lower read/write I/O latencies, and a 2x to 3x reduction in overall write amplification compared to baseline NoFTL.

Results were implemented on Open Source systems and can be transferred to industry.
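
The In-Page Append idea summarized above can be illustrated with a minimal Python sketch. It is not the Shore-MT implementation: the class name `DeltaPage`, the 4-byte per-delta header, and the policy of merging and rewriting the whole page once the delta area is full are all assumptions made for illustration. Flash-level details such as ISPP and physical page programming are not modeled; the sketch only shows how small updates can be absorbed as appended deltas instead of triggering a full page write.

```python
class DeltaPage:
    """Hypothetical NSM-style page with a reserved delta-record area."""

    DELTA_HEADER = 4  # assumed bytes of per-delta metadata (offset + length)

    def __init__(self, body: bytes, delta_capacity: int):
        self.body = bytearray(body)           # record area as last fully written
        self.delta_capacity = delta_capacity  # bytes reserved for delta records
        self.deltas = []                      # (offset, data) appended since last full write
        self.delta_used = 0                   # bytes of the delta area already consumed
        self.appends = 0                      # updates absorbed as in-page appends
        self.full_writes = 1                  # full page writes, including the initial one

    def read(self) -> bytearray:
        """Return the current page image: base body with deltas applied in order."""
        image = bytearray(self.body)
        for offset, data in self.deltas:
            image[offset:offset + len(data)] = data
        return image

    def update(self, offset: int, data: bytes) -> str:
        """Apply a small update; append a delta if it fits, else rewrite the page."""
        if offset < 0 or offset + len(data) > len(self.body):
            raise ValueError("update must stay within the page body")
        need = self.DELTA_HEADER + len(data)
        if self.delta_used + need <= self.delta_capacity:
            # Small update fits: append a delta record, leave the body untouched.
            self.deltas.append((offset, bytes(data)))
            self.delta_used += need
            self.appends += 1
            return "append"
        # Delta area exhausted: merge all deltas and write a fresh page
        # (on Flash this would be an out-of-place write).
        merged = self.read()
        merged[offset:offset + len(data)] = data
        self.body = merged
        self.deltas = []
        self.delta_used = 0
        self.full_writes += 1
        return "rewrite"
```

In this sketch the ratio of `appends` to `full_writes` gives a rough feel for how many full page writes a delta area of a given size can save under a stream of small updates; the actual savings reported in the abstract depend on the workload and hardware.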

Publications

  • Building Large Storage Based On Flash Disks. ADMS@VLDB 2010: 34-42
    I. Petrov, G. Almeida, A. Buchmann, U. Gräf
  • Page Size Selection for OLTP Databases on SSD Storage. SBBD 2010: 2:1-2:8
    I. Petrov, T. Ivanov, A. Buchmann
  • On the Performance of Database Query Processing Algorithms on Flash Solid State Disks. DEXA Workshops 2011: 139-144
    D. Bausch, I. Petrov, A. Buchmann
  • Page Size Selection for OLTP Databases on SSD RAID Storage. Journal of Information and Data Management, Vol. 2, No. 1, 2011
    I. Petrov, R. Gottstein, T. Ivanov, D. Bausch, A. Buchmann
  • SI-CV: Snapshot Isolation with Co-located Versions. TPCTC 2011: 123-136
    R. Gottstein, I. Petrov, A. Buchmann
  • A hybrid page layout integrating PAX and NSM. IDEAS 2012: 86-95
    G. Graefe, I. Petrov, T. Ivanov, V. Marinov
  • Data-Intensive Systems on Evolving Memory Hierarchies. In Proc. EEbS 2012
    I. Petrov, D. Bausch, R. Gottstein, A. Buchmann
  • Elasticity in cloud databases and their query processing. In JIDWM 2012
    G. Graefe, A. Nica, K. Stolze, T. Neumann, T. Eavis, I. Petrov, E. Pourabbas, D. Fekete
  • Making cost-based query optimization asymmetry-aware. DaMoN 2012: 24-32
    D. Bausch, I. Petrov, A. Buchmann
  • Read optimisations for append storage on flash. In Proceedings of the 17th International Database Engineering & Applications Symposium (IDEAS '13), 2013
    R. Gottstein, I. Petrov, A. Buchmann
  • Append Storage in Multi-Version Databases on Flash. BNCOD 2013: 62-76
    R. Gottstein, I. Petrov, A. Buchmann
  • FBARC: I/O Asymmetry Aware Buffer Replacement Strategy. ADMS@VLDB 2013: 58-69
    P. Dubs, I. Petrov, R. Gottstein, A. Buchmann
  • Multi-Version Databases on Flash: Append Storage and Access Paths. International Journal On Advances in Software, Vol. 6, No. 3 and 4, 2013
    R. Gottstein, I. Petrov, A. Buchmann
  • NoFTL: Database Systems on FTL-less Flash Storage. PVLDB 6(12): 1278-1281 (2013)
    S. Hardock, I. Petrov, R. Gottstein, A. Buchmann
  • MV-IDX: indexing in multiversion databases. IDEAS 2014: 142-148
    R. Gottstein, R. Goyal, S. Hardock, I. Petrov, A. Buchmann
    (See online at https://doi.org/10.1145/2628194.2628911)
  • SIAS-V in Action: Snapshot Isolation Append Storage - Vectors on Flash. EDBT 2014: 624-627
    R. Gottstein, T. Peter, I. Petrov, A. Buchmann
  • DBMS on modern storage hardware. ICDE 2015: 1545-1548
    I. Petrov, R. Gottstein, S. Hardock
    (See online at https://doi.org/10.1109/ICDE.2015.7113423)
  • NoFTL for Real: Databases on Real Native Flash Storage. EDBT 2015: 517-520
    S. Hardock, I. Petrov, R. Gottstein, A. Buchmann
 
 
