Project Details

Compiler Optimizations for RTM-based computing systems

Subject Area: Software Engineering and Programming Languages; Computer Architecture, Embedded and Massively Parallel Systems
Term: since 2021
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 450944241
 
Computing systems have been evolving rapidly since the end of Dennard scaling and in the face of the current limitations of CMOS technologies. In addition to new computing paradigms, several new memory technologies are being proposed to replace or augment traditional random access memories (RAM). Among them, racetrack memories (RTMs) are a promising non-volatile memory technology that offers the density of hard-disk drives with a latency somewhere between that of static RAM (SRAM) and dynamic RAM (DRAM). A fundamental difference of RTMs is that they store multiple bits sequentially per access transistor, as opposed to one bit in SRAM and DRAM. This makes the latency and energy needed to access data dependent on where the bits are located in the sequential bit stream, creating a new kind of spatial locality in which the distance between memory offsets must be minimized to improve performance and save energy. While compilers have targeted temporal and spatial locality in the classical sense, there are no established theory or algorithms to handle the sequential nature of RTMs. This project proposes novel compiler analyses and optimizations for RTM-based computing systems, focusing on the concrete case of nested loop programs from the domains of linear algebra, machine learning and physics simulations. We propose extensions to polyhedral compilers to analyze profitable memory access patterns and transform the program by changing the data layout and the operation schedule. The main goal of these transformations is to produce a semantics-preserving memory access trace in which the distances between consecutive accesses are minimized. We then leverage the higher-level semantics in domain-specific languages (DSLs) for tensor expressions, which map naturally to nested loop programs. DSLs offer more degrees of freedom for optimization, since the data layout can be chosen more freely and known algebraic properties of operators enable coarser-grained transformations.
Optimizations in this project will target not only performance and energy consumption, but also the trade-off between these standard metrics and the capacity offered by RTMs. We expect this project to lay the groundwork for future compilers for RTM-based systems and to provide valuable system-level feedback to computer architects and perhaps materials scientists.
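The position-dependent access cost described above can be illustrated with a small sketch. It is not taken from the project; it assumes a deliberately simplified model of a single track with one access port, where the cost of an access is the number of positions the track must be shifted from the previously accessed offset. The function name `shift_cost`, the variable names and the example trace are all hypothetical.

```python
def shift_cost(trace, layout, start=0):
    """Total shift distance for `trace` on one track with one access port.

    Assumed model (not from the project description): each variable
    occupies one offset on the track, and moving the port from offset i
    to offset j costs |i - j| shifts.
    """
    # Map each variable to its offset in the given data layout.
    position = {var: offset for offset, var in enumerate(layout)}
    port = start
    total = 0
    for var in trace:
        target = position[var]
        total += abs(target - port)
        port = target
    return total


# Hypothetical access trace: a and c are used together, then b and d.
trace = ["a", "c", "a", "c", "b", "d", "b", "d"]

declaration_order = ["a", "b", "c", "d"]  # layout in declaration order
access_aware = ["a", "c", "b", "d"]       # co-accessed variables adjacent

print(shift_cost(trace, declaration_order))  # 13
print(shift_cost(trace, access_aware))       # 7
```

Under this toy model, the same trace needs 13 shifts with the declaration-order layout and 7 with the layout that places co-accessed variables next to each other. Choosing the data layout, and reordering the trace where the program semantics allow it, are the two degrees of freedom the abstract refers to.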
DFG Programme: Research Grants
 
 
