Project Details
Simulator framework for runtime and energy prediction of massively parallel message-passing programs
Applicant
Professor Dr. Gerhard Wellein
Subject Area
Computer Architecture, Embedded and Massively Parallel Systems
Hardware Systems and Architectures for Information Technology and Artificial Intelligence, Quantum Engineering Systems
Term
since 2026
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 545776403
This subproject of the Mod4Comp research unit develops a cross-architecture simulation framework for massively parallel applications with millions of threads that takes the performance and energy properties of the hardware-software interaction into account. The simulator will reproduce the dynamics of parallel programs on current supercomputers and allow the exploration of hypothetical parallel programs on future high-performance systems. Simulations will run in a well-controlled environment without requiring massive resources for computation and data transfer. They will be faster than traditional approaches to MPI simulation because no code is executed on real target systems. The simulator framework will support different approaches to generating skeletons of applications developed by application-centric subprojects. We will develop a domain-specific embedded language (DSEL) for constructing application skeletons, because traces often lack reliable inter-process dependency information and are overlaid by many effects of the real system, such as system noise and variations in MPI implementations. We will also devise a compact and intuitive annotation language to facilitate the semi-automatic production of application blueprints via static analysis and to reduce manual refinement. Finally, traces taken from real application runs on the target hardware will be supported for blueprint generation. The project aims for the first analytic-model-based simulator that can handle application codes comprising both compute- and memory-bound numerical kernel functions.
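To make the idea of an application skeleton driven by analytic models more concrete, the following is a minimal illustrative sketch in Python. It is not the project's DSEL or simulator: all names (Machine, Compute, Exchange, simulate) are hypothetical, the kernel cost uses a simple Roofline-style bound (the slower of the compute and memory limits), the message cost uses a plain latency-plus-bandwidth model, and communication is assumed to be a ring exchange with both neighbors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical per-process hardware parameters (assumed, not measured)."""
    peak_flops: float     # flop/s
    mem_bandwidth: float  # byte/s
    net_latency: float    # s per message
    net_bandwidth: float  # byte/s


@dataclass(frozen=True)
class Compute:
    """Skeleton step: a numerical kernel described only by its resource demands."""
    flops: float
    bytes_moved: float


@dataclass(frozen=True)
class Exchange:
    """Skeleton step: each rank exchanges one message with both ring neighbors."""
    message_bytes: float


def kernel_time(step: Compute, m: Machine) -> float:
    # Roofline-style estimate: the kernel is bound by the slower resource.
    return max(step.flops / m.peak_flops, step.bytes_moved / m.mem_bandwidth)


def message_time(step: Exchange, m: Machine) -> float:
    # Simple latency-plus-bandwidth estimate for one message.
    return m.net_latency + step.message_bytes / m.net_bandwidth


def simulate(skeleton, m: Machine, n_procs: int, delays=None):
    """Return the predicted finish time of every rank.

    `delays` maps (step_index, rank) to extra seconds added to a Compute
    step, which lets one inject a disturbance and watch it propagate.
    No application code is executed; only the clocks are advanced.
    """
    delays = delays or {}
    clock = [0.0] * n_procs
    for i, step in enumerate(skeleton):
        if isinstance(step, Compute):
            t = kernel_time(step, m)
            clock = [c + t + delays.get((i, r), 0.0) for r, c in enumerate(clock)]
        elif isinstance(step, Exchange):
            t = message_time(step, m)
            # A rank can only finish the exchange once both neighbors arrived.
            clock = [
                max(clock[r], clock[(r - 1) % n_procs], clock[(r + 1) % n_procs]) + t
                for r in range(n_procs)
            ]
        else:
            raise TypeError(f"unknown skeleton step: {step!r}")
    return clock


machine = Machine(peak_flops=1e9, mem_bandwidth=1e9,
                  net_latency=1e-6, net_bandwidth=1e9)
skeleton = [Compute(flops=1e6, bytes_moved=4e6), Exchange(message_bytes=1e3)]
finish = simulate(skeleton, machine, n_procs=4, delays={(0, 0): 1e-3})
```

In this toy run the kernel is memory-bound under the assumed parameters, and the 1 ms delay injected on rank 0 reaches its ring neighbors (ranks 1 and 3) through the exchange, while rank 2 is unaffected. Because the inter-process dependencies are stated explicitly in the skeleton, they do not have to be reconstructed from a noisy trace.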
The main novelty of this simulator is its full-scale scope: it will essentially be an automated version of the analytical ab-initio white- or gray-box multilayer modeling approaches developed in other subprojects, covering the full hierarchy of parallel systems, including cores, chips, nodes, networks, and clusters, their individual inherent bottlenecks, and the interactions among them. This holistic approach will ultimately enable model-based design-space exploration of the interplay among the system’s components and of the performance and energy properties of complex parallel systems. The validation of the multilayer models and the architectural and application exploration in the simulator will be performed in close collaboration with the application- and modeling-centric subprojects. The interfaces for cross-model dependencies will be coordinated by the research unit’s commissioner of the coordination project. Validation will be performed against benchmarks and application codes running on actual heterogeneous HPC architectures (CPUs, GPGPUs, FPGAs) and neuromorphic hardware platforms, by measuring time, data traffic volume, energy, and other basic and derived metrics with performance tools such as LIKWID and lo2s.
DFG Programme
Research Units
