Project Details
LeanMICS: Learning-Based Cross-Layer Reliability Management in Embedded Mixed-Criticality Systems
Subject Area
Computer Architecture, Embedded and Massively Parallel Systems
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
since 2024
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 534919862
Embedded systems in many industrial application domains, such as automotive and avionics, are evolving into Mixed-Criticality (MC) systems, in which applications with different assurance levels execute on a common platform to meet cost, space, timing, and power consumption requirements while guaranteeing safe operation. Technology scaling in these modern embedded platforms raises the rate of manufacturing defects and physical faults, so safety and reliability issues have grown considerably across electronic systems, from unreliable hardware to unreliable execution of MC applications. Designing a reliable system requires applying fault mitigation and reliability methods at multiple system abstraction layers. However, mitigating faults in each layer in isolation incurs high costs in power, area, and timing. Cross-layer solutions therefore distribute the fault mitigation activity across the layers to provide application-specific, low-cost fault tolerance. However, they can cause an explosion in design complexity, because reliability methods must be selected and configured efficiently for each layer. This project (LeanMICS) investigates the feasibility of developing a hybrid machine learning (ML)-based cross-layer reliability design for embedded MC systems that estimates and improves reliability, quality of service (QoS), and power consumption at design time and run time. ML techniques would be employed to improve dynamic reliability by adapting to varying workloads and system conditions and determining the ideal system configuration under dynamic and environmental changes. Reliability will first be modeled for the individual layers and then across layers, and analyzed to determine which techniques are suitable for each layer. Then, an ML-based design space exploration (DSE) targeting cross-layer reliability optimization will be proposed for MC systems at design time.
A dynamic ML-based cross-layer reliability management framework will then be presented and investigated to adapt to different system configurations and manage resources dynamically. The overhead of these techniques will also be investigated and accounted for in the analysis.
DFG Programme
Research Grants