Project Details
Flexible Error Handling for Embedded Real-Time Systems
Applicant
Professor Dr. Peter Marwedel
Subject Area
Computer Architecture, Embedded and Massively Parallel Systems
Term
from 2010 to 2016
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 182064811
Recent developments in semiconductors will lead to increasingly unreliable hardware. In the first two project phases, we have shown that software-based error handling is an effective and efficient approach to build reliable embedded systems. The basic idea is to classify errors with respect to their worst-case impact on the execution using application knowledge. Accordingly, in situations with tight resource conditions, only the most relevant errors have to be corrected. Using reliability annotations, we have shown that by only correcting critical errors, crashes can be avoided. This approach has been extended in the second phase. Extensions comprise fault models for an H.264 decoder and improved methods for fault injection, in particular for ARM-based systems, which are in widespread use in embedded systems. Fault resilience of programs against hardware faults was studied in more detail and the exploitation of source code annotations was extended. As a result, only 33% to 50% of the data memory used in an H.264 video application needs to be reliable. Information about reliability requirements of variables is now being made accessible via a run-time library. At run-time, our subscriber model allows us to identify processes using a particular area of memory. Additional analyses concerned our Reliable Computing Base model.
A number of constraints regarding fault-tolerance of mono-processors can be surmounted with multi-processors. Our target is to exploit the huge potential of multi-processors to a larger extent in the third project phase. Finding a good balance between different metrics such as reliability, timeliness, energy efficiency, code size, thermal behavior, quality of service etc. is a major challenge. There is a large design space, even though we are focusing on off-the-shelf hardware. We aim at obtaining efficient implementations by considering trade-offs between the different metrics.
We will do this by considering cross-layer information (including hardware as well as software layers) and by exploiting trade-offs for various forms of partial or complete multiple execution and various ways of using the memory hierarchy. Due to system dynamics, many of the decisions must be taken at run-time. Towards the end of the project, optimization methods and their trade-offs will be evaluated in a quantitative fashion.
Overall, we expect that our focus on software-based fault tolerance techniques for multi-processor systems will provide a major push for practically useful fault tolerance techniques for reliability- and real-time-critical embedded systems. Cross-layer considerations and a comprehensive scope of optimizations are expected to result in efficient designs.
DFG Programme
Priority Programmes
International Connection
United Kingdom
Participating Person
Professor Dr. Michael Engel