
dosek: A Dependability Oriented Static Embedded Kernel


dosek - Fault Avoidance Strategies

Design Decisions

The susceptibility of an operating system to errors and silent data corruptions (SDCs) is rooted to a high degree in its basic design and implementation concepts. For instance, we showed in previous work [1] that, even without any dependability-oriented measures, a static OSEK-like RTOS (all resources are allocated at compile time) exhibits five times fewer SDCs than a more dynamic POSIX-like RTOS (all resources are allocated at run time). This inherent robustness of a static system design is the foundation of our dependability-oriented kernel design.
Design Rule 1
Use a static (OSEK-like) operating system design.
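
As a minimal sketch of what "all resources are allocated at compile time" can look like, the following C++ fragment keeps all task control blocks in one fixed-size table. The names `Task`, `g_tasks`, `kTaskCount`, `activate_task`, and `task_is_ready` are hypothetical and chosen for illustration only; they are not taken from the dosek sources.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical task control block; the fields are illustrative only.
struct Task {
    std::uint8_t priority;
    bool ready;
};

// Static (OSEK-like) style: the number of tasks is a compile-time constant
// and all control blocks live in one statically allocated table.
constexpr std::size_t kTaskCount = 3;

std::array<Task, kTaskCount> g_tasks = {{
    {1, false},  // task 0
    {2, false},  // task 1
    {3, false},  // task 2
}};

// Activating a task only touches a fixed slot. There is no allocator,
// no free list, and no heap metadata that a bit-flip could corrupt.
inline bool activate_task(std::size_t id) {
    if (id >= kTaskCount) {
        return false;  // local sanity check: the valid range is known statically
    }
    g_tasks[id].ready = true;
    return true;
}

// Read-only query; the assertion documents the statically known bound.
inline bool task_is_ready(std::size_t id) {
    assert(id < kTaskCount);
    return g_tasks[id].ready;
}
```

In a dynamic design, the same table would typically be built at run time from heap-allocated objects, which adds allocator state and pointers that stay alive across system calls.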

Essentially, a transient fault can lead to an error inside the kernel only if it affects either the kernel's control flow or its data flow. To do so, it has to hit a memory cell or register that carries currently alive kernel state, such as a global variable (always alive), a return address on the stack (alive during the execution of a system call), or a bit in the status register of the CPU (alive only immediately before a conditional instruction). Intuitively, the more long-lived state a kernel maintains, the more prone it is to transient faults.
Design Rule 2
Minimize the time spent in system calls and the amount of volatile state, especially of global state that is alive across system calls.

However, no kernel can provide useful services without any run-time state. So, the second point to consider is the containment and, thus, the detectability of data and control-flow errors by local sanity checks. Intuitively, bit-flips in pointer variables have a much larger error range than bit-flips in variables used in arithmetic operations; hence, they are more likely to lead to SDCs. In a nutshell, any kind of indirection at run time (through data or function pointers, index registers, return addresses, and so on) impairs the inherent robustness of the resulting system.
Design Rule 3
Avoid indirections in the code and data flow.
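
The difference in error range can be illustrated with a small sketch. It contrasts a dispatch through a function pointer with a dispatch over a small enumeration that a local sanity check can validate. All names (`task_a`, `task_b`, `TaskId`, `dispatch_indirect`, `dispatch_direct`, `flip_bit`) are hypothetical, and `flip_bit` merely simulates a transient fault in software.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical task bodies, used only for illustration.
inline int task_a() { return 10; }
inline int task_b() { return 20; }

// Indirect variant: the call target is stored in writable memory.
// A single bit-flip in `handler` may redirect control flow to an
// arbitrary address, and no local check can tell a valid target apart.
using Handler = int (*)();
Handler handler = task_a;
inline int dispatch_indirect() { return handler(); }

// Direct variant: the state is a small enumeration with two valid encodings.
enum class TaskId : std::uint8_t { A = 0, B = 1 };

inline int dispatch_direct(TaskId id) {
    switch (id) {
        case TaskId::A: return task_a();
        case TaskId::B: return task_b();
    }
    return -1;  // any other bit pattern is caught by this local check
}

// Simulates a transient fault by flipping one bit (0..7) of the stored id.
inline TaskId flip_bit(TaskId id, unsigned bit) {
    return static_cast<TaskId>(static_cast<std::uint8_t>(id) ^ (1u << bit));
}
```

Note that the check in `dispatch_direct` only detects flips that leave the set of valid encodings. A flip of bit 0 turns `TaskId::A` into `TaskId::B`, which is valid but wrong, so further measures would be needed to cover that case.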

Based on a detailed static analysis of the global control-flow graph and an enumeration of all predicted system states, a high amount of potentially error-prone redundancy can be eliminated [2]. With fine-grained interaction knowledge at hand, we can tailor the system calls more specifically to the application behavior in order to speed up the kernel execution paths. Instead of calling the generic system service, we insert a specialized service at the call site. This decoupling enables us to use the interaction knowledge for selecting the minimum necessary functionality at that point.
Design Rule 4
Exploit static system knowledge to minimize error-prone, redundant control- and data flows.
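
To make the idea of a specialized service at the call site more concrete, here is a hedged sketch. It assumes a system with three tasks in which a higher index means a higher priority, and it assumes that static analysis has established that one particular call site always activates the highest-priority task. The names `kNumTasks`, `g_ready`, `activate_generic`, and `activate_specialized_preempting` are hypothetical and do not reflect the actual dosek generator output.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kNumTasks = 3;

// Ready flags; in this sketch a higher index means a higher priority.
std::array<bool, kNumTasks> g_ready = {true, false, false};

// Generic service: it must work for every caller and every target, so it
// validates its argument and performs a full reschedule over all tasks.
// Returns the id of the task that runs next, or -1 on an invalid argument.
inline int activate_generic(std::size_t target) {
    if (target >= kNumTasks) {
        return -1;
    }
    g_ready[target] = true;
    for (std::size_t i = kNumTasks; i-- > 0;) {  // scan from highest priority
        if (g_ready[i]) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Specialized service for one call site. Assumed analysis result: the
// target is the highest-priority task and therefore always preempts.
// The argument check moves to compile time and the scan disappears.
template <std::size_t Target>
inline int activate_specialized_preempting() {
    static_assert(Target == kNumTasks - 1,
                  "sketch is only valid for the highest-priority task");
    g_ready[Target] = true;
    return static_cast<int>(Target);
}
```

The specialized variant reads no run-time state at all, so there is less alive state for a transient fault to hit during the call.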

Figure: Attack surfaces of different scheduling algorithms. Left: based on a sorted linked list. Right: direct object access with an unrolled scheduling operation.
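
The two scheduler variants compared in the figure can be approximated in code as follows. This is only an illustrative sketch with hypothetical names (`Node`, `enqueue_sorted`, `pick_from_list`, `ready_t0` to `ready_t2`, `pick_unrolled`) and a fixed set of three tasks; it is not the dosek implementation.

```cpp
#include <cassert>

// Left variant: ready queue as a priority-sorted singly linked list.
// Every `next` pointer and the list head are alive kernel state, and a
// bit-flip in any of them can send the traversal to arbitrary memory.
struct Node {
    int task_id;
    int priority;
    Node* next;
};

// Inserts `n` so that the list stays sorted by descending priority.
inline void enqueue_sorted(Node*& head, Node* n) {
    Node** link = &head;
    while (*link != nullptr && (*link)->priority >= n->priority) {
        link = &(*link)->next;
    }
    n->next = *link;
    *link = n;
}

// The highest-priority ready task is at the head; -1 means idle.
inline int pick_from_list(const Node* head) {
    return head != nullptr ? head->task_id : -1;
}

// Right variant: one ready flag per task, accessed directly by name.
// The scheduling decision is unrolled into a fixed comparison cascade,
// so it needs no pointers, indices, or loops.
bool ready_t0 = false;  // lowest priority
bool ready_t1 = false;
bool ready_t2 = false;  // highest priority

inline int pick_unrolled() {
    if (ready_t2) return 2;
    if (ready_t1) return 1;
    if (ready_t0) return 0;
    return -1;  // idle
}
```

The unrolled variant is only possible because the task set and the priorities are fixed at compile time, which ties it back to Design Rule 1.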

References

[1]

Hoffmann, Martin ; Borchert, Christoph ; Dietrich, Christian ; Schirmeier, Horst ; Kapitza, Rüdiger ; Spinczyk, Olaf ; Lohmann, Daniel:
Effectiveness of Fault Detection Mechanisms in Static and Dynamic Operating System Designs.
In: IEEE Computer Society (Ed.) : Proceedings of the 17th IEEE International Symposium on Object/Component/Service-oriented Real-time Distributed Computing (ISORC '14)
(IEEE International Symposium on Object/Component/Service-oriented Real-time Distributed Computing, Reno, NV, USA, June 2014).
2014, pp. 230-237.
Keywords: DanceOS, dosek, osek, dependability, static system
DOI: 10.1109/ISORC.2014.26

[2]

Dietrich, Christian ; Hoffmann, Martin ; Lohmann, Daniel:
Cross-Kernel Control-Flow-Graph Analysis for Event-Driven Real-Time Systems.
In: ACM (Ed.) : Proceedings of the 16th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems
(The 16th Conference on Languages, Compilers and Tools for Embedded Systems (LCTES 2015), Portland, Oregon, USA, June 2015).
New York, NY, USA : ACM Press, 2015, pp. 1-10.
Keywords: Static Analysis; Control-Flow Graph; Cross-Kernel Analysis; Real-Time Systems; Optimization; Compiler
DOI: 10.1145/2670529.2754963