State-of-the-art solutions
for high-reliability systems make use of one or more
of the following approaches.

At software level
It is based on software redundancy, for instance N-version
programming and recovery blocks.
Many drawbacks - performance degradation, software overhead,
higher detection latency, strong application dependency and
a higher effort to achieve IEC 61508 compliance for each
application.
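
The two software techniques named above can be sketched as follows. This is a minimal illustration under stated assumptions, not a qualified implementation: the function names (n_version_vote, recovery_block) and the toy "square" versions are hypothetical, and the alternates are assumed to be pure functions, so no checkpoint has to be restored between attempts.

```python
import math
from collections import Counter


def n_version_vote(versions, x):
    """Run every independently written version and return the majority result."""
    results = [version(x) for version in versions]
    value, count = Counter(results).most_common(1)[0]
    if count <= len(versions) // 2:
        raise RuntimeError("no majority among versions")
    return value


def recovery_block(alternates, acceptance_test, x):
    """Try the alternates in order; return the first result that is accepted."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue  # a crashing alternate counts as a failed attempt
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


# Hypothetical versions of integer squaring; the last one has a seeded bug.
def square_a(x):
    return x * x


def square_b(x):
    return x ** 2


def square_faulty(x):
    return x * x + 1


def is_square_of(x, result):
    """Acceptance test (assumed): the result must be the exact square of x."""
    return result >= 0 and math.isqrt(result) ** 2 == result and math.isqrt(result) == abs(x)


voted = n_version_vote([square_a, square_b, square_faulty], 7)
recovered = recovery_block([square_faulty, square_a], is_square_of, 7)
```

Both functions run extra code on every call (all versions, or an acceptance test plus possible retries), which is where the performance overhead and the added detection latency listed above come from.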

At system level
It is based on MCU redundancy. In this case a certain
number of MCUs, typically two or three depending on whether
fault tolerance is required, are used in the same system,
with comparators or with mutual checks.
Many drawbacks - high cost at system level due to hardware
overhead, packaging and PCB, and system dependency.
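
The difference between two and three MCUs can be illustrated with a small sketch. The names duplex_compare and triplex_vote are hypothetical, and each MCU output is assumed to be a plain comparable value. Two channels with a comparator can only detect a disagreement, while three channels with a voter can also mask a single faulty channel.

```python
def duplex_compare(out_a, out_b):
    """Two channels: pass the value on agreement, force a safe state otherwise."""
    if out_a == out_b:
        return ("ok", out_a)
    return ("safe_state", None)


def triplex_vote(out_a, out_b, out_c):
    """Three channels: any two agreeing outputs win; no agreement means safe state."""
    if out_a == out_b or out_a == out_c:
        return ("ok", out_a)
    if out_b == out_c:
        return ("ok", out_b)
    return ("safe_state", None)


# One faulty channel (value 6 instead of 5).
duplex_result = duplex_compare(5, 6)      # fault detected, system stops
triplex_result = triplex_vote(5, 6, 5)    # fault masked, system continues
```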

At MCU level
It is based on CPU redundancy. It can be either symmetric,
with comparators or with mutual checks, or asymmetric,
where a smaller CPU or watchdogs supervise the main CPU.
Many drawbacks - symmetric solutions (such
as lock-step or dual-core architectures) lack the diversity
required by IEC 61508, and the overheads (gate count,
performance and power) rapidly grow beyond practicality
when these concepts are applied to high-performance
cores. Asymmetric solutions are mainly based on watchdogs,
which suffer from low diagnostic coverage and thus require
a complex software infrastructure to overcome this limitation.
Therefore, they are mainly used for low-SIL systems.
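
The low diagnostic coverage of a watchdog can be seen in a simple simulation. This is only a sketch: the Watchdog class and the run helper are hypothetical, and time is modelled as discrete ticks. The watchdog observes whether it is serviced in time, not what the CPU computes.

```python
class Watchdog:
    """Simulated timeout watchdog: expires if not kicked within `timeout` ticks."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.counter = 0
        self.expired = False

    def kick(self):
        self.counter = 0

    def tick(self):
        self.counter += 1
        if self.counter > self.timeout:
            self.expired = True


def run(kick_pattern, timeout):
    """kick_pattern: one boolean per tick, True if the CPU services the watchdog."""
    watchdog = Watchdog(timeout)
    for kicked in kick_pattern:
        if kicked:
            watchdog.kick()
        watchdog.tick()
    return watchdog.expired


# A hung CPU stops kicking and is detected.
hung_detected = run([True, True, False, False, False, False], timeout=3)

# A CPU that produces wrong data but still kicks on time is not detected.
corrupted_detected = run([True] * 6, timeout=3)
```

Catching the second case requires additional software checks on top of the watchdog, which is the complex software infrastructure mentioned above.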

At gate level
Logic redundancy can be used, for instance with concurrent
checkers in the ALU or by modifying the pipeline with
error-correcting codes (ECC).
Many drawbacks - specific
CPU redesign, performance overhead (timing), diagnostics
mixed with the safety function (not recommended by IEC 61508).
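
One possible form of a concurrent checker in an ALU is a residue check on the adder, sketched below in software. The choice of a mod-3 residue code is an assumption made for illustration, since the text does not name a specific checker, and residue_checked_add and fault_mask are hypothetical names. The sum is kept at full width (carry-out included), so no wrap-around has to be compensated.

```python
def residue_checked_add(a, b, fault_mask=0):
    """Add a and b, then check the sum against an independent mod-3 prediction.

    fault_mask injects bit flips into the adder output to emulate a fault.
    Returns (sum, check_passed).
    """
    total = (a + b) ^ fault_mask        # main adder, possibly faulty
    expected = (a % 3 + b % 3) % 3      # small independent residue adder
    return total, (total % 3 == expected)


fault_free = residue_checked_add(100, 23)
single_flip = residue_checked_add(100, 23, fault_mask=1 << 4)
```

A single bit flip changes the sum by a power of two, which is never a multiple of 3, so every single-bit error in the result is flagged. The checker sits in the same datapath as the function it checks, which illustrates the mixing of diagnostics and safety function listed above.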

At transistor level
A particular process or particular layout techniques can be used
to harden the technology against errors, for instance
by designing SRAMs with a DRAM architecture to make them
less prone to soft errors.
Many drawbacks - specific to certain types
of faults, very high cost and overheads.