
Making an embedded system safe and secure

Chris Turner

6/8/2009 7:56 AM EDT

Designing safe and secure embedded systems for medical, aerospace and other critical applications requires attention to a portfolio of techniques and makes demands on processor architecture.

Many products with embedded electronic systems demand high reliability and high security, such that they can be trusted to operate in safety-critical applications. Safe and secure system engineering requires designers to consider all the consequences of errors and intrusions in their systems in addition to normal modes of operation.

Reliability and security requirements are increasing in many of today’s application sectors such as medical devices, wireless communications, automotive sensors and controls, aerospace and some defence and security systems. However, each of these sectors has differing regulatory practices and will accept different design techniques and validation methodologies.

One set of design techniques deals with fault tolerance, where a system must continue to operate, or at least be guaranteed to fail safely, in the presence of either a transient or a permanent fault. The solutions here are scalable: at one extreme, complete dual or even triple redundancy can be achieved by replicating the entire system together with its power supplies.

More often, the system is allowed to cease operation in the event of a major outage so long as it is guaranteed to fail in a safe way. Consider, for example, a drug delivery system whose software crashes: if it stops working, it can simply trigger a battery-powered alarm to alert the user. Under no circumstances should it continue dispensing, which could lead to an overdose.
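The fail-safe behaviour described above can be sketched as a small state machine in C. This is only an illustration under stated assumptions: the names `pump_t`, `pump_init` and `pump_step` are hypothetical, and the `fault_detected` flag stands in for whatever self-checks a real device would run. The point it shows is that the failed state is latched, so no later input can restart dispensing.

```c
#include <stdbool.h>

/* Hypothetical pump states; names are illustrative, not from any real device. */
typedef enum { PUMP_IDLE, PUMP_DISPENSING, PUMP_FAILED_SAFE } pump_state_t;

typedef struct {
    pump_state_t state;
    bool motor_on;  /* drives the dispensing mechanism */
    bool alarm_on;  /* assumed to be a battery-powered alarm */
} pump_t;

void pump_init(pump_t *p)
{
    p->state = PUMP_IDLE;
    p->motor_on = false;
    p->alarm_on = false;
}

/* One control step. fault_detected represents any failed self-check. */
void pump_step(pump_t *p, bool dose_requested, bool fault_detected)
{
    if (fault_detected || p->state == PUMP_FAILED_SAFE) {
        /* Latch the safe state: stop dispensing and raise the alarm.
         * There is deliberately no path back to PUMP_DISPENSING. */
        p->state = PUMP_FAILED_SAFE;
        p->motor_on = false;
        p->alarm_on = true;
        return;
    }
    p->state = dose_requested ? PUMP_DISPENSING : PUMP_IDLE;
    p->motor_on = dose_requested;
}
```

In this sketch, leaving the failed state would require a fresh call to `pump_init`, which a real product would presumably tie to a deliberate service action rather than to normal operation.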

Many products cannot justify the increased cost and complexity of an additional redundant system. An embedded processor that runs application software must therefore apply more affordable techniques. First, it is important that the system continues to operate correctly in the presence of glitches, or so-called single event upsets. These may be caused by electrical noise, electrostatic discharge or the ionising effect of radiation passing through the transistors of a logic register or memory bit. Whatever the cause, hardware designers can use various techniques to detect the error and then either correct it or shut down the system safely. Such techniques include:

* Majority voting logic applied to the outputs of redundant registers that are inserted throughout the logic design

* Running two processors in ‘lock step’ and continuously checking they produce the same results

* Applying redundant error-correcting codes to memories, such as a Hamming code, or at least error detection with a simple parity bit

* Adding ‘watchdog’ circuits that continuously check and monitor the system.
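The first technique in the list, majority voting over redundant registers, is normally built in logic gates, but the idea can be modelled in a few lines of C. The sketch below assumes a value stored as three copies; the names `tmr_reg_t`, `tmr_write` and `tmr_read` are hypothetical. It shows a bitwise two-out-of-three vote and a "scrub" that rewrites the copies when one of them disagrees.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bitwise 2-of-3 majority: each result bit takes the value held by
 * at least two of the three inputs. */
static inline uint32_t vote3(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/* A value held as three redundant copies (hypothetical type). */
typedef struct { uint32_t copy[3]; } tmr_reg_t;

void tmr_write(tmr_reg_t *r, uint32_t value)
{
    r->copy[0] = value;
    r->copy[1] = value;
    r->copy[2] = value;
}

/* Return the voted value. If any copy disagreed with the vote, rewrite
 * all three copies and report the correction through *corrected
 * (which may be NULL if the caller does not need it). */
uint32_t tmr_read(tmr_reg_t *r, bool *corrected)
{
    uint32_t voted = vote3(r->copy[0], r->copy[1], r->copy[2]);
    bool mismatch = r->copy[0] != voted ||
                    r->copy[1] != voted ||
                    r->copy[2] != voted;
    if (mismatch) {
        tmr_write(r, voted);
    }
    if (corrected) {
        *corrected = mismatch;
    }
    return voted;
}
```

The vote masks an upset in any one copy. If the same bit is upset in two copies before a scrub takes place, the vote returns the wrong value, which is one reason such schemes are usually combined with the other techniques in the list.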

Many of these techniques are well known and can be used by designers according to the likely failure mechanisms, the desired degree of protection and any specific regulatory requirements. They are frequently used in radiation-hardened devices for aerospace. Of course, the errors may occur only rarely throughout the lifetime of a deployed population of systems, or they could be endemic under certain conditions such as extremes of temperature or radiation.
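To make the "degree of protection" trade-off concrete, the sketch below contrasts a single parity bit, which can only detect an odd number of flipped bits, with a Hamming(7,4) code, which can locate and correct any single flipped bit in a 7-bit codeword. The function names and the bit layout (parity bits at codeword positions 1, 2 and 4) are choices made for this illustration; a memory controller would implement the equivalent in hardware, typically over much wider words.

```c
#include <stdbool.h>
#include <stdint.h>

/* Parity of a byte: 1 if the number of set bits is odd. Detects, but
 * cannot locate, a single-bit error. */
uint8_t parity8(uint8_t x)
{
    x ^= (uint8_t)(x >> 4);
    x ^= (uint8_t)(x >> 2);
    x ^= (uint8_t)(x >> 1);
    return x & 1u;
}

/* Hamming(7,4) encode. Codeword positions 1..7 are stored in bits 0..6:
 * positions 1, 2, 4 hold parity; positions 3, 5, 6, 7 hold data d0..d3. */
uint8_t hamming74_encode(uint8_t nibble)
{
    uint8_t d0 = nibble & 1u;
    uint8_t d1 = (nibble >> 1) & 1u;
    uint8_t d2 = (nibble >> 2) & 1u;
    uint8_t d3 = (nibble >> 3) & 1u;
    uint8_t p1 = d0 ^ d1 ^ d3;  /* covers positions 1, 3, 5, 7 */
    uint8_t p2 = d0 ^ d2 ^ d3;  /* covers positions 2, 3, 6, 7 */
    uint8_t p4 = d1 ^ d2 ^ d3;  /* covers positions 4, 5, 6, 7 */
    return (uint8_t)(p1 | (p2 << 1) | (d0 << 2) | (p4 << 3) |
                     (d1 << 4) | (d2 << 5) | (d3 << 6));
}

/* Hamming(7,4) decode with single-error correction. The syndrome is the
 * XOR of the positions of all set bits: zero for a valid codeword,
 * otherwise the position of the flipped bit. */
uint8_t hamming74_decode(uint8_t code, bool *corrected)
{
    uint8_t syndrome = 0;
    for (uint8_t pos = 1; pos <= 7; pos++) {
        if ((code >> (pos - 1)) & 1u) {
            syndrome ^= pos;
        }
    }
    if (syndrome != 0) {
        code ^= (uint8_t)(1u << (syndrome - 1));
    }
    if (corrected) {
        *corrected = (syndrome != 0);
    }
    return (uint8_t)(((code >> 2) & 1u) | (((code >> 4) & 1u) << 1) |
                     (((code >> 5) & 1u) << 2) | (((code >> 6) & 1u) << 3));
}
```

Plain Hamming(7,4) silently miscorrects if two bits are flipped in the same codeword; adding one overall parity bit to detect double errors is a common extension that is not shown here.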

Meanwhile, in the world of software, programmers are writing the code that will run on the processors in these embedded systems. These programs should be thoroughly tested with a wide range of input data and execution flows to ensure that the code behaves correctly under all foreseeable circumstances. Formal methods can be applied to verify the absence of run-time errors. But programmers can only assume that the underlying processor and memory system will faithfully execute the tested code.

Combining software and hardware fault detection techniques can offer a powerful solution. For example, a hardware timer could raise an interrupt every second, causing a checker routine to run and observe the current state of the main program and the hardware inputs and outputs. This would test whether both hardware and software continue to function correctly.
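A minimal sketch of such a checker routine follows, written in portable C so that it can be exercised off-target. It assumes the main program calls `main_loop_alive` once per pass and that a 1 Hz timer interrupt handler calls `checker_tick`; these names, the `valve_output` signal and its limit `VALVE_MAX` are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define VALVE_MAX 1000u  /* hypothetical upper limit for the actuator command */

typedef struct {
    volatile uint32_t heartbeat;     /* incremented by the main program */
    volatile uint16_t valve_output;  /* hypothetical actuator command */
    uint32_t last_heartbeat;         /* private to the checker */
    bool safe_state;                 /* set once a check has failed */
} monitor_t;

/* Called by the main program once per pass through its loop. */
void main_loop_alive(monitor_t *m)
{
    m->heartbeat++;
}

/* Intended to be called from the periodic timer interrupt. */
void checker_tick(monitor_t *m)
{
    uint32_t now = m->heartbeat;
    bool software_alive = (now != m->last_heartbeat);
    bool output_sane = (m->valve_output <= VALVE_MAX);
    m->last_heartbeat = now;

    if (!software_alive || !output_sane) {
        m->valve_output = 0;  /* drive the output to an assumed safe value */
        m->safe_state = true;
    }
}
```

The heartbeat comparison catches a main program that has hung or crashed, while the range check catches a program that is still running but commanding an implausible output. On a real target the access to shared variables would also need to respect the processor's rules for atomicity between interrupt and main-line code.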

A more sophisticated extension of this concept could be developed for software that is written for long-term deployment in embedded systems. Programmers could annotate the source code in a format that captures their design intent, such as the expected program flow and the permissible values of variables. This information could be extracted by a pre-processor and used by a specialised hardware monitor. This has the advantage of carrying the programmer’s original intent for system behaviour into the field over the lifetime of the deployed systems. In addition to detecting errors caused by unexpected system behaviour, it offers the benefit of detecting errors introduced by program maintenance or by application of the system in a manner unforeseen by its designer.
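The article does not define an annotation format or a monitor interface, so the following is purely a sketch of how the idea might look. It assumes invented annotations such as `@range` and `@flow`, assumes a pre-processor has already turned them into constant tables, and models the hardware monitor's checks as two C functions. Every name here is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Invented annotations, as a programmer might write them in the source:
 *
 *     int32_t dose_rate;     // @range 0..50
 *     int32_t reservoir_ml;  // @range 0..300
 *     // @flow IDLE -> PRIME -> DELIVER -> IDLE
 *
 * A pre-processor (not shown) is assumed to emit the tables below. */

typedef struct { const char *name; int32_t min; int32_t max; } range_rule_t;
typedef struct { uint8_t from; uint8_t to; } flow_rule_t;

enum { ST_IDLE, ST_PRIME, ST_DELIVER };

static const range_rule_t range_rules[] = {
    { "dose_rate",    0,  50 },
    { "reservoir_ml", 0, 300 },
};

static const flow_rule_t flow_rules[] = {
    { ST_IDLE,    ST_PRIME   },
    { ST_PRIME,   ST_DELIVER },
    { ST_DELIVER, ST_IDLE    },
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* True if value lies inside the annotated range for the given rule.
 * An unknown rule index is treated as a violation. */
bool value_permitted(size_t rule_index, int32_t value)
{
    if (rule_index >= COUNT(range_rules)) {
        return false;
    }
    return value >= range_rules[rule_index].min &&
           value <= range_rules[rule_index].max;
}

/* True if the annotated program flow allows moving from one state to another. */
bool transition_permitted(uint8_t from, uint8_t to)
{
    for (size_t i = 0; i < COUNT(flow_rules); i++) {
        if (flow_rules[i].from == from && flow_rules[i].to == to) {
            return true;
        }
    }
    return false;
}
```

Because the tables are generated from the source annotations rather than written by hand, a later maintenance change that pushes `dose_rate` beyond 50 or jumps from idle straight to delivery would be flagged by the monitor even if it passed the maintainer's own tests.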




