Finding defects using Holzmann's "Power of 10" rules for writing safety critical code

Paul Anderson

4/14/2009 5:45 PM EDT

In safety-critical applications, bugs in software are not just costly distractions—they can put lives at risk. Consequently, safety-critical software developers go to great lengths to detect and fix bugs before they can make it into fielded systems.

Although there are some well-known cases where software defects have caused disastrous failures, the overall record is fairly good. If the software controlling medical devices or flight-control systems were as buggy as most software, the headlines would be truly dreadful.

The methods that safety-critical developers use are undeniably effective at reducing risk, so there are lessons to be learned for developers who do not write safety-critical code. Two techniques stand out as being most responsible: advanced static analysis and rigorous testing.

Static analysis tools have been used for decades. Their appeal is clear: they can find problems in software without actually executing it. This contrasts with dynamic analysis techniques (i.e., traditional testing), which usually rely on running the code against a large set of test cases. The first generation of static-analysis tools, of which lint is the most widely known example, was quite limited in capability and suffered from serious usability problems.

However, recently a new generation of advanced static-analysis tools has emerged. These are capable of finding serious software errors such as buffer overruns, race conditions, null pointer dereferences and resource leaks.
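As a rough illustration of two of these defect classes, the C sketch below shows (in a comment) a null check that comes after the dereference it was meant to protect, followed by a corrected function that also clamps the copy length so the destination buffer cannot be overrun. The `packet` type and the `packet_copy` function are hypothetical names invented for this example; they are not taken from any particular tool or code base.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record type, for illustration only. */
struct packet {
    size_t len;
    const char *payload;   /* assumed to hold at least len bytes */
};

/*
 * Pattern an analyzer would typically report, kept in a comment so
 * that this file stays free of the defect:
 *
 *     size_t n = p->len;     p is dereferenced here...
 *     if (p == NULL)         ...so this check comes too late
 *         return 0;
 */

/* Corrected version: validate every pointer before its first use. */
size_t packet_copy(const struct packet *p, char *dst, size_t dst_size)
{
    if (p == NULL || p->payload == NULL || dst == NULL || dst_size == 0)
        return 0;

    size_t n = p->len;
    if (n >= dst_size)        /* clamp so the copy cannot overrun dst */
        n = dst_size - 1;

    memcpy(dst, p->payload, n);
    dst[n] = '\0';
    return n;                 /* number of payload bytes copied */
}
```

With a four-byte destination and a five-byte payload, the corrected function copies three bytes and terminates the string instead of writing past the end of the buffer.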

They can also find subtle inconsistencies such as redundant conditions, useless assignments and unreachable code. These correlate well with real bugs as they often indicate that the programmer misunderstood some important aspect of the code.
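A small, hypothetical C example of such an inconsistency is sketched below. The function names are made up for illustration. In the first version the value stored in `ok` is overwritten before anything reads it, which is exactly the kind of useless assignment a tool might flag; here it reveals that the lower-bound check was silently lost.

```c
#include <stdbool.h>

/* Buggy: the first value stored in ok is never read, so the
   lower-bound check has no effect on the result. */
bool in_range_buggy(int x, int lo, int hi)
{
    bool ok = (x >= lo);      /* useless assignment */
    ok = (x <= hi);           /* overwrites the previous result */
    return ok;
}

/* Fixed: both checks contribute to the result. */
bool in_range_fixed(int x, int lo, int hi)
{
    bool ok = (x >= lo);
    ok = ok && (x <= hi);
    return ok;
}
```

Calling `in_range_buggy(-5, 0, 10)` reports that -5 lies in the range 0 to 10, while `in_range_fixed(-5, 0, 10)` does not.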

The tenth rule
Using advanced static analysis tools is quickly becoming best practice: rule ten of "The Power of 10: Rules for Developing Safety-Critical Code," by Gerard Holzmann of NASA/JPL, specifies that advanced static analysis tools should be used aggressively throughout the development process.

The other important technique is systematic testing. The importance of highly rigorous testing has been recognized by some regulatory agencies. For flight-control software, the Federal Aviation Administration is very specific about the level of testing required.

The developer must demonstrate that they have test cases that achieve full coverage of the code. Developing such test cases can be very expensive. Advanced static-analysis tools can help reduce this cost by pointing out parts of the code that make it difficult or even impossible to achieve full coverage.
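To make the coverage problem concrete, here is a minimal, hypothetical C sketch (the function names are invented for this example). The first function behaves correctly, yet no test suite can ever reach full coverage of it, because its last statement cannot execute and the false branch of its third condition cannot be taken.

```c
/* Returns -1, 0 or 1 according to the sign of x. */
int sign_of(int x)
{
    if (x < 0)
        return -1;
    if (x == 0)
        return 0;
    if (x > 0)          /* redundant: always true at this point */
        return 1;
    return 99;          /* unreachable: no input executes this line */
}

/* Equivalent function in which every statement and branch is reachable. */
int sign_of_simplified(int x)
{
    if (x < 0)
        return -1;
    if (x == 0)
        return 0;
    return 1;
}
```

A tool that reports the redundant condition and the unreachable return saves the tester from hunting for an input that cannot exist.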

Benefits of advanced static analysis
Testing has traditionally been the most effective way to find defects in code. The best test cases feed in as many combinations of inputs and conditions as possible such that all parts of the code are exercised thoroughly. Statement coverage tools can help you develop a test suite that makes sure that every line of code is executed at least once.

But as all programmers know, just because a statement executes correctly once does not mean it will always do so—it may trigger an error only under a very unusual set of circumstances. There are tools that will measure condition coverage and even path coverage, and these are all helpful for exercising these corner cases, but achieving full coverage for non-trivial programs is extraordinarily time-consuming and expensive.
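The following C sketch, using hypothetical function names, shows how a statement can be fully covered by a passing test and still be wrong for unusual inputs. Unsigned arithmetic is used deliberately: wraparound is well defined for unsigned types in C, so the naive version returns a wrong answer rather than crashing.

```c
#include <limits.h>   /* UINT_MAX, used when exercising the corner case */

/* Correct for small inputs, but the sum wraps around whenever
   lo + hi exceeds UINT_MAX, giving a wrong midpoint. */
unsigned midpoint_naive(unsigned lo, unsigned hi)
{
    return (lo + hi) / 2;
}

/* Avoids the wraparound. Assumes lo <= hi. */
unsigned midpoint_safe(unsigned lo, unsigned hi)
{
    return lo + (hi - lo) / 2;
}
```

A single test such as `midpoint_naive(2, 4)` executes every statement and passes. The corner case `midpoint_naive(UINT_MAX, UINT_MAX)` returns `UINT_MAX / 2` instead of `UINT_MAX`, while `midpoint_safe` returns the expected value for both calls.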

This is where advanced static analysis shines. The tools examine paths and consider conditions and program states in the abstract. By doing so, they can achieve much higher coverage of your code than is usually feasible with testing. Best of all, they do all this without requiring you to write any test cases.
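One simple way to picture reasoning "in the abstract" is interval analysis, where a variable is represented by a range of possible values rather than by one concrete value. The toy C sketch below is an assumption made for illustration: it is not a description of how any particular commercial tool works, and all of the names in it are hypothetical. It checks two tiny programs for a possible division by zero across a whole range of inputs at once, with no test cases.

```c
#include <stdbool.h>

/* Toy abstract value: stands for every int from lo to hi inclusive.
   An interval with lo > hi is empty (no value reaches that point). */
struct interval {
    int lo;
    int hi;
};

/* Abstract effect of "v - c". Overflow is ignored to keep this short. */
struct interval interval_sub_const(struct interval v, int c)
{
    struct interval r = { v.lo - c, v.hi - c };
    return r;
}

/* Narrow v to the values satisfying the branch condition "v > bound".
   Assumes bound < INT_MAX. */
struct interval interval_assume_gt(struct interval v, int bound)
{
    if (v.lo <= bound)
        v.lo = bound + 1;
    return v;
}

bool interval_is_empty(struct interval v)
{
    return v.lo > v.hi;
}

bool interval_contains(struct interval v, int x)
{
    return v.lo <= x && x <= v.hi;
}

/* Analyzes "return 100 / (x - 5);" for every x in input. */
bool unguarded_may_divide_by_zero(struct interval input)
{
    return interval_contains(interval_sub_const(input, 5), 0);
}

/* Analyzes "if (x > 5) return 100 / (x - 5); return 0;". */
bool guarded_may_divide_by_zero(struct interval input)
{
    struct interval x = interval_assume_gt(input, 5);
    if (interval_is_empty(x))
        return false;             /* the division is never reached */
    return interval_contains(interval_sub_const(x, 5), 0);
}
```

For inputs in the range 0 to 10, the unguarded program is reported as possibly dividing by zero, and the guarded one is not. The trade-off of this kind of abstraction is precision: a range can include values that never occur in practice, which is one source of false positives.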

This is the most significant way in which static analysis reduces the cost of testing. The cheapest bug is the one you find earliest. Because static analysis is a compile-time process, it can find bugs before you even finish writing the program. This is usually less expensive than finding them by writing a test case or debugging a crash. This article also describes how these tools work, and then shows how they can reduce the cost of creating test cases.


ianb1469

4/16/2009 8:21 AM EDT

We certainly have found static analysis to be very effective at preventing functional bugs hitting us, although using the right development language (Ada in our case) was a good way to help facilitate static analysis.

It is interesting to see how the combination of static analysis of the source code and dynamic analysis/testing can interact. This is often the best way to analyse non-functional properties such as execution time and performance issues.

For example, it's hard to "guess" a test that exercises the worst case execution time, but with tool support you can take timing measurements from existing tests and use static analysis to predict the timing of longer execution paths that you have not tested directly.

Ian
Rapita Systems Ltd - Software Timing Solutions
http://www.rapitasystems.com/




DickH

4/21/2009 5:39 PM EDT

Want to write good code?

RETIRE C.

Write in some other language.




talkaboutquality

4/13/2010 4:51 AM EDT

A rhyming issue.

Regarding Figure 5, where it says

The error is on line 30 and is clearly a typo: The programmer missed the final "X".

I suppose so, but equally, the error is on line 33 where the programmer missed the final "ELSE"!




cwlh

11/17/2010 4:14 PM EST

I think that there's one other factor that needs to be considered.

In any reasonably complex piece of software (and even something as simple as multi-threading makes a piece of code complex) we have no choice. Since states cannot be enumerated, demonstrating correctness by testing is not even a possibility: static analysis is the only tool available to us.

Retrospectively producing a design from the code and using a concurrency checker such as SPIN is the *only* way of demonstrating correctness. If concurrency problems (Heisenbugs) occur during testing then the test cannot provide any useful debugging information by definition: the problem is non-reproducible. "Thorough" testing means putting the system into a series of predefined states and then poking it to ensure that the state transitions are correct. But in any concurrent system, we can't even enumerate the possible states, let alone force the system into a particular one.

Performing deep static analysis of the code using symbolic execution or similar techniques is not just a nice-to-have, it's essential.
