Break Points
Code: Getting it right
Jack Ganssle
10/26/2009 2:28 PM EDT
Most of us have no quantitative idea. A few outfits measure bug rates, but, as been often observed, an apparent absence of bugs does not mean the code lacks defects.
While safety-critical code is often better-tested, and sometimes better-designed, than average industrial software, testing cannot prove the absence of bugs. According to .Adams, N.E., "Optimizing preventive service of software product," IBM Journal of Research and Development, 28(1), p. 2-14, a third of all software faults take more than 5000 execution-years to manifest themselves. That's encouraging for those of us not planning to challenge Methuselah, but graphically illustrates the inadequacy of testing.
Some operating systems are certified to extremely-high levels of reliability and/or security. To my knowledge none have been completely proven using the most rigorous formal methods (some, like Green Hills' Integrity, certified to EAL6+ against a particular protection profile, come close). Until now. According to a news item on their site, NICTA's Secure Embedded L4 microkernel has been completely proven correct using formal methods. That's quite an achievement.
But more interesting is the stunning cost required to make the proofs. The OS comprises 9,300 lines of code, of which 7,500 have been verified. The cost of verification: 25 to 30 person years of work! At a loaded cost of $200k/person year, that's over $5m, or about $700 per line of code. Since most firmware runs $20-40 per line of code (this covers the entire cost of the project), formal verification is twenty times the cost of writing the stuff in the first place.
Actually, $700/LOC is cheaper than one might think. Apparently, Green Hills spent about $1000/LOC to achieve their EAL6+ validation, which is not as strong as a proof as what NICTA claims. NICTA themselves believe EAL6 certification would run about $10,000/LOC. That's quite a range; even at the low end it's gobs of cash.
They found 144 bugs while formally proving the OS. That's about a 2% rate, which suggests very little testing was used prior to the proof.
So- how good is your code? How do you insure it's close to being right?
(Update: Green Hills tells me INTEGRITY was evaluated using formal methods, so my statement that their validation is not as strong as a proof as what NICTA claims is incorrect.)
(Editor's Note: Jack's Embedded Poll question this week is "How do you do bug metrics?" To vote, go to the Embedded.com Home Page.)
Jack G. Ganssle is a lecturer and consultant on embedded development issues. He conducts seminars on embedded systems and helps companies with their embedded challenges. Contact him at jack@ganssle.com. His website is www.ganssle.com.





Dr Gernot
10/26/2009 10:44 PM EDT
Hi Jack,
The cost of formal verification isn't astronomical at all. Much of the cost of the project was the cost of doing it for the first time.
As explained in the detailed writeup of the NICTA project (see http://ertos.nicta.com.au/publications/papers/Klein_EHACDEEKNSTW_09.abstract), the cost of re-doing it (kernel plus proof) is estimated at 8py, which is only around twice the cost estimate produced by SLOCcount (which is in line with the experience at other places for similar systems). The SLOCcount cost estimate, of course, assumes traditional Q&A, which would be expected to lead to a defect density of around 2-5 per kLOC.
Sign in to Reply
A.Bottemanne
10/28/2009 2:39 AM EDT
Hi Jack,
Good article. Indeed, making sure code is correct is a big question mark for many people. Yet a very important one. Bad design and coding has not killed as many people as badly constructed bridges, airplanes, cars etc have, yet therefore no (legally enforced) design verification takes place in (embedded) Software.
Formal verification by hand is, as you describe, a lengthy process. But that is not necessary. People who use ASD:Suite (from Verum: www.verum.com ) have a tool that enables the Software Architects and designers rapidly build, formally verify and automatically generate software for complex systems.
This patented software guarantees defect-free software. Companies like Philips (Healthcare / Medical Systems), NXP, Nandatech and many other use it already as defect-free software is very important to their customers.
Interestingly: they find that the total cost of development drops by 30% , where one would expect it to be more expensive given the defect-free guarantee. After-sales costs ('non-quality' cost) drops by 90%.
Take a look at: http://www.verum.com/technology/ to get an idea about the concept and it's abilities.
Sign in to Reply
jmdavid1789
10/28/2009 9:29 AM EDT
How good is my code?
My code is good until I find problems!
However, I do everything beforehand to prevent them (design, coding rules, code review...).
Sign in to Reply
Scottish Martin
10/28/2009 4:52 PM EDT
You are standing on the roof of high building - a rope hangs over the side - this is the only way down.
A Sales Manager and a leading academic approach you. The Sales Manager has the latest tool for formally proving the safety of the rope. The academic advises the applicability of formal methods have been oversold for decades.
You are given the option of testing the rope with a resentative weight, or proving its safety using predicate calculus.
See you on the ground?
Sign in to Reply
A.Bottemanne
10/30/2009 3:36 AM EDT
No, we would stay on the roof actually.
Like in many Industries, in several even mandated by law, tools are used and have proven to be of value.
You may ask yourself:
- How many aircraft are still being designed & build without a tool ?
- How many bridges are still being designed & build without a tool ?
- how many cars are being designed & build without a tool ?
- etc.
The point is: if we (the IT industry) want to take (software) development as serious as engineers of bridges, aricraft etc, it becomes time to use the tools appropriate to do so. For them it is CadCam systems, in Software design & engineering it can be our Tool. On our website you will find several papers from successful customer projects.
(eventhough my present role is on the commercial side of things Martin, I have been in soiftware development and design for more then 14 years)
Next week the FMweek is being held. This year in Eindhoven. If you want to come and see us, you are more then welcome. You may ask any question, and we will show you (live) how it works (and that it works). Form your own opinion, based on your experience with us.
More information: http://www.win.tue.nl/fmweek/
Sign in to Reply
RidgeRat
10/30/2009 7:05 PM EDT
Not to deny the power of formal methods... but having run into numerous cases of coding that met design specifications perfectly, yet did things that were astoundingly bizarre... who is going to mathematically prove that the specification was "right" in the first place?
Sign in to Reply
A.Bottemanne
11/2/2009 1:41 AM EST
One of the main benefits for using our toolset (over Rational for example), is that the specifications needed to set it up right becomes a process, that does not leave room for ambiguity.
That is part of what we call: the magic triangle. Not just do the model check, but check if the model is right (with respect to the specifications) in the first place !
(does the program do, what it is supposed to do)
As mentioned, test drive us, no strings attached, at FMweek to see for yourself.
Sign in to Reply
ECS_Shadow
11/12/2009 11:14 AM EST
Getting it Right, for who or what? Q&A? IV&V? Requirements? Specifications? Formal Methods?
We need to be Getting it Right for the USERS! (It their lives that are at risk!)
Most of our embedded S/W is drop shipped. (User problems are user problems until they prove otherwise.) If we do respond to user problems, we are faced with the expensive and time consuming task of trying to re-create the problem, from limited and often inaccurate data. This is probably why we tend to live in our own little world disconcerted and isolated from the users. (Just look at how our software support organizations are designed to work for us and not the users.)
We will not even try to close this gap between us and the users. If we did, we would see the true reliability of our software. THIS IS THE FUNDAMENTLY PROBLEM STANDING IN THE WAY OF RELIABLE EMBEDDED SOFTWARE!
If you think embedded software reliability is improving, dont look at what the software development environment is (or is not) putting in the executable file. You could see a memory++; line of code that translates into:
xxxxxxxx0: xx xx xx xx ld.w r8,pc[540]
xxxxxxxx4: xx xx ld.w r8,r8[0x0]
xxxxxxxx6: xx xx ff ff sub r9,r8,-1
xxxxxxxxa: xx xx xx xx ld.w r8,pc[530]
xxxxxxxxe: xx xx st.w r8[0x0],r9
If you think this is strange, you should see what the memcpy function (from the development environment) is doing!
Sign in to Reply