
Six Rules for Writing Clean Code

Matt Gordon

6/8/2009 2:57 AM EDT

As embedded systems developers, many of us are veterans of multiple programming courses. In college, we received instruction on a vast assortment of different algorithms and data structures.

Most of us were also taught the semantics of at least one high-level programming language. There are surprisingly few engineers, though, who have received significant instruction on the fundamental topic of coding style.

Programming courses teach what can and can't be done with a particular language, but they often fail to demonstrate what should be done in order to produce clear-cut, manageable code.

The exclusion of this topic from programming curricula does not indicate that coding style is inconsequential. To the contrary, the style in which a program is written affects how that code functions, and determines whether or not it can be easily supported.

Developers who have adopted an effective style produce software that other programmers can readily understand and update. On the other hand, developers who lack such a style turn out code that causes innumerable headaches for the engineers who are responsible for its maintenance.

What actually constitutes an effective coding style? Six general rules for writing clean code are provided below. These guidelines will help you steer clear of the problems that poor coding style creates.

Rule #1: Use pithy comments
When other engineers attempt to discern how your code works, they'll likely turn to your comments. Thus, you should make sure that these comments form an accurate description of the code. You should also avoid stating what would be obvious to another engineer. The first of the C comments below provides its reader with useful information, but the second is frivolous. (Constants are hard-coded in the code below and elsewhere in this article. You should normally use #define constants in place of such hard-coded values.)

while (usb_pkt_rdy == 0) {    /* Wait until at least one packet has been received */

x = 0;                        /* Set x to 0 */
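
The parenthetical note about #define constants can be illustrated with a minimal sketch. The names USB_PKT_NONE, USB_POLL_MAX, and usb_pkt_wait are hypothetical and do not appear in the article, and the polling limit is an added assumption so that the example terminates instead of waiting forever.

```c
#include <stdint.h>

#define USB_PKT_NONE    0u      /* Hypothetical name: no packet is waiting         */
#define USB_POLL_MAX    1000u   /* Hypothetical limit on polls before giving up    */

/* Set elsewhere (for example, by an interrupt handler) when a packet arrives. */
static volatile uint8_t usb_pkt_rdy = USB_PKT_NONE;

/* Returns 1 if a packet is ready within USB_POLL_MAX polls, 0 otherwise. */
int usb_pkt_wait(void)
{
    uint32_t poll_cnt = 0u;

    while (usb_pkt_rdy == USB_PKT_NONE) {   /* Wait until at least one packet has been received */
        if (poll_cnt >= USB_POLL_MAX) {
            return 0;                       /* Give up so a silent bus cannot hang the caller   */
        }
        poll_cnt++;
    }

    return 1;
}
```

Both comments inside the loop say why the code does something rather than restating the statement, and the named constants document the meaning of values that would otherwise be bare numbers.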

Rule #2: Assign intuitive names to variables and functions
You should use names to indicate functionality. The name filter_coeff, for instance, would be appropriate for a variable that is a filter coefficient. In the commenting example above, the name x reveals little about what that variable does.
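
As a rough illustration of this rule, the sketch below writes the same computation twice, once with terse names and once with names that indicate functionality. Apart from filter_coeff, which the article mentions, every identifier here is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Terse names: the reader has to work out what a, b, n and x stand for. */
int32_t f(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t x = 0;
    size_t  i;

    for (i = 0; i < n; i++) {
        x += (int32_t)a[i] * (int32_t)b[i];
    }

    return x;
}

/* Same computation: the names say that this is one output of a FIR filter. */
int32_t fir_filter_output(const int16_t *filter_coeff,
                          const int16_t *sample_buf,
                          size_t         tap_cnt)
{
    int32_t accum = 0;
    size_t  tap_ix;

    for (tap_ix = 0; tap_ix < tap_cnt; tap_ix++) {
        accum += (int32_t)filter_coeff[tap_ix] * (int32_t)sample_buf[tap_ix];
    }

    return accum;
}
```

The two functions compile to equivalent code; only the second tells a maintainer what the data represents.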

Rule #3: Make ample use of white space
To a C compiler, white space simply delimits tokens. Accordingly, extra spaces and newlines in your application may seem to be of small importance.

In C and many other high-level languages, though, white space can help other developers to more easily understand your code. Although the two loops shown below are identical from the perspective of a compiler, the functionality of the second loop is clearer to human readers of the code. (Importantly, the Tab key was not used to indent the second loop. Since the Tab character does not expand identically on all machines, it should be avoided.)
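
The listing that originally accompanied this rule is not reproduced here. As a stand-in, the following sketch shows one loop written twice; the function names and the summing logic are assumptions chosen purely for illustration, and the indentation uses spaces rather than Tab characters.

```c
#include <stddef.h>
#include <stdint.h>

/* Cramped: legal C, but hard to scan. */
uint32_t sum_cramped(const uint8_t *buf,size_t len){uint32_t sum=0;size_t i;for(i=0;i<len;i++){if(buf[i]!=0){sum+=buf[i];}}return sum;}

/* The same tokens, laid out with white space. */
uint32_t sum_spaced(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t   i;

    for (i = 0; i < len; i++) {
        if (buf[i] != 0) {
            sum += buf[i];
        }
    }

    return sum;
}
```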

Rule #4: Don't make your code unnecessarily complex
A single line of C code can have multiple side effects. Typically, though, a lengthy line of code that updates multiple variables is more difficult to understand than multiple, simple lines of code that accomplish this same objective.

Thus, you should try to limit the number of actions performed by each line of your code. You should likewise strive for simplification when declaring functions; excessively lengthy routines should be broken into multiple parts.

To the compiler, it makes no difference whether several statements share one line or a single statement is spread over several lines; the generated code is the same size either way. Because one statement per line is easier for the reader, that is the recommendation, and it has no impact on code efficiency.
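
A small sketch of the trade-off, using hypothetical function names: both routines copy a string and return its length, but the first packs an assignment, two pointer increments, and a comparison into a single line.

```c
#include <stddef.h>

/* Dense: one line assigns, increments two pointers, and tests the result. */
size_t str_copy_dense(char *dst, const char *src)
{
    size_t len = 0;

    while ((*dst++ = *src++) != '\0') len++;

    return len;
}

/* Simple: each line performs one action. */
size_t str_copy_simple(char *dst, const char *src)
{
    size_t len = 0;

    while (src[len] != '\0') {
        dst[len] = src[len];
        len++;
    }
    dst[len] = '\0';    /* Terminate the copy */

    return len;
}
```

The second version also gives a debugger a separate line for each action, which makes breakpoints easier to place.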

Rule #5: Be explicit
So that other developers will be able to quickly discern what each line of your code does, you should eschew syntax that obscures meaning. For example, in comparison statements that involve zero, you should avoid the syntax shown in the first of the below if statements.

        if (!tx_val) {
            err_cnt++;
        }

Instead, your code should resemble the second if statement below. This statement clearly indicates that the variable tx_val is being compared to zero.

        if (tx_val == 0) {
            err_cnt++;
        }

Rule #6: Apply rules uniformly
Simply put, clean code is consistent code. To make sure that your style is consistent, you should use a coding standard, a formal document that elaborates on the type of rules that this article summarizes.

For an example coding standard, you can visit the Web site of embedded software provider Micriµm. There, you can download the standard that guides the efforts of each of Micriµm's software developers. This coding standard is available free of charge.

Matt Gordon, Technical Marketing Engineer at Micriµm (www.micrium.com), has several years of experience as an embedded software engineer, and is responsible for explaining the nuances of the company's software to Micriµm's partners and customers. Matt holds a bachelor's degree in computer engineering from Georgia Tech.





kaila

6/8/2009 5:58 AM EDT

A comment to Rule #5: Be explicit.

Instead of 'if (tx_val == 0) {' I would write the following expression: 'if (0 == tx_val) {'

If an '=' is forgotten (0 = tx_val), the compiler cannot assign a value to a constant. Result: error! The expression (tx_val = 0), by contrast, is valid!!!

Kaila




tulsaGuy

6/9/2009 12:51 PM EDT

I highly recommend a reading of the classic "Code Complete" by Steve McConnell for a great guide on writing clean code. The text is more Golden.




jmdavid1789

6/9/2009 1:46 PM EDT

Rule #5 has a limitation if one is coding close to natural language. Example:
if( !error ) means "if there is no error"
if( error == EOK ) means "if there is an error which is the no error case"

By analogy, do you ever ask "Is there milk in the fridge" or "Is there a bottle of milk in the fridge which is empty"?
The if( !milk_in_the_fridge) warns us quickly that there is a problem (i.e. no milk for my tea). "milk_in_the_fridge" is not necessarily
a Boolean, it can be the level of milk in the bottle (0 to 100%).




Alex OD.

6/9/2009 3:30 PM EDT

Rule #4.

If too much happens on one line, then it is hard to set a breakpoint, especially in an "if then else" scenario. Sure, you can turn on the disassembler function in your debugger, but optimised code sometimes looks very different.

I learned this the hard way from some inherited code and the Freescale Codewarrior toolchain.




kishorekanala

6/11/2009 7:38 AM EDT

Coding style should be such that we can write a variable or structure name without referring to the header file. This avoids wasting time searching for a structure declaration or variable definition.




MarkVZ

6/11/2009 4:34 PM EDT

There only needs to be one rule - always write all code as though someone else (or you many years later) will have to easily understand/update the code without any other supporting information/documentation.
Mandatory peer reviews usually result in clean code - nobody wants to be embarrassed by their peers.




MostafaKassem

6/12/2009 11:15 AM EDT

As a follow-up to what MarkVZ stated, one should live by this rule: "the code is written once, but read many times".




Rzaman

7/5/2009 7:16 PM EDT

I also follow the way explained by kaila.




ericshufro

1/18/2010 8:38 PM EST

I couldn't agree more with all of the stated rules. It is true that many people prefer the if (0 == variable) syntax because it generates an error if a single '=' is missing; however, I find the syntax less comfortable to read. To deal with this shortcoming, I always ensure that my application compiles with 0 errors and 0 warnings. Forgetting a '=' will generate a warning on most compilers.

--Eric




willc2010

1/19/2010 12:40 AM EST

I sympathise with Eric and Kaila, but it is staggering that in 2010 we still need to debate such an absurd and ridiculous issue as whether to write 'if (variable == 0)' or 'if (0 == variable)' in order to protect against the easy slip of leaving out an '=' and thereby turning the equality test into an assignment. Booleans should not be the same as integers. Assignment should not be merely an expression with a side-effect. Those choices were made in order to simplify the implementation of the compiler, not because they are sensible for programming. This stuff was already primitive in the early 1970s.







Lundin

1/19/2010 10:16 AM EST

Indeed, we shouldn't be having this discussion in the year 2010. If you write

if(0 == something)

it only shows that you have no clue about what static code analyzers are, or even what Lint is.

It also shows that you are used to writing programs in compilers worse than ancient Turbo C from 1990, which warned against "possible incorrect assignment". All compilers since then have done the same.




willc2010

1/19/2010 7:42 PM EST

A C compiler now will generate warnings about 'possible incorrect assignment' or some such under many of these circumstances (assuming that the warnings are on), and as long as you don't overlook the warning then it gives you some sort of protection. But whether to produce this warning at all and under what circumstances is up to the authors of the compiler. It is still legal C and it can still compile.

The real problem is that 'if (integer-assignment-by-side-effect) {...}' should be illegal always by definition and should never compile. It is an example of the fundamentally bad design of the C language itself. You simply could not make this mistake with Ada, for example.

For an example of static code analysis that actually represents progress and is not just an attempt to paper over the gross defects of an ill-conceived language family, look at Spark Ada.




Lundin

1/20/2010 2:57 AM EST

SPARK Ada is a limited subset of the unsafe Ada language, just as MISRA-C is a limited subset of the unsafe C language. Both these subsets enforce static code analysis. SPARK Ada and MISRA-C are considered to be the only valid programming languages for safety-critical applications according to the expertise. In order to achieve SIL 3 / SIL 4 in IEC 61508, you'll have to use a limited subset language.

So don't start throwing rocks at C just yet, there is hope for it (just as there is hope for Ada) if you only use a recognized limited subset.

Even if you aren't writing safety-critical applications, know that the difference between them and normal ones, is that safety-critical standards enforce high quality programs. For normal programs, people have for some reason adopted the policy "yeah it would be somewhat nice if the software works as expected sometimes".

We shouldn't tolerate bugs no matter what kind of application we write, that's why I'd like to recommend MISRA-C.

MISRA homepage: http://www.misra-c2.com/




Lundin

1/20/2010 2:59 AM EST

Btw MISRA-C bans assignment inside control statements entirely.




willc2010

1/20/2010 8:38 PM EST

When we do have to use C here, we use MISRA-C, and I agree that it is significantly better than raw C. However it still has a lot of problems. This is a direct quote from section 1.3 of the MISRA-C guidelines document (1998):

"Nonetheless, it should be recognised that there are other languages which are in general better suited to safety-related systems, having (for example) fewer insecurities and better type checking. Examples of languages generally recognised to be more suitable than C are Ada and Modula 2. If such languages could be available for a proposed system then their use should be seriously considered in preference to C."

Having written and managed the development of large programs in C family languages and Ada, there is no doubt at all to me that Ada is far less error-prone and far superior for long term maintenance.

The Ada package system is so much better than the C include file hack that nobody who is familiar with the former could possibly regard the latter as anything but ugly and dangerous.

For another example, in C there are typically six integer types with ranges -128..127, 0..255, -32768..32767, 0..65535, -2147483648..2147483647 and 0..4294967295. Only in special cases will these ranges happen to match those in a problem domain. In Ada I can define

type Pressure is range 10..50;
type Temperature is range -50..100;

which are appropriate for the application. The compiler can use this information, and I can also be reasonably confident that nobody in the team will accidentally assign a pressure to a temperature variable, or pass a temperature to a procedure that is expecting another integer type and so on, because the program will not compile. They would have to deliberately perform an explicit conversion. In C and its relatives it is frighteningly easy to make these kinds of mistakes. The Ada system is not perfect, but it's a whole lot better.

There are any number of other examples, e.g. the fact that Ada has proper enumerations, or the Ada case statement versus the C switch statement. I find it much easier to read Ada than C too, and so does everybody else that I work with.

Ada is not perfectly safe, but it is a lot safer than C, or any C family language. Similarly Spark is not perfectly safe, but it is much safer than MISRA-C. It is true that sometimes you just have no choice because C (or preferably MISRA-C) is all that you can get, but if enough of us face the fact that C is ugly, unsafe and primitive and increase the demand for something better, then maybe we'll get it.




RidgeRat

1/26/2010 8:05 PM EST

About "rule 1": the example is pretty good but NOT because it is "pithy." A better statement of this rule would be: "Comment why, not what." You don't need an accurate description of the code -- decent code describes perfectly well what it is doing. It just can't explain why it is doing it. In your example, the comment suggests that variable usb_pkt_rdy is used as a logical sequence control flag, which might not be immediately clear given that the variable has an integer value. The comment fills in this missing context.




Software Bill

1/28/2010 4:51 PM EST

willc2010

Everything you say about C (and C++) is both true and irrelevant. Embedded programming is caught in a catch-22 situation. Developers don't know Ada or Modula-2 because most companies don't want it, and companies don't want it because it's much easier (that means cheaper) to find C and C++ programmers. I have never used Ada, but I did use Modula-2 when it first came out, and much preferred it to C. Unfortunately that project got canceled, and I have never seen a requirement for a Modula-2 programmer. Maybe in the future some group (IBM? the FEDs?) will have the clout to force a change, but for most of us we just have to write the best C code we can.




willc2010

1/28/2010 9:36 PM EST

Software Bill,

Maybe embedded.com, "The Official Site of the Embedded Development Community", is not the right place to discuss the merits and defects of languages used in embedded development. But articles on this site routinely contain statements like "The C programming language is ideally suited for embedded development" ("Garbage collection in C-language code applications") or that 'const' is "a strongly typed alternative to #define for numerical constants" ("Bug-killing standards for firmware coding"), and many others. If senior developers really believe things like this, and just about everybody seems to accept it, then perhaps an occasional view from outside the C bubble is not entirely irrelevant.

There are indeed a lot of C/C++ programmers, partly because C/C++ are conceptually unsophisticated and easy to pick up (writing good programs in them is another matter of course), but it is not unreasonable to expect professional programmers to educate themselves in the concepts that are reflected in (for example) the Ada language. Similarly, many development managers actually seem to believe that C/C++ are somehow advanced, appropriate and modern, and that Ada is an obsolete dinosaur. These very mistaken ideas are propagated almost without challenge in many forums.

The world of "being wary of the compiler", a predefined set of integer types with no effective type safety, mallocing strings, arrays as merely a notation for pointers, feeble enumerations, #include, and so on should not be regarded as the normal and appropriate way to write software any more.




Lundin

1/29/2010 3:00 AM EST

I believe preaching for one language or another on various Internet sites is pretty much futile.

Individual developers and small/medium sized companies aren't going to set an industry standard for a programming language. Historically, this is only done by huge companies. The success of the C language hasn't much to do with the language itself, but rather because it was created by AT&T, then promoted by Sun, then by Microsoft. The same companies have been favouring C++ as well. Modern languages like Java and C# were also created by these huge companies.

Similarly, languages like Ada and Pascal have died out because no big companies supported them. Ada would be completely dead if some big avionics companies did not favour it.




willc2010

1/31/2010 9:03 PM EST

Well, I keep hearing that Ada has died out. Yet it is much more readily available, better supported, less expensive and generally easier to use today than it has ever been. We used to use expensive and rather cumbersome native and cross compilers on VAXes. Now you can run a much better compiler system on a PC at a fraction of the cost. The support of big avionics companies and others does no doubt contribute a lot to its ongoing vitality. But there are good reasons why they favour it. A small or medium sized company can just as easily use Ada instead of C for many applications (I speak from experience). Sure, there's sometimes some training involved, but it's not that hard, and the savings in debugging, maintenance and reduction in faults far outweigh the costs. There is no good reason why far more companies could not use a lot more Ada and a lot less C. This perception that the relatively well-designed Ada language is dead and the primitive and ad hoc C/C++ family is the way to go, which I agree is widespread, is bad for software quality at a time when software is becoming ever more pervasive. Maybe some people who are in a position to influence these decisions read forums like this. If not, then it really might be futile to discuss it here.




0xC0FFEE

2/9/2010 7:47 PM EST

For rule #5, IMHO it's better to write the test as (0 == tx_val) so that if the second '=' is missing, it will not be turned into an assignment by the compiler.




yyrkoon

2/28/2010 1:07 AM EST

I think "Clean style" in this case is highly subjective.

Personally, I abhor abbreviated variable names, and underscore blemishes. Really, I do not care if it is a preferred "style" or not. It is confusing at best, and extremely ugly. However, for argument's sake, let's assume any C programmer *should* be able to understand this code; then what is wrong with if (!variable){}; ? To me this is naturally spoken in English as: if not variable, then do something. Perhaps I am missing your point?

So, as the Ada people above have already mentioned: if we're not "allowed" by style to use such features of a language (that we have been presented with), then perhaps we should reach for the right tool, and use it instead?



