Formatting Your Code: Why Style Matters


Overview

In a previous article I mentioned that consistency is more important than formatting style. But that doesn't mean style is unimportant; quite the contrary, the style you choose can make a huge difference.

 

Of course, the pros and cons of different styles have already been discussed in countless books, articles and forums. Rather than jumping into that religious war, I'll simply tell you which style I subscribe to and let you decide for yourself.

Indentation

Spaces suck; tabs rule

Long story short: don't use spaces to indent; instead, use tabs. A tab character represents a virtual indent. Want to indent two levels? Use two tabs. This entirely avoids the headaches of using multiple spaces to indent.

 

The problem with spaces is that nobody ever uses the same number of spaces for indentation, even within the same file. Some people use eight spaces; others use four; some use two; and still others use some crazy in-between variation. Manually adding or removing indentation is tedious and error-prone because you have to add or delete the proper number of spaces. Sure, you can configure your editor to autoindent but again, it's too easy for some parts of a file to use, say, four spaces for indentation and other parts of the same file to use, say, eight spaces for indentation.
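To see how easily a file drifts into mixed indentation, here is a minimal sketch in TypeScript. The helper name `indentKinds` and the sample text are assumptions made up for illustration; the sketch only classifies the leading whitespace of each indented line and reports which lines use tabs, spaces, or a mix of both.

```typescript
// Hypothetical helper (not from the article): classify the leading
// whitespace of every indented, non-blank line in a source string.
type IndentKind = "tabs" | "spaces" | "mixed";

function indentKinds(source: string): Record<IndentKind, number[]> {
	const found: Record<IndentKind, number[]> = { tabs: [], spaces: [], mixed: [] };
	source.split("\n").forEach((line, index) => {
		const match = /^([ \t]+)\S/.exec(line);
		if (match === null) {
			return; // Line is blank or not indented; nothing to classify.
		}
		const indent = match[1];
		let kind: IndentKind;
		if (indent.indexOf(" ") === -1) {
			kind = "tabs";
		} else if (indent.indexOf("\t") === -1) {
			kind = "spaces";
		} else {
			kind = "mixed";
		}
		found[kind].push(index + 1); // Report 1-based line numbers.
	});
	return found;
}

// Made-up sample: line 2 is indented with a tab, line 3 with four spaces.
const indentSample = "if (x) {\n\ty();\n    z();\n}";
const indentReport = indentKinds(indentSample);
```

If more than one of the three lists is non-empty, the file uses more than one indentation style, which is exactly the situation described above.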

 

These days you can find decent editors and IDEs that will reformat code to your liking. Problem is, reformatting is an extra step that can easily introduce errors, or simply leave you with aesthetically unpleasing source code. Ever had a heredoc statement screwed up by reformatting? Don't you hate it when your nicely formatted comments, complete with carefully placed ASCII art, custom spacing, indentation and line breaks, are blown to shreds by your "smart" code formatter? Automated or regular reformatting simply isn't worth the effort; it's only worthwhile for one-shot deals to reformat ugly code you've inherited.

 

It is left as an exercise for the reader to determine how this affects languages in which whitespace (especially leading whitespace) is syntactically significant.

Comments

Where to put them

Since all languages are somewhat different, I'll leave aside the question of whether you should use enclosure-style comments (e.g. /* … */) or prefix-style comments (e.g. # or //). If your language gives you both options, so much the better; we're going to use them both.

How to write them

Your comments are perhaps the most important part of your code, and yet in practice they are often the most neglected part.

Examples

I know you're dying for examples of what I'm talking about so here you go:

 

var $dx_mnd_frm = 0;

 

This code defines a variable. What language is it written in? That doesn't matter; it's just an example. It is the minimum code required to declare a variable and initialize it to zero. The code itself is not the issue here. This code sample contains no comments. This is a bad thing. Why? Because the programmer now has to hunt down this variable in context and divine what it means and what it does.

 

So, what type of stuff should the comments say about this variable? Hint: I just gave you the answer above.

 

  1. What does "dx_mnd_frm" stand for?
  2. What is it used for? (Sometimes, but not always, that will be answered by the previous question.)

 

Let's update our code sample:

 

var $dx_mnd_frm = 0; // $dx_mnd_frm stands for "Deluxe Mind Form" and contains the number of questions answered so far in this session.

 

So, what the heck is a "Deluxe Mind Form?" Again, it doesn't matter; this is just an example of how you should explain what variables stand for and what they are used for. I'm assuming here that the hypothetical reader is familiar with my example "Deluxe Mind Form" industry, or whatever. Note, too, how I used a complete sentence ending with a period. That's critical because it lets the reader know that there is nothing further to read (no need to keep scrolling), and that the comment is indeed complete and hasn't accidentally been truncated by a fat finger or automated process. If they were to read a comment that didn't end with a period they would know immediately something was amiss. (There are exceptions to the "always end your sentences with a period (or other punctuation)" rule but I'll get to them later.)
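The "a missing period means something is amiss" idea can even be checked mechanically. The following is a rough sketch in TypeScript, not a real lint tool: the function name `unterminatedComments` and the sample lines are hypothetical, it treats a colon as acceptable "other punctuation," and it assumes that "//" never appears inside a string literal.

```typescript
// Hypothetical check (an assumption for illustration): return the 1-based
// line numbers of "//" comments whose text does not end with punctuation.
function unterminatedComments(source: string): number[] {
	const flagged: number[] = [];
	source.split("\n").forEach((line, index) => {
		const start = line.indexOf("//");
		if (start === -1) {
			return; // No prefix-style comment on this line.
		}
		const text = line.slice(start + 2).trim();
		if (text.length > 0 && !/[.!?:]$/.test(text)) {
			flagged.push(index + 1);
		}
	});
	return flagged;
}

// Made-up sample: the comment on line 2 has been truncated.
const commentSample = [
	"var total = 0; // Contains the running total.",
	"var count = 0; // Contains the number of ite",
	"// Print the results:",
].join("\n");
const suspiciousLines = unterminatedComments(commentSample);
```

Here only line 2 is flagged. Any deliberate exceptions to the rule would need separate handling, for example a list of allowed comments.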

 

This is a great time to bring up the question of where you should put your comments. Should they go at the end of the line of code they are describing? Above it? Below it? Elsewhere? For example, here is another way of writing this comment:

 

// $dx_mnd_frm stands for "Deluxe Mind Form" and contains the number of questions answered so far in this session.

var $dx_mnd_frm = 0;

 

And the reverse example (comments come after the code):

 

var $dx_mnd_frm = 0;

// $dx_mnd_frm stands for "Deluxe Mind Form" and contains the number of questions answered so far in this session.

 

When a comment is relatively short and, more importantly, describes a single line of code, it should appear on that same line of code. That way it won't be accidentally separated from the associated code if that code is ever moved.

 

Lest you think this is a trivial example and not worth your time, let me once again rail against programmers who think certain things are too trivial or obvious to comment on. The question isn't "how trivial or obvious is it?" but rather "in how many ways could it possibly be misinterpreted?"

 

So, how about an example where comments appear above the code? I'm glad you asked:

 

var $j = 0;
var $a = get_some_stuff();
for ($i=0; $i < $a.length(); $i++) {
      var $x = $a[$i].headline();
      if ($x == 'test') {
            $j++;
      }
}
print "There are $j headlines that contain 'test'\n";

 

This is a pretty trivial example. Even a newbie should take less than a minute to determine that this code counts the number of elements in $a whose headline is 'test' and prints out a sentence to that effect.

 

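Now, that's wonderful and great, but I have a few bones to pick: the variable names ($j, $a and $x) say nothing about what they hold, and there isn't a single comment explaining what the code is for.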

Let's update our code:

 

// Count number of articles that contain a headline of "test":
var $headline_count = 0; // Contains number of headlines.
var $articles = get_some_stuff();
for ($i=0; $i < $articles.length(); $i++) {
      var $headline = $articles[$i].headline();
      if ($headline == 'test') {
            $headline_count++;
      }
}

// Print number of headlines:
print "There are $headline_count headlines that contain 'test'\n";

 

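Note the following changes:

  1. The variables have descriptive names: $j became $headline_count, $a became $articles, and $x became $headline.
  2. A comment above the loop describes what the whole block does.
  3. A comment on the $headline_count line describes what that variable contains.
  4. A comment above the print statement describes what it prints.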
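For readers who want something they can actually run, here is one way the commented headline counter might look in TypeScript. This is a sketch under assumptions: the `Article` interface, the `countTestHeadlines` function and the sample data are invented stand-ins for get_some_stuff() and its results, which the example never defines.

```typescript
// Hypothetical stand-in for the articles in the example: each has a headline.
interface Article {
	headline: string;
}

// Count the number of articles whose headline is exactly "test":
function countTestHeadlines(articles: Article[]): number {
	let headlineCount = 0; // Contains the number of matching headlines so far.
	for (let i = 0; i < articles.length; i++) {
		const headline = articles[i].headline;
		if (headline === "test") {
			headlineCount++;
		}
	}
	return headlineCount;
}

// Made-up sample data: two of the three headlines are "test".
const sampleArticles: Article[] = [
	{ headline: "test" },
	{ headline: "news" },
	{ headline: "test" },
];
const matchingHeadlines = countTestHeadlines(sampleArticles);

// Print the number of matching headlines:
console.log(`There are ${matchingHeadlines} headlines that are 'test'`);
```

The names and comments carry the meaning here; the logic is the same counting loop as before.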

Compound statements

Why do programmers have an aversion to explaining their compound statements? It's so much easier to read an English description than to puzzle through nested parens sprinkled with Boolean conditions. For example:

 

if ((($action == $X_FILE_SAVE) && ('' == $FORM_CANVAS_DATA.$PARAM_FILENAME.value)) || ($action == $X_FILE_SAVE_AS)) {
      // do something
} else {
      // do something else
}

 

Admittedly it's not that difficult to figure out what conditions are necessary for the first or second part of this block of code to be executed. But remember, this is just a simple example. In any case, why make it harder than necessary for your readers? Use whitespace (and by that I mean tabs!) to clear the air:

 

// If first time saving the file, or if explicitly asked for the Save As window:
if (
      (
            ($action == $X_FILE_SAVE)
            &&
            ('' == $FORM_CANVAS_DATA.$PARAM_FILENAME.value)
      )
      ||
      ($action == $X_FILE_SAVE_AS)
) {
      // do something
} else {
      // do something else
}
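The same layout carries over to real languages. Below is a minimal TypeScript sketch of the spread-out condition; the constants `FILE_SAVE` and `FILE_SAVE_AS` and the function `needsSaveAsWindow` are hypothetical names standing in for $X_FILE_SAVE, $X_FILE_SAVE_AS and the form lookup in the example.

```typescript
// Hypothetical action codes, standing in for $X_FILE_SAVE and $X_FILE_SAVE_AS.
const FILE_SAVE = "save";
const FILE_SAVE_AS = "save_as";

// Decide whether the Save As window is needed for the given action.
function needsSaveAsWindow(action: string, filename: string): boolean {
	// If first time saving the file, or if explicitly asked for the Save As window:
	return (
		(
			(action === FILE_SAVE)
			&&
			("" === filename)
		)
		||
		(action === FILE_SAVE_AS)
	);
}
```

Each level of indentation corresponds to one level of parentheses, so the English comment and the structure of the condition can be checked against each other at a glance.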

Exceptions

Earlier I mentioned that there are exceptions to the "always end your sentences with a period (or other punctuation)" rule. Here they are:

 

When initializing a variable, it's smart to describe why you selected that particular value. For example:

 

var $got_test = false;

 

By now you should be able to tell me how to write meaningful comments for this code. It should look something like this:

 

var $got_test = false; // Whether we got the test article

 

Since this is just a made-up example that is lacking a larger context, the exact meaning of the comment is irrelevant for the moment. My point is that the comment accurately describes what the variable is storing and, because it's a Boolean, we can assume the only other valid value would be "true." Thus, at that point in the code we are assuming that we have not gotten the test article.

 

Every initialization of a value is an assumption that the given value is the default. If the subsequent code doesn't change the value, it will remain as assigned initially. For example:

 

// Determine whether we got the test article:
var $got_test = false; // assume
var $articles = get_some_stuff();
for ($i=0; $i < $articles.length(); $i++) {
      var $headline = $articles[$i].headline();
      if ($headline == 'test') {
            // We got the test article:
            $got_test = true;
            break;
      }
}

 

The first comment ("Determine whether we got the test article") describes the subsequent block.

 

The second comment ("assume") describes why we are assigning the given value. In this case we are assuming failure (false), then testing for success (true). If we never find success, the default value (false) is used.

 

The third comment ("We got the test article") describes the subsequent block of code. Note that this comment is on a line by itself because the subsequent block of code contains two statements:

 

  1. assign a new value to $got_test
  2. exit from the loop

 

It would be misleading to put the comment at the end of the assignment statement because the comment applies to the entire block, not just that line.
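Here is the same "assume failure, then test for success" pattern as a runnable TypeScript sketch. The `Story` interface and the `gotTestArticle` function are hypothetical names introduced for this illustration; the three comments follow the placement just described.

```typescript
// Hypothetical stand-in for an article: each has a headline.
interface Story {
	headline: string;
}

// Determine whether we got the test article:
function gotTestArticle(stories: Story[]): boolean {
	let gotTest = false; // assume
	for (let i = 0; i < stories.length; i++) {
		if (stories[i].headline === "test") {
			// We got the test article:
			gotTest = true;
			break;
		}
	}
	return gotTest;
}
```

If the loop never finds a match, including when the list is empty, the assumed default of false is what gets returned.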

Brackets/Braces/Parentheses

Another classic religious war is how to style your braces. The only two acceptable variations are:

 

loop(...) {
      do_something();
}

 

or:

 

loop(...)
{
      do_something();
}

 

There are compelling advantages and disadvantages to each so I can't say I have a strong opinion. However, here are some things you definitely should NOT do:

 

loop(...)
      do_something();

 

While this may be syntactically and logically correct, it completely screws you when you attempt to add another statement or block of code within the loop. For example, if you simply do this:

 

loop(...)
      do_something();
      do_something_else();

 

you will be screwing yourself because this is equivalent to:

 

loop(...) {
      do_something();
}

do_something_else();

 

Why? Because you were stupid enough to use indentation as your sole means of determining the scope in which a statement falls. In this case the compiler can't help you; you've effectively shot yourself in the foot.
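If you'd like to watch the foot-shooting happen, here is a small TypeScript sketch. The function `runWithoutBraces` and its counters are made up for the demonstration; the indentation is deliberately misleading.

```typescript
// Hypothetical demonstration: count how often each statement really runs
// when the loop body has no braces.
function runWithoutBraces(iterations: number): { something: number; somethingElse: number } {
	let something = 0;
	let somethingElse = 0;
	for (let i = 0; i < iterations; i++)
		something++;
		somethingElse++; // Indented like the loop body, but runs once, after the loop.
	return { something, somethingElse };
}

const braceCounts = runWithoutBraces(3);
```

After three iterations, `something` is 3 but `somethingElse` is only 1, because only the first statement belongs to the loop.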

 

So what's the right solution? Stick with the first or second version above.

Empty blocks

What's an empty block and when would you use it? First, let's step back and think about conditional statements. Most compound conditional statements look something like this:

 

if (condition1) {
      // We found several foobars in the database
      // do something...
} else if (condition2) {
      // The foobar was specified by the user
      // do something...
} else if (condition3) {
      // No foobars exist
      // do something...
}

 

This is great, but it leaves the reader hanging. What happens if none of those conditions is met? Shouldn't there be a final catch-all "else" statement? Obviously the code falls through, but the reader is left clueless about the conditions under which that would happen, what it means in a larger context, and whether the programmer writing the code even considered the catch-all case. Here is my solution:

 

if (condition1) {
      // We found several foobars in the database
      // do something...
} else if (condition2) {
      // The foobar was specified by the user
      // do something...
} else if (condition3) {
      // No foobars exist
      // do something...
} else {
      // Do nothing; we will email the user later.
}

 

Here it's perfectly clear that the programmer didn't forget the catch-all "else" statement at the end of the compound conditional statement; in fact, the programmer has made it clear that nothing should happen at this point, and that something will happen later instead.

 

As a bonus, if additional code ever does have to be inserted into that final else block, it can be dropped in without changing the surrounding code. Having to add the else block itself at that point would introduce another opportunity for a typo.
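A runnable TypeScript sketch of the commented empty else block follows. Everything concrete in it is an assumption: the function `handleFoobars`, its parameters, and the action strings are invented, since the conditions and actions above are only placeholders.

```typescript
// Hypothetical conditions and actions; the example above leaves them unspecified.
function handleFoobars(foundCount: number, userSpecified: boolean): string[] {
	const actions: string[] = [];
	if (foundCount > 1) {
		// We found several foobars in the database:
		actions.push("list foobars");
	} else if (userSpecified) {
		// The foobar was specified by the user:
		actions.push("use specified foobar");
	} else if (foundCount === 0) {
		// No foobars exist:
		actions.push("create foobar");
	} else {
		// Do nothing; we will email the user later.
	}
	return actions;
}
```

Calling `handleFoobars(1, false)` reaches the final else and returns an empty list, and the comment tells the reader that this outcome is deliberate.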

Implied Precedence

Hot-shot programmers often write things like this:

 

$x = 5 * $j + $y / 6;

 

Only the most novice programmer would get tripped up by rules of operator precedence, so on the surface it seems largely unnecessary to add parentheses. But this code leaves open a gaping question: did the programmer intend to let this statement be evaluated by natural operator precedence rules? The answer is important, because if the "+" changes to "*" (e.g. because the formula needs to be corrected) then the grouping of the entire formula changes: without parentheses, 5 * $j * $y / 6 is simply evaluated from left to right.

 

I would write this code as follows:

 

$x = (5 * $j) + ($y / 6);

 

This makes clear the order in which the programmer intended this statement to be evaluated. There's still one thing missing from this code fragment. That's right: comments! Again, this is left as an exercise for the reader.
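As one possible take on that exercise, here is a TypeScript sketch with explicit parentheses and a comment. The function name `combine` is hypothetical, and since the formula has no stated meaning, the comment can only describe its structure.

```typescript
// Hypothetical formula for illustration; no real meaning is given for j and y.
function combine(j: number, y: number): number {
	// The two terms are computed independently and then added:
	// five times j, plus one sixth of y.
	return (5 * j) + (y / 6);
}

const explicitResult = combine(2, 12);
const implicitResult = 5 * 2 + 12 / 6; // Same value, but the intent is left to precedence rules.
```

Both expressions produce the same number; the difference is that the first one states the intended grouping instead of leaving the reader to infer it.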

Summary

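Let's recap the general rules:

  1. Indent with tabs, not spaces.
  2. Comment your code: explain what each variable stands for and what it is used for.
  3. Write comments as complete sentences ending with a period (or other punctuation).
  4. Put a short comment that describes a single line on that same line; put a comment that describes a block on its own line above the block.
  5. Explain compound statements in plain English, and use whitespace to lay out their structure.
  6. Always wrap the body of a loop or conditional in braces.
  7. Include the final catch-all "else" block, even when it is empty, and say why it is empty.
  8. Use parentheses to make the intended order of evaluation explicit.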

Conclusion

So what's the upshot? Write your code so it can be easily read and easily understood, not just by you but also by the poor schmuck who has to figure out your formatting style. Esoteric eccentricities are usually bad; conventional practices are good.

 

Much of your commenting and code style should be geared towards hand-holding the next person who will be reading your code. No, you don't have to explain every last detail to them; that would be ridiculous. Rather, you want to insert the equivalent of post-it notes around your apartment for when your friend comes to visit for a few days and you're not there to show them what drawer the silverware is in, what pantry shelf the mayo is on, and where to put away the clean dishes. Rather than force them to figure it out, why not help them? After all, it's the hospitable thing to do.

 



Copyright © 2024 by Kim Moser (email)
Last modified: Wed 09 January 2008 22:29:54