Language requires consensus.[1]
Code is the primary means of communication in an XP team. A uniform
coding style greatly facilitates code comprehension, refactoring, pair
programming, collective ownership and testing. An XP team agrees on a
coding style before development starts.
Coding style is the first problem an XP team has to work out as a
group. The solution to the problem needs to be clear and unambiguous.
It's amazing how hard this can be, and with some teams it's
impossible.
Coding style discussions are like a lightning rod. If there's a storm
brewing within the team, coding style will usually attract the first
lightning strike. If you can't reach agreement on a style, your team
is going to have difficulty building an application together.
Tension around style choices is natural in some ways. We are all
individuals. Programmers
take pride in their own work, which motivates their success,
just as individual athletes value their own accomplishments.
However, not even the best
pitcher, quarterback, or forward in the world can win a game alone.
It takes a team and teamwork to win a game or write a large
application. Programming is a team sport.[2]
If you are a programmer, you may find yourself gritting your teeth at
my coding style. It wouldn't surprise me. Athletes and programmers
on different teams sometimes bristle at each other's style. However,
if we were to join the same team, we'd work out a compromise on coding
style to ensure the success of the project.
This chapter explains the need for a coding style in XP and discusses
how to go about creating one. I also explain and demonstrate the
coding style used in this book through a comparative example.
There's More Than One Way To Do It
Perl is a rich and complex language. If you ask a question about how
to do something in Perl on the Internet, you'll probably get several
different answers. And, the answers will often include the caveat:
TMTOWTDI. This is the acronym for Perl's motto: There's more than one
way to do it. The solution you choose will depend on the way you
program Perl.
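For example, even summing a list of numbers admits several idiomatic
spellings. The following is only a sketch of the variety, not a
recommendation:

use List::Util ();

my @values = (1, 2, 3);

# A postfix foreach accumulator...
my $sum = 0;
$sum += $_ foreach @values;

# ...a C-style loop...
my $sum_too = 0;
for (my $i = 0; $i < @values; $i++) {
    $sum_too += $values[$i];
}

# ...or a library call.
my $sum_three = List::Util::sum(0, @values);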
So how do you program Perl? Larry Wall et al present a
coding style in the perlstyle man page.[3]
Yet there are myriad divergent styles on CPAN and in the Perl
literature. In the Perl community, diversity is seen as a strength,
and no one is going to tell you how to program Perl. Well, even if
they did, you wouldn't listen to them.
Give Me Consistency or Give Me Death
Your team still needs to pick a style. This isn't just XP dogma; it's human
nature. In the anthropology classic, The Silent
Language, Edward Hall wrote, "The drive toward
congruity would seem to be as strong a human need as the will to
physical survival." I conclude from this that if you don't pick
a Perl coding style, you'll die. If that isn't a good enough reason,
stop reading now.
Seriously, consistency is not an end in itself; it is the means to
facilitate testing, collective ownership, pair programming, and
refactoring. If you are developing a small application (a few
thousand lines of Perl), it's easy to keep the code consistent, or to
clean it up in an afternoon or two.
For large applications (tens or hundreds of thousands of lines spread
over hundreds or thousands of files), quick fixes are impossible. You
would never have the time to reformat the entire codebase.
The code changes too quickly. You don't get to ask everybody working
on a large application to stop while you fix some style issue.
However, for some necessary refactorings, such as, a change in a
widely used API, you may have no choice but to dive into tens or
possibly hundreds of files. You'll want this to happen as quickly as
possible so you'll automate the refactoring. With a consistent style,
you can probably do this fairly easily. If you have to account for
the many ways you can do things in Perl, you'll probably resort to
hand editing each file. Not only is this labor intensive, but it's
error prone, too.
Even when you aren't making global changes, you and your partner still
have to read unfamiliar code. A programming pair's ability to read
and to communicate through the code is affected directly by its
consistency. Communication is hard enough without you and your
partner having to wade through several code dialects. And, allowing
for style variations when writing code opens up too many unnecessary
thoughts. Do I adapt to my partner's style? Should we adopt the
style of the code we're editing now? Or perhaps, I should insist on
my style. After all, it has worked well for me over the years.
Starting out with an agreed-upon style frees our minds of such
distractions and allows us to focus on the important bit: solving
the customer's problem.
Team Colors
In Extreme Programming Explained, Kent Beck
wrote, "The standard must be adopted voluntarily by
the whole team." This may not be so simple. Establishing consensus
requires work on everybody's part. If your team has coded together
before, you'll probably have an easy time agreeing on a style.
For newly formed teams, use the style guide as a team building
exercise. Everyone should be encouraged to contribute. If a
particular point is too contentious, drop it until after the first
iteration or so. The goal is to get full consensus on the entire
guide. If someone is particularly inflexible during the discussions,
it's a warning sign that a team adjustment may be necessary.
Better sooner than later.
A style guide can be highly motivating, however. It's like your team's colors.
It's something relatively insignificant which provides significant
cohesion. If even one team member is coerced into agreement, the team
isn't sticking together, and the rift may grow into a chasm. When
everybody voluntarily accepts the style choices, you are functioning
as a team, and you are ready to code.
An Example
Rather than discuss style in the abstract, I'd like to demonstrate my
opinion of good style through a comparative example. Here's an
excerpt from the popular Test package on CPAN:[4]
package Test;
use strict;

sub plan {
    croak "Test::plan(%args): odd number of arguments" if @_ & 1;
    croak "Test::plan(): should not be called more than once" if $planned;
    local($\, $,);   # guard against -l and other things that screw with
                     # print
    _reset_globals();
    _read_program( (caller)[1] );
    my $max=0;
    for (my $x=0; $x < @_; $x+=2) {
        my ($k,$v) = @_[$x,$x+1];
        if ($k =~ /^test(s)?$/) { $max = $v; }
        elsif ($k eq 'todo' or
               $k eq 'failok') { for (@$v) { $todo{$_}=1; }; }
        elsif ($k eq 'onfail') {
            ref $v eq 'CODE' or croak "Test::plan(onfail => $v): must be CODE";
            $ONFAIL = $v;
        }
        else { carp "Test::plan(): skipping unrecognized directive '$k'" }
    }
    my @todo = sort { $a <=> $b } keys %todo;
    if (@todo) {
        print $TESTOUT "1..$max todo ".join(' ', @todo).";\n";
    } else {
        print $TESTOUT "1..$max\n";
    }
    ++$planned;
    print $TESTOUT "# Running under perl version $] for $^O",
        (chr(65) eq 'A') ? "\n" : " in a non-ASCII world\n";
    print $TESTOUT "# Win32::BuildNumber ", &Win32::BuildNumber(), "\n"
        if defined(&Win32::BuildNumber) and defined &Win32::BuildNumber();
    print $TESTOUT "# MacPerl verison $MacPerl::Version\n"
        if defined $MacPerl::Version;
    printf $TESTOUT
        "# Current time local: %s\n# Current time GMT: %s\n",
        scalar( gmtime($^T)), scalar(localtime($^T));
    print $TESTOUT "# Using Test.pm version $VERSION\n";
    # Retval never used:
    return undef;
}
The routine plan is used to set up a unit
test, for example:
use Test;
use strict;
BEGIN {
    plan(tests => 2);
}
ok(1 + 1 == 2);
ok(2 * 2 == 4);
This unit test calls plan to declare that there
are two test cases. The ok function checks the
result after each case executes, and prints success or failure.
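When run, a test like this prints the plan line from
plan, several # header comments (the Perl version,
timestamps, and the Test.pm version), and one result line per
ok call. Trimmed of the comments, the output looks
roughly like this:

1..2
ok 1
ok 2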
Here's what I think is good about this implementation of
plan. The routine:
is well-used and mature.
We can be relatively sure it addresses the needs of its users and is
relatively stable.
has had several authors. The more eyes on a problem, the better the
solution.
addresses type safety. Test has a broad user
base, so it makes sense to put extra effort into argument validation.
fails fast, that is, if
plan encounters an unexpected argument or state,
it terminates execution (calls croak) in all but
one case. The sooner a programming error is detected, the less damage
the errant program can do.
is backwards compatible. The parameters test and
failok are deprecated, that is, they shouldn't
be used in new tests, but existing tests that use them still work.
comes with a thorough unit test suite that describes expected
behavior and enables refactoring.
uses fine-granularity, feature-driven portability. It uses narrow feature
checks
(for example, defined $MacPerl::Version
and chr(65) eq 'A') instead of broad general checks, such
as, checking the operating system (for example,
$^O eq 'MacOS'). Powerful and easy to
use introspection is one of the reasons Perl and CPAN packages are
usable on so many platforms.
You Say, "if else", And I Say, "? :"
While indentation, lining up braces, and other formatting choices are
important for easing automated refactoring, you won't find much
discussion of them in this book. However, the more strictly you follow your coding
style, the more easily you can automate refactorings. For example, if
function arguments are always surrounded by parentheses,
you can rename functions or reorder parameters using a simple editor
macro.[5]
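As a rough sketch of such a mechanical change (the names
old_name and new_name are hypothetical, and the
one-liner assumes the style guarantees that every call uses
parentheses):

# Rename every old_name(...) call to new_name(...), leaving .bak backups.
find lib -name '*.pm' | xargs perl -pi.bak -e 's/\bold_name\(/new_name(/g'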
And speaking of editors, most programmers' editors have style
formatters. If yours doesn't, you can always use
perltidy, a very flexible Perl code
reformatter.[6]
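Much of a team's formatting agreement can be captured in a shared
perltidy configuration. Here is a minimal sketch; the particular
option values are only an illustration:

# .perltidyrc checked in with the project
-i=4     # indent four columns per level
-l=78    # maximum line length
-ce      # cuddled else: "} else {"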
Automatic formatters improve your team's efficiency and adherence to
your style guidelines. That's all I'm going to say about formatters
and editors. The only thing worse than coding style discussions is
editor wars.
Like editors, style choice is defined by your experience and personal
taste, and the details matter. I follow the
perlstyle man page for the most part, but I
disagree with some of their parentheses and alignment choices. You
may not like my style, so go ahead and indent by three spaces if you
like.[7]
Once And Only Once
Parentheses and indentation aside, the important bit of my style is
that I refactor ruthlessly. I don't like redundancy, especially in
the form of single use temporary variables and repetitive calls. In
XP, we call this the once and only once (OAOO) rule, and it's what you
do when you refactor.
For new code, I try to do the simplest thing that could possibly work
(DTSTTCPW). This is XP's most important coding guideline. First I
get it working simply. I might have to copy and paste some code or
create a temporary variable. Once it passes the tests, I look at the
design and simplify it so that each concept is expressed once and only
once. There are some time and planning trade offs here, and the
Refactoring chapter discusses them. The relevant point is
that once and only once is an overarching style guideline, and one that
I value highly. When concepts are expressed once and only once, the
code is more robust, more easily extensible, and performs better than
code with needless duplication.
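As a small illustration (hypothetical code, not from
Test), the first version spells out the member
discount twice; the second gives the concept a single, named home:

# Before: the ten-percent member discount appears twice.
my $price = $item_price;
$price -= $price * 0.1 if $is_member;
my $shipping = $shipping_cost;
$shipping -= $shipping * 0.1 if $is_member;

# After: the discount rule is expressed once and only once.
sub _member_price {
    my($is_member, $amount) = @_;
    return $is_member ? $amount * 0.9 : $amount;
}
my $price = _member_price($is_member, $item_price);
my $shipping = _member_price($is_member, $shipping_cost);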
Refactored Example
The code that follows has been changed to demonstrate once and only
once and other style choices I value. The formatting matches the style
used in this book. More importantly, the rewritten code is more
cohesive. Not only should each concept be expressed only once, but
each routine should implement only one concept. Strong
cohesion allows people to comprehend and to abstract code
entities (routines and packages) easily. By isolating and naming each
behavior, we make it easy to understand what each piece of the
software puzzle does.
The four new routines are also loosely coupled.
This means their inputs and outputs are few and well-defined. Loose
coupling is important when isolating behavior, because it is difficult
to understand and to test a routine with many inputs and outputs. In
effect, the routine's identity is a combination of its name and its
inputs and outputs, which is commonly known as its
signature. We remember shorter signatures and
the behavior they identify more easily than longer ones.
That's enough theory for now; here's my version of
plan:
package Test;
use strict;

sub plan {
    my($args) = {@_};
    Carp::croak('should not be called more than once')
        if $_TODO;
    _reset_globals();
    _read_program((caller)[1]);
    _plan_print(_plan_args($args));
    return;
}

sub _plan_args {
    my($args) = @_;
    $_ONFAIL = _plan_arg_assert($args, ['onfail'], 'CODE');
    my($max) = _plan_arg_assert($args, ['tests', 'test'], 'integer') || 0;
    # $_TODO is the initialization sentinel, so it's the last value set
    $_TODO = {map {$_ => 1}
        @{_plan_arg_assert($args, ['todo', 'failok'], 'ARRAY') || []}};
    Carp::carp("@{[sort(keys(%$args))]}: skipping unrecognized or",
        ' deprecated directive(s)')
        if %$args;
    return $max;
}

sub _plan_arg_assert {
    my($args, $names, $type) = @_;
    foreach my $n (@$names) {
        next unless exists($args->{$n});
        Carp::croak("$n: parameter must not be undef")
            unless defined($args->{$n});
        Carp::croak("$args->{$n}: $n must be $type")
            unless $type eq 'integer' ? $args->{$n} =~ /^\d+$/
                : ref($args->{$n}) eq $type;
        return delete($args->{$n})
    }
    return undef;
}

sub _plan_print {
    my($max) = @_;
    _print(join("\n# ",
        "1..$max"
        . (%$_TODO ne '' && " todo @{[sort {$a <=> $b} keys(%$_TODO)]};"),
        "Running under perl version $] for $^O"
        . (chr(65) ne 'A' && ' in a non-ASCII world'),
        defined(&Win32::BuildNumber) && defined(Win32::BuildNumber())
            ? 'Win32::BuildNumber ' . Win32::BuildNumber() : (),
        defined($MacPerl::Version)
            ? "MacPerl version $MacPerl::Version" : (),
        'Current time local: ' . localtime($^T),
        'Current time GMT: ' . gmtime($^T),
        "Using Test.pm version $VERSION\n"));
    return;
}

sub _print {
    local($\, $,);
    return print($TESTOUT @_);
}
Change Log
The following is a detailed list of changes, and why I made them.
Most of the changes are refactorings, that is, they do not modify
the way plan works from the caller's
perspective. A few changes improve the behavior ever so slightly, and
are noted below. This list is ordered from most important to trivial:
The four main behaviors (control flow, validating arguments, type
checking, and printing) are contained in separate routines. Each
routine is responsible for one and only one behavior.
The localtime and gmtime calls
are now in the correct order. This defect in the original version
only became apparent to me when I separated the two output lines.
Argument type validation is consistent, because it has been isolated
into a single routine (_plan_arg_assert) that is used
for all three parameters. Several new cases are caught. For example,
passing undef to tests or
passing both tests and test
(deprecated form) is not allowed.
The Carp::carp unrecognized directive warning is printed once
instead of a warning per unrecognized directive.
The check for unrecognized directives still does not fail
fast (croak or die). I
would have liked to correct this, because passing an invalid
directive to plan probably indicates a broken
test. However,
the broad user base of Test makes this change
infeasible. Somebody may be depending on the behavior that this is
only a warning.
Two temporary variables (@todo and
$x) were eliminated by using a functional
programming style. By avoiding temporary variables, we simplify
algorithms and eliminate ordering dependencies. See
the It's a SMOP chapter for a longer
example of functional programming.
$planned was eliminated after
$_TODO was converted to a reference.
$planned is known as a
denormalization, because it can be computed
from another value ($_TODO in this case).
Normal form is when data structures and
databases store the sources of all information once and only once.
_plan_print writes a single string. The seven
calls to print were unnecessary duplication.
I often use logical
operators instead of imperative statements to avoid the use of
temporary variables, which are another form of duplication
(denormalization).
The return value from plan is better represented
as an empty return, because it handles list and
scalar return contexts correctly. This is a subtle point about
return, and it actually involves an interface
change. The following use assigns an empty list:
my(@result) = Test::plan(tests => 1);
In the old version, @result would contain the
list (undef), that is, a list with a single element
containing the value undef.
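To see the difference in both contexts, here is a small sketch with
throwaway subroutines:

sub returns_undef { return undef }
sub returns_empty { return }

my(@a) = returns_undef();    # @a is (undef), a one-element list
my(@b) = returns_empty();    # @b is the empty list
my $x = returns_undef();     # $x is undef
my $y = returns_empty();     # $y is undef, too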
The check for an odd number of arguments is unnecessary, because the
assignment to a hash will yield a warning and the argument parsing
is more rigorous (no argument may be undef, for example).
_print encapsulates the output function that is
used throughout Test. The concept that the output
is directed to $TESTOUT is only expressed
once.
The global variables are named consistently
($_ONFAIL and $_TODO). I name
global variables in uppercase. I use a leading underscore to
identify variables and routines which are to be used internally to the
package only.
$TESTOUT was not renamed, because it is exported
from the package Test.
In general, variables should never be exported, but this would be an
interface change, not a refactoring.
I fully qualify all names defined outside a package
(Carp::carp and
Carp::croak). This helps the reader to know what
is defined locally as well as enabling him to find the
implementation of or documentation for external functions quickly.
I apply this guideline to Perl modules. In specialized Perl scripts,
such as, templates and tests, I prefer the brevity of the unqualified
form. For example, in the unit test example above, I used
ok, not Test::ok.
carp and croak print the
file and line number for you, so including
Test::plan
in the error string is redundant.
The spelling error (verison) in the
$MacPerl::Version output string was corrected.
The printf and the two calls to
scalar are unnecessary. The concatenation
operator (dot) is sufficient, more succinct, and used consistently.
The old style call syntax
(&Win32::BuildNumber())
was eliminated, because it was not used in all places
(_reset_globals()).
The comment # Retval never used: was removed,
because it is superfluous, and it indicates an unprovable assertion.
You can't know that the return value won't be used.
The comment # guard against -l and... was removed,
because the context of _print is enough to
explain why the local call is needed.[8]
Even if
you don't know what $, and $\
are, you know they are relevant only to the call to
print, since that's the only thing they could
possibly affect.
Refactoring
Now kids, don't try this at work. Refactorings and small corrections
are not an end in themselves. They do not add business value, unless
you are writing your coding style guideline, as is the case here.
Refactorings need to be related to the task at hand. For
example, if I had been given a story to fix the minor defects in
plan or to add automatic test case counting, then
I would have refactored the code to allow for those changes, and
possibly a bit more. However, random and extensive refactoring
as I've done here is worthless. The original version works just fine,
and all the corrections are minor. If you spend your days refactoring
and making small corrections without explicit customer approval,
you'll probably lose your job.
The new plan is also not just a refactoring.
When an interface changes, it's only a refactoring if all its uses are
changed simultaneously. For public APIs like this one, that's
impossible to do. In this particular case, I took a chance that the
return value of plan was not used in this rather
obscure way, that is, expecting a single list containing
undef.
Input Validation
Perl is a dynamically typed language. The routine
plan contains a set of type
assertions, and the refactored version expanded on them.
Is this the best way to write dynamically typed code?
It depends. In this case, explicit type checking is possibly
overkill. For example, $_TODO and
$_ONFAIL are dereferenced elsewhere in the
package. Dereferencing a non-reference terminates execution in Perl,
so the error will be caught anyway. Since Test
is only used in test programs, it's probably sufficient to catch an error at
any point.
On the other hand, Test is a very public API,
which means it has a broad and unknown user base. Explicit type
checking almost always yields more easily understood error messages
than implicit error checks. This helps users debug incorrect
parameters. plan is only called once during a
test execution so the performance impact of the additional checking is
insignificant.
Here are some guidelines we use to determine when to add type
assertions:
Always validate data from untrusted sources, for example, users or
third party services. It's important to give informative error
messages to end users. This type of validation occurs at the
outermost level of the system, where meaningful error messages can be
returned with the appropriate context.
Add type assertions to low-level modules that define the data types,
and leave them out at the middle levels where they would be redundant.
There may be a performance trade off here. In general, the more
public the API, the more important validation is. For example,
plan defines and asserts that the test count is
a positive integer. (A similar assertion is sketched in the example
after this list.)
Assert what is likely to be wrong.
Write deviance tests, that is, tests which result in exceptions or
type validation errors. Add assertions if the tests don't pass. The
appropriateness of a particular type assertion is often hard to
assess. Don't sweat it. You'll learn what's appropriate as your system
evolves.
Don't expect to get it right, and think about the consequences if you
get it wrong. The more that's at stake, the more important assertions
are.[9]
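Here is a minimal sketch of the second guideline along with a deviance
test to exercise it. The set_retries routine is hypothetical; only the
pattern matters:

use strict;
use Carp ();
use Test;
BEGIN { plan(tests => 2) }

# Hypothetical public routine that defines what "retries" means,
# so it owns the type assertion.
sub set_retries {
    my($count) = @_;
    Carp::croak('retries: parameter must not be undef')
        unless defined($count);
    Carp::croak("$count: retries must be a non-negative integer")
        unless $count =~ /^\d+$/;
    return $count;
}

# A conformant case and a deviance test that expects the assertion to fire.
ok(set_retries(3) == 3);
ok(!eval { set_retries('many'); 1 } && $@ =~ /non-negative integer/);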
Writing robust code is hard. If you add too many assertions, their
sheer volume will introduce more defects than they were intended to
prevent. Add too few assertions, and one day you'll find a cracker
who has compromised your system, or worse. Expect the code to evolve
as it gets used.
You'd Rather Die
Nothing is more boring than reading someone's opinion about coding
style. Rather than kill off my readership, I'll stop here. When you
get up to stretch your legs, I'd like you to walk away with five
points:
An XP team needs a consistent coding style.
It doesn't matter what the style is, as long as everyone agrees to
adhere to it.
Take refactoring into consideration when determining your coding
style.
Do the simplest thing that could possibly work when writing new code.
Simplify your design so that concepts are expressed once and only once.
Footnotes
[1] Open Sources: Voices from the Open Source
Revolution, DiBona et al., 1999, O'Reilly, p. 127.
Available online at
http://www.oreilly.com/catalog/opensources/book/larry.html
[2] And more
generally, "Business is a team sport." Rich
Kid, Smart Kid, Kiyosaki et al., Warner Books, 2001,
pp. 224-225.
[3] http://www.perl.com/doc/manual/html/pod/perlstyle.html
[4] http://search.cpan.org/src/SBURKE/Test-1.22/lib/Test.pm
[5] You can download some refactoring functions for Emacs from
http://www.bivio.biz/f/bOP/b-perl.el
[6] Available for free from
http://perltidy.sourceforge.net
[7] Even though it's a sin.
[8] In XP, "we comment methods only after doing everything possible to make
the method not need a comment." See
http://xp.c2.com/ExtremeDocuments.html
for a document about documentation by XP's founders.
[9] Thanks to Ged Haywood for reminding me of this one.