Good coding practices

Prepared by:

Darek Łuczyński

Introduction to Clean Code

Introduction

Author: Based on the book by Robert C. Martin (Uncle Bob)

Title: Clean Code - A handbook of Agile Software Craftmanship

book

Definition of programming and code

Specifying requirements in such detail that a machine can execute them is programming. Such a specification is code.

Ref.: Clean Code, R.C. Martin

Dirty code characteristics

  • dirty code:

(computing, derogatory) Software code that has had many editors with conflicting styles, making it nearly impossible to maintain. That software has dirty code and we should not use it.

Ref.: https://www.yourdictionary.com/dirty-code

When to write dirty code

There are a few occasions where its okay to write dirty code:

  1. When you’re stuck
  2. When you want to write good code
  3. When you want to make things quickly

Ref.: https://zellwk.com/blog/its-okay-to-write-dirty-code/

When to write dirty code

[…]programming is a craft more than it is a science. To write clean code, you must first write dirty code and then clean it. […] Most freshman programmers don’t follow this advice. They believe that the primary goal is to get the program working. Once it’s “working,” they move on to the next task, leaving the “working” program in whatever state they finally got it to “work.” Most seasoned programmers know that this is professional suicide.

Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship

Why keep the code clean

Code is read much more often than it is written. Code should always be written in a way that promotes readability.”

Guido van Rossum, founder of Python, Python Enhancement Proposal (PEP 8)

# see ZEN of Python in python console
import this

“Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. …[Therefore,] making it easy to read makes it easier to write.”

Robert C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship

Why keep the code clean

pie_chart

Ref.: https://dmitripavlutin.com/coding-like-shakespeare-practical-function-naming-conventions/

Why keep the code clean

The Cost of Owning a Mess is that:

any change is not trivial.

We’ve all said we’d go back and clean it up later. Those days we didn’t know LeBlanc’s law:

Later equals never.

Ref.: https://www.quora.com/What-resources-could-I-read-about-Leblancs-law

Clean code characteristics

- elegant,

- efficient,

- complete error handling,

- does one thing well,

- simple and direct,

- well-written prose,

- written by someone who cares,

- reduced duplications (copy & paste),

- high expressiveness (including meaningful names),

- early building of simple abstractions, abstract method or class,

- test driven development (unit tests related),

Code quality measurement

It’s been measured by WTF quantity per minute:

code_qc

The Boy Scout Rule

The Boy Scouts of America have a simple rule that we can apply to our profession:

Leave the campground cleaner than you found it.

The cleanup doesn’t have to be something big. Change one variable name for the better, break up one function that’s a little too large, eliminate one small bit of duplication.

Names

Naming Conventions

Raw camelCase PascalCase kebab-case snake_case
fruits in basket fruitsInBasket FruitsInBasket fruits-in-basket fruits_in_basket
has error hasError HasError has-error has_error
is visible isVisible IsVisible is-visible is_visible

N6: Avoid Encodings

Names should not be encoded with type or scope information. Prefixes such as ‘m_’, ‘c’, ‘s’ or ‘i’ are useless in today’s environments.

  • Hungarian Notation (HN)

Majority of languages f.ex. Java, C# including CAPL are strongly typed, and editing environments can detect a type error long before you can run a compile. So nowadays HN and other forms of type encoding are simply handicaps.

Encodings are also a possibility that the encoding system will mislead the reader.

PhoneNumber phoneString;
// name not changed when type changed!

G25: Replace Magic Numbers with Named Constants

This is probably one of the oldest rules in software development. In general it is a bad idea to have raw numbers in your code. Shall be hidden behind well-named constants.

Some constants are so easy to recognize that they don’t always need a named constant to hide behind so long as they are used in conjunction with very self-explanatory code.

double milesWalked = feetWalked/5280.0; //well known, easy to recognize... maybe for an american
int dailyPay = hourlyRate * 8;
double circumference = radius * Math.PI * 2;

G25: Replace Magic Numbers with Named Constants

The term “Magic Number” does not apply only to numbers. It applies to any token that has a value that is not self-describing.

assertEquals(7777, Employee.find(“John Doe”).employeeNumber());

There are two magic numbers in this assertion. The first is obviously “7777”. The second magic number is “John Doe”.

It also turns out that “John Doe” represents hourly paid, test employee in that test database. So this test should really read:

assertEquals(HOURLY_EMPLOYEE_ID, Employee.find(HOURLY_EMPLOYEE_NAME).employeeNumber());

N1: Choose Descriptive Names

  • Class Names

Classes, structures and objects should have noun or noun phrase names like: Customer, WikiPage, Account, and AddressParser. A class name should not be a verb.

  • Method Names

Methods should have verb or verb phrase names like: postPayment, deletePage, or save. Accessors, mutators, predicates should be prefixed with ‘get, set, is‘.

Choosing good names for one argument function can go together with the arguments. Function and argument should form a very nice verb/noun pair.

//is very evocative. Whatever this “name” thing is, it is being “written.”
write(name) 
//even better name might be below, which tells us that the “name” thing is a “field.”
writeField(name)

N1: Choose Descriptive Names

Names in software are 90% of what make software readable. You need to take the time to choose them wisely and keep them relevant.

public int x() {
    int q = 0;
    int z = 0;
    for (int kk = 0; kk < 10; kk++) {
        if (l[z] == 10)
        {
            q += 10 + (1[z + 1] + 1[z + 2]);
            z += 1;
        }
        else if (l[z] + l[z + 1] == 10)
        {
            q += 10 + 1 [z + 2];
            z += 2;
        } else {
            q += l[z] + l[z + 1];
            z += 2;
        }
    }
    return q
}

N1: Choose Descriptive Names

  • Use Intention-Revealing Names

The name of a variable, function, or class, should tell you why it exists, what it does, and how it is used. So take care with your names.

public int score() {
    int score = 0;
    int frame = 0;
    for (int frameNumber = 0; frameNumber < 10; frameNumber++) {
        if (isStrike(frame)) {
            score += 10 + nextTwoBallsForStrikeiframe);
            frame += 1;
        } else if (isSpareiframe)) {
            score += 10 + nextBallForSpareiframe);
            frame += 2;
        } else {
            score += twoBallsInFrame(frame);
            frame += 2;
        }
    }    
    return score;
}

N3: Use Standard Nomenclature Where Possible

Names are easier to understand if they are based on existing convention or usage. In Java, for example, functions that convert objects to string representations are often named:

toString();

Teams will often invent their own standard system of names for a particular project. Your code should use the terms from this language extensively.

More you can use names that are overloaded with special meanings that are relevant to your project, the easier it will be for readers to know what your code is talking about.

N4: Unambiguous Names

Choose names that make the workings of a function or variable unambiguous. Names may seem long, however it’s explanatory value outweighs the length cost.

  • Avoid Disinformation

    • Do not refer to specific phrases, unless you know it’s truly represents it.
      // The word list means something specific to programmers.
      // account list is actually a dictionary, but by the name indicates on the list
      var accountList = new Dictionary<string, int>();
      //  account list is truely a list
      var accountList = new List<int>();
    • Beware of using names which vary in small ways.
      // It is not ewasy to distinguish both variables
      var yzControllerForEfficientHandlingOfStrings = new YZControllerForEfficientHandlingOfStrings();
      var xyzControllerForEfficientStorageOfStrings = new XYZControllerForEfficientStorageOfStrings();

N4: Unambiguous Names

Example of disinformative names would be the use of lower-case l or uppercase O as variable names, especially in combination. The problem is that they look almost entirely like the constants one and zero.

int a = l;
if ( O == l )
    a = O1;
else
    l = 01; 

N4: Unambiguous Names

  • Use Pronounceable Names. This matters because programming is a social activity. Hungarian notation and prefixes make names unpronouceable.
// bad naming
int XPstn = 1;
// better naming
int xPosition = 1; 
  • Avoid Mental Mapping. This is a problem with single-letter variable names.
// good loop counter
customers[i];
customers[j];
customers[k];
// bad loop counter
// Seems like number '1'
customers[l];

N4: Unambiguous Name

  • Don’t Be Cute

Choose clarity over entertainment value.

// bad naming
void hollyHandGranade();
// good naming
void deleteAllItems();
  • Pick One Word per Concept Pick one word for one abstract concept and stick with it. For instance, it’s confusing to have fetch, retrieve, and get as equivalent methods of different classes.

N4: Unambiguous Name

If names must be different, then they should also mean something different.

  • Make Meaningful Distinctions

    • Number-series naming (a1, a2, .. aN)
      It is the opposite of intentional naming. Such names are noninformative.
    • Noise words like Info, Data, the, a, an, variable Noise words are redundant. The word variable should never appear in a variable name.

N5: Use Long Names

  • Use Long Names for Long Scopes

The length of a name should be related to the length of the scope. You can use very short variable names for tiny scopes, but for big scopes you should use longer names.

def make_md_from_from_sfa_tokens(tokens) -> str:
md = '' # short scope
for token in tokens:
    local_md = make_md_description_from_sfa_token(token)
    md += local_md + '\r\n'
return md
  • Use Searchable Names

Single-letter names and numeric constants have a particular problem in that they are not easy to locate across a body of text.

Code excercises

Code excercises

Functions

G30: Functions Should Do One Thing

Single responsibility rule

  • FUNCTIONS SHOULD DO ONE THING
  • THEY SHOULD DO IT WELL
  • THEY SHOULD DO IT ONLY
  • Sections within Functions
// Bad example, fuction pay() does 3 operations
public void pay() {
    for (Employee e : employees) { // review data of each emploee
        if (e.isPayday()) { // checks if emploee is allowd to pay
            Money pay = e.calculatePay(); // then finally calculates the paymen and pays
            e.deliverPay(pay);
        }
    }
}

G30: Functions Should Do One Thing

Functions that do one thing cannot be reasonably divided into sections.

// Better example
public void pay() {
    for (Employee e : employees){
        paylfNecessary(e);
    }
}
private void payIfNecessary(Employee e) {
    if (e.isPaydayi)){
        calculateAndDeliverPay(e);
    }   
}
private void calculateAndDeliverPay(Employee e) {
    Money pay = e .calculatePayi);
    e.deliverPay(pay);
}

Small Functions

Functions should not be 100 lines long. Functions should hardly ever be 20 lines long.

G20: Function Names Should Say What They Do

Would you expect this to add five days to the date? Or is it weeks, or hours? You can’t tell from the call what the function does.

// bad example
// Does it add 5 days, weeks or hours to the date?
Date newDate = date.add(5);

If the function adds five days to the date and changes the date:

//better solution, if it incraeses by 5 days the better solution coold be
date.increaseByDays(5);

G5: Duplication

This is one of the most important rules in this book, it’s been called DRY principle (Don’t Repeat Yourself).

OO itself is a strategy for organizing modules and eliminating duplication.
Find and eliminate duplication wherever you can.

More subtle duplications are the modules that have similar algorithms, but that don’t share similar lines of code.

F1: Too Many Arguments

No argument is best, followed by one, two, and three. More than three is very questionable and should be avoided. They take a lot of conceptual power. Arguments are even harder from a testing point of view, because demands to write many test cases to ensure that all the various combinations of arguments work properly. If there are no arguments, this is trivial.

When a function seems to need more than two or three arguments, it is likely that some of those arguments ought to be wrapped into a class of their own.

 Circle makeCircle(double x, double y, double radius);
Circle makeCircle(Point center, double radius);

Reducing the number of arguments by creating objects out of them may seem like cheating, but it’s not. When groups of variables are passed together, they are likely part of a concept.

G10: Vertical Separation

  • Variables and function should be defined close to where they are used. Local variables should be declared just above their first usage and should have a small vertical scope.

  • Dependent functions should be defined just below their first usage. If one function calls another, they should be vertically close, and the caller should be above the callee. This gives the program a naturalflow.

G10: Vertical Separation

  • Reading Code from Top to Bottom: The Stepdown Rule

The code should be read like a top-down, every function to be followed by those at the next level of abstraction so that we can read the program, descending one level of abstraction at a time as we read down the list of functions. I call this The Stepdown Rule.

F4: Dead Function / G9: Dead Code

Dead code is code that isn’t executed. The problem with dead code is that after awhile it starts to smell. This is because dead code is not completely updated when designs change.

It still compiles, but it does not follow newer conventions or rules. It was written at a time when the system was different. When you find dead code, delete it from the system. Do not afraid, your source code control system still remembers it.

G29: Avoid Negative Conditionals

Negatives are harder to understand than positives. So, when possible, conditionals should be expressed as positives.

// Is worse, negative conditional
if (!buffer.shouldNotCompact())
// Is better, positive conditional
if (buffer.shouldCompact())

G35: Keep Configurable Data at High Levels

Constant such as a default or configuration value that is known and expected at a high level of abstraction, do not bury it in a low-level function. Expose it as an argument to that low-level function called from the high-level function.

Comments

C3: Redundant Comment

A comment is redundant if it describes something that adequately describes itself. Comments should say things that the code cannot say for itself.

i++;// increment i`

C3: Redundant Comment

  • Journal Comments

Sometimes people add a comment to the start of a module every time they edit it. These comments accumulate as a kind of journal, or log, of every change that has ever been made. Nowadays, we have source code control systems, these long journals make harder to read the code.

/*
* Changes (from 11-Oct-2001)
* --------------------------
* 11-Oct-2001 : Re-organised the class and moved it to new package
* com.jrefinery.date (DG);
* 05-Nov-2001 : Added a getDescription() method, and eliminated NotableDate
* class (DG);
* 12-Nov-2001 : IBD requires setDescription() method, now that NotableDate
* class is gone (DG); Changed getPreviousDayOfWeek(),
* getFollowingDayOfWeek() and getNearestDayOfWeek() to correct
* bugs (DG);
* 05-Dec-2001 : Fixed bug in SpreadsheetDate class (DG);
* 29-May-2002 : Moved the month constants into a separate interface
* (MonthConstants) (DG);
* 27-Aug-2002 : Fixed bug in addMonths() method, thanks to Nlevka Petr (DG);
* 03-Oct-2002 : Fixed errors reported by Checkstyle (DG);
* 13-Mar-2003 : Implemented Serializable (DG);
* 29-May-2003 : Fixed bug in addMonths method (DG);
* 04-Sep-2003 : Implemented Comparable. Updated the isInRange javadocs (DG);
* 05-Jan-2005 : Fixed bug in addYears() method (1096282) (DG);
*/

C3: Redundant Comment

  • Noise Comments

Sometimes you see comments that are nothing but noise. They restate the obvious and provide no new information.

/**
* Default constructor.
*/
protected AnnualDateRule() {
}

/** The day of the month. */
private int dayOfMonth;

/**
* Returns the day of the month.
*
* @return the day of the month.
*/
//Give me a break!
public int getDayOfMonth() {
return dayOfMonth;
}

These comments are so noisy that we learn to ignore them, our eyes tend to skip them over.

C3: Redundant Comment

  • Don’t Use a Comment When You Can Use a Function or a Variable

Consider the following stretch of code:

// does the module from the global list /mod/ depend on the
// subsystem we are part of
if (smodule
        .getDependSubsystems()
        .contains(subSysMod.getSubSystem())
    ){}

This could be rephrased without the comment as:

ArrayList moduleDependees = smodule.getDependSubsystems();
String ourSubSystem = subSysMod.getSubSystem();
if (moduleDependees
        .contains(ourSubSystem)
    ){}

Sometimes cleanerCode != smarterCode

C3: Redundant Comment

  • Position Markers

Sometimes programmers like to mark a particular position in a source file.

// Actions //////////////////////////////////

Right after eliminating long tail of slashes, sometimes it makes sense to use such marker to gather certain functions together. But in general they are clutter that should be eliminated. Use them occasionally, and only when the benefit is significant. If too many, our eyes learn to ignore them.

C3: Redundant Comment

  • Closing Brace Comments

Although this might make sense for long functions with deeply nested structures, it serves only to clutter the kind of small and encapsulated functions that we prefer. So if you find yourself wanting to mark your closing braces, try to shorten your functions instead.

while ((line = in.readLine()) != null) {
    lineCount++;
    charCount += line.length();
    String words[] = line.split("\\W");
    wordCount += words.length;
} //while

C3: Redundant Comment

  • Attributions and Bylines
/* Added by Rick */

Source code control systems are very good at remembering who added what, when. There is no need to pollute the code with little bylines.

C3: Redundant Comment

  • Too Much Information

The comment below was extracted from a module designed to test that a function could encode and decode base64 coding.

It is good enough to have RFC number, someone reading this code has no need for the arcane information contained in the comment.

/*
RFC 2045 - Multipurpose Internet Mail Extensions (MIME)
Part One: Format of Internet Message Bodies
section 6.8. Base64 Content-Transfer-Encoding
The encoding process represents 24-bit groups of input bits as output
strings of 4 encoded characters. Proceeding from left to right, a
24-bit input group is formed by concatenating 3 8-bit input groups.
These 24 bits are then treated as 4 concatenated 6-bit groups, each
of which is translated into a single digit in the base64 alphabet.
When encoding a bit stream via the base64 encoding, the bit stream
must be presumed to be ordered with the most-significant-bit first.
That is, the first bit in the stream will be the high-order bit in
the first 8-bit byte, and the eighth bit will be the low-order bit in
the first 8-bit byte, and so on.
*/

C3: Redundant Comment

  • Mandated Comments

It is just plain silly to have a rule that says that every function must have a doc, or every variable must have a comment. Comments like this just clutter up the code, propagate lies.

/**
*
* @param title The title of the CD
* @param author The author of the CD
* @param tracks The number of tracks on the CD
* @param durationInMinutes The duration of the CD in minutes
*/
public void addCD(String title, String author, int tracks, int durationInMinutes) {
    CD cd = new CD();
    cd.title = title;
    cd.author = author;
    cd.tracks = tracks;
    cd.duration = duration;
    cdList.add(cd);
}

C4: Poorly Written Comment

  • Inobvious Connection

The connection between a comment and the code it describes should be obvious. If you are going to the trouble to write a comment, then at least you’d like the reader to be able to look at the comment and the code and understand what the comment is talking about.

/*
* start with an array that is big enough to hold all the pixels
* (plus filter bytes), and an extra 200 bytes for header info
*/
this.pngBytes = new byte[((this.width + 1) * this.height * 3) + 200];

What is filter, what is its the relation to ‘+ 1‘ or ‘* 3‘ or both of them? Is the pixel a ‘byte‘? Why header info takes ‘200 bytes‘?

C5: Commented-Out Code

  • Commented-Out Code

Don’t do this! Others who see that commented-out code won’t have the courage to delete it. They’ll think it is there for a reason and is too important to delete. That code sits there and rots, getting less and less relevant with every passing day. It calls functions that no longer exist. It uses variables whose names have changed.

InputStreamResponse response = new InputStreamResponse();
response.setBody(formatter.getResultStream(), formatter.getByteCount());
// InputStream resultsStream = formatter.getResultStream();
// StreamReader reader = new StreamReader(resultsStream);
// response.setContent(reader.read(formatter.getByteCount()));

Source code control systems will remember the code for us. We don’t have to comment it out any more. Just delete the code.

Code excercises

Foramtting

Blocks and Indenting

public class FitNesseServer implements SocketServer { private FitNesseContext
context; public FitNesseServerCFitNesseContext context) { this.context =
context; } public void serveCSocket s) { serveCs, 10000); } public void
serveCSocket s, long requestTimeout) { try { FitNesseExpediter sender = new
FitNesseExpediterCs, context); sender.setRequestParsingTimeLimitCrequestTimeout); sender.startC); }
catchCException e) { e.printStackTraceC); } } }
  • Code formatting is about communication

A source file is a hierarchy rather like an outline. To make this hierarchy of scopes visible, we indent the lines of source code in proportion to their position in the hiearchy. This allows them to quickly hop over scopes. Without indentation, programs would be virtually unreadable by humans. However it is good enough css/js code for the minification!

  • The indent level of a function should not be greater than one or two, which makes the functions easier to read and understand.

Vertical Formatting

Comparing 7 different projects, files that are typically ~40 lines long, with an upper limit of ~500. Small files are usually easier to understand than large files are. Desirable is building an app having from 200 to 500 lines per file, however, this should not be a hard rule.

vertical_box_plot

Figure: File length distributions LOG scale (box height = sigma)

Vertical Openness Between Concepts

Nearly all code is read left to right and top to bottom. Each line represents an expression or a clause, and each group of lines represents a complete thought. Those thoughts should be separated from each other with blank lines.

def is_str_instance(v):
    return isinstance(v, str)


def hex_str_to_int(input: str) -> int:
    return int(input, 16)


def int_to_hex(input: int) -> hex:
    return hex(int)

Horizontal Formatting

Below figure suggests that we should strive to keep lines short. The old Hollerith limit was 80, currently lines edging out to 100 or even 120 are fine. Beyond that is probably just careless. But monitors are too wide for that nowadays, and programmers can shrink the font so small that they can get 200 characters across the screen. Don’t do that.

line_width

Figure. Java line width distribution

Horizontal Openness and Density

We use horizontal white space to associate things that are strongly related and disassociate things that are more weakly related.

private void measureLine(String line) {
    lineCount++;
    int lineSize = line.length();
    totalChars += lineSize;
    lineWidthHistogram.addLine(lineSize, lineCount);
    recordWidestLine(lineSize);
}
  • Spaces between assignment operators
  • No spaces between function name and opening parenthesis

Horizontal Alignment

Horizontal alignment was common for assembly language programmers. That this kind of alignment is not useful. The alignment seems to emphasize the wrong things and leads my eye away from the true intent. You are tempted to read down the list of variable names without looking at their types.

public class FitNesseExpediter implements ResponseSender
{
    private   Socket          socket;
    private   InputStream     input;
    private   OutputStream    output;
    private   Request         request;
    private   Response        response;
    private   FitNesseContext context;
    protected long            requestParsingTimeLimit;
    private   long            requestProgress;
    private   long            requestParsingDeadline;
    private   boolean         hasError;
    public    FitNesseExpediter(Socket          s, 
                                FitNesseContext context) throws Exception
    {
        this.context =            context;
        socket =                  s;
        input =                   s.getInputStream();
        output =                  s.getOutputStream();
        requestParsingTimeLimit = 10000;
    }
}

Automatic reformatting tools usually eliminate this kind of alignment.

Code excercises

Thank you for your attention.