HackNot - Programming

Part of Hacknot: Essays on Software Development

Get Your Filthy Tags Out of My Javadoc, Eugene [1]

Recently I’ve been instituting a code review process on a number of projects in my workplace. To kick start use of the process, I took a sample of the Java code written by each of my colleagues and reviewed it.

While doing so I was struck by the degree of individual variation in the use of Javadoc comments, and reminded of how easy it is to fulfill one’s obligation to provide Javadoc without really thinking about how effectively one is actually communicating.

I think the quality of Javadoc commenting is important because - let’s be honest - it’s the only form of documentation that many systems will ever have.

Here are some of the problems in Javadoc usage that I frequently observe:

  • Developers never actually run the Javadoc utility to generate HTML documentation, or do so with such irregularity they can have no confidence that their copy of the HTML documentation is up to date.
  • Developers use their IDE’s facility to autogenerate a comment skeleton from a method signature, but then fail to flesh out that skeleton.
  • HTML tags are overused, severely impairing the readability of comments when viewed as plain text.
  • Comment text is diluted with superfluous wording and duplication of information already conveyed by data types.
  • Valuable details are omitted e.g. method preconditions and post-conditions, the types of elements in Collections and the range of valid values for arguments (in particular, whether an object reference can be null).
  • The conventional single sentence summary at the beginning of a method header comment is omitted.
  • Non-public class features are not commented.

My conclusion is that many developers are just “going through the motions” when writing Javadoc comments. With a little more thought, more effective use of both the author’s and the reader’s time can be made.

I propose the following guidelines for effective Javadoc commenting …

Do Not Use HTML Tags

Omitting HTML maximizes the readability of the comment when viewed in situ, and saves the author some time (which is better spent adding meaningful text to the comment). Use simple typographic conventions [2] to create tables and lists.
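To make the point concrete, here is a minimal sketch of a comment that uses a plain-text list where one might otherwise reach for list tags. The Thermostat class, its Mode enum and the temperature limits are all hypothetical, invented purely for illustration.

```java
/**
 * Target temperature for a single heating zone.
 *
 * Operating modes:
 *   - ECO: target is lowered by 2 degrees
 *   - NORMAL: target is used unchanged
 */
class Thermostat {
    /** Operating modes, as listed in the class comment. */
    enum Mode { ECO, NORMAL }

    /** Degrees Celsius, 5 to 30 inclusive. */
    private final int targetCelsius;

    /**
     * @param targetCelsius Degrees Celsius, 5 to 30 inclusive.
     */
    Thermostat(int targetCelsius) {
        if (targetCelsius < 5 || targetCelsius > 30) {
            throw new IllegalArgumentException("target out of range: " + targetCelsius);
        }
        this.targetCelsius = targetCelsius;
    }

    /**
     * Target actually applied in the given mode.
     *
     * @param mode Operating mode.
     * @return Degrees Celsius, 3 to 30 inclusive.
     */
    int effectiveTarget(Mode mode) {
        return mode == Mode.ECO ? targetCelsius - 2 : targetCelsius;
    }
}
```

The list in the class comment reads cleanly in the source file. The generated HTML will run those lines together, which is the price paid for in situ readability.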

Javadoc All Class Features, Regardless Of Scope

While third parties using your code as an API don’t need comments on non-public features, the developers and maintainers of your code base do - and they are your principal audience.

Don’t Prettify Comments

Cute formatting such as lining up the descriptions of @param tags wastes space you could devote to meaningful description and makes the comments harder to maintain.

Drop The Description For Simple Accessors

For methods that simply set or get the value of a class attribute, the descriptive sentence merely duplicates the information contained in an @param or @return clause respectively.
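A short sketch of what this looks like in practice, using a hypothetical Invoice class of my own invention. Each accessor carries only the tag that says something useful.

```java
/** Customer invoice; only the line item count is modeled here. */
class Invoice {
    /** Number of line items, zero or greater. */
    private int lineCount;

    /**
     * @return Number of line items, zero or greater.
     */
    int getLineCount() {
        return lineCount;
    }

    /**
     * @param lineCount Number of line items, zero or greater.
     */
    void setLineCount(int lineCount) {
        this.lineCount = lineCount;
    }
}
```

A leading sentence such as "Gets the line count" would add nothing that the @return clause does not already say.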

Assume Null Is Not OK

Adopt the convention that object references cannot be null unless otherwise stated. In the few circumstances where this is not true, specifically mention that null is OK, and explain what significance the null value has in that context.
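The convention might be sketched as follows. CustomerRecord and its fields are hypothetical; the only library call used is java.util.Objects.requireNonNull. Note that the name parameter says nothing about null, because the convention covers it, while the referrer parameter spells out both that null is allowed and what it means.

```java
import java.util.Objects;

/** Customer details held by the billing system. */
class CustomerRecord {
    /** Full legal name. */
    private final String name;

    /** Name of the referring customer. Null means there was no referrer. */
    private final String referrer;

    /**
     * @param name Full legal name.
     * @param referrer Name of the referring customer. May be null,
     *     meaning the customer was not referred by anyone.
     */
    CustomerRecord(String name, String referrer) {
        // Enforce the default convention: null is not OK unless stated.
        this.name = Objects.requireNonNull(name, "name");
        this.referrer = referrer;
    }

    /** @return Full legal name. */
    String getName() {
        return name;
    }

    /** @return Referring customer's name, or null if there was no referrer. */
    String getReferrer() {
        return referrer;
    }
}
```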

Use Terse Language

Feel free to use phrases instead of full sentences, in the interest of brevity. Avoid superfluous references to the subject like “This class does …”, “Method to …”, “An integer that …”, “An abstract class which …”.
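By way of illustration, here is a sketch of terse commenting on a hypothetical IntRange class. The line comment above the contains method shows the sort of wording the terse version replaces.

```java
/** Closed interval of integers. */
class IntRange {
    /** Lower bound, inclusive. */
    private final int low;

    /** Upper bound, inclusive; never less than low. */
    private final int high;

    /**
     * @param low Lower bound, inclusive.
     * @param high Upper bound, inclusive; not less than low.
     */
    IntRange(int low, int high) {
        if (high < low) {
            throw new IllegalArgumentException("high is less than low");
        }
        this.low = low;
        this.high = high;
    }

    // Verbose alternative: "This method is a method to check whether an
    // integer that is passed in is inside the range of this class."
    /**
     * Membership test.
     *
     * @param value Candidate value.
     * @return True if value lies within the bounds.
     */
    boolean contains(int value) {
        return low <= value && value <= high;
    }
}
```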

Be Precise

  • For classes: precisely describe the object being modeled.
  • For methods: describe the range of valid values for each @param and @return.
  • For fields: describe the types of objects in Collections and the range of valid values.
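The three points above might look like this when applied together. ScoreSheet is a hypothetical class; I assume scores are whole percentages. With generics the element type is already visible in the declaration, so the field comment concentrates on range and ordering.

```java
import java.util.ArrayList;
import java.util.List;

/** Exam scores for one student, in order of entry. */
class ScoreSheet {
    /** Percentages, each 0 to 100 inclusive, oldest first. Never contains null. */
    private final List<Integer> scores = new ArrayList<>();

    /**
     * Records a score.
     *
     * @param score Percentage, 0 to 100 inclusive.
     */
    void add(int score) {
        if (score < 0 || score > 100) {
            throw new IllegalArgumentException("score out of range: " + score);
        }
        scores.add(score);
    }

    /**
     * Mean of all recorded scores.
     *
     * @return Mean, 0.0 to 100.0 inclusive; 0.0 if nothing is recorded.
     */
    double mean() {
        if (scores.isEmpty()) {
            return 0.0;
        }
        long sum = 0;
        for (int score : scores) {
            sum += score;
        }
        return (double) sum / scores.size();
    }
}
```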

Naming Classes: Do it Once and Do it Right [3]

The selection of good class names is critical to the maintainability of your application. They form the basic vocabulary in which developers speak and the language in which they describe the code’s every activity. No wonder then that vague or misleading class names will quickly derail your best efforts to understand the code base.

Because we are called on to invent class names so frequently, there is a tendency to become somewhat lackadaisical in our approach. I hope the following guidelines will assist you in devising meaningful class names, and encourage you to invest the effort necessary to do so. As always, these are just guidelines and ultimately you should use your own discretion.

1. A Class Name Is Usually A Noun, Possibly Qualified

The overwhelming majority of class names are nouns. Sometimes you use the noun by itself:

  • Image
  • List
  • Position
  • File
  • Exception

Other times you qualify the noun with one or more words which help to specialize the noun:

Class Name         Grammatical Breakdown
-----------------  ----------------------------------------------------------------------
JPEGImage          The noun Image is qualified by the noun JPEG
LinkedList         The noun List is qualified by the adjective Linked
ParsePosition      The noun Position is qualified by the verb Parse
RandomAccessFile   The noun File is qualified by the adjective Random and the verb Access
FormException      The noun Exception is qualified by the noun Form

When searching for a noun to serve as a class name, consider the following suffixes which are often used to form nouns from other words [4]:

Suffix   Example Class Names
-------  --------------------------------------------
-age     Mileage, Usage
-ation   Annotation, Publication, Observation
-er      User, Broker, Listener, Observer, Adapter
-or      Decorator, Creditor, Author, Editor
-ness    Thickness, Brightness, Responsiveness
-ant     Participant, Entrant
-ency    Dependency, Frequency, Latency
-ion     Creation, Deletion, Expression, Enumeration
-ity     Plasticity, Mutability, Opacity
-ing     Tiling, Spacing, Formatting
-al      Dismissal, Removal, Committal

2. Avoid Class Names That Have Non-Noun Interpretations

Suppose that while maintaining an application you come across a class called Empty. As a noun, instances of Empty might represent a state in which some vessel is devoid of contents. However the word “empty” can also function as a verb, being the act of removing all the contents of a vessel. So there is potential confusion as to whether the class models a state or an activity. This ambiguity would not arise if the class had been called EmptyState or EmptyActivity.

3. A Class Name Is Sometimes An Adjective

There is a special type of class called a structural property class [5], which is often named with an adjective. Such classes exist to confer specific structural properties upon their subclasses (or implementers, in the case of interfaces). They are often suffixed with -able. Examples include:

  • Comparable
  • Undoable
  • Serializable
  • Printable
  • Drawable
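A structural property class can be sketched in a few lines. Here Undoable is written as an interface and Counter is a hypothetical implementer with a single level of undo; neither is taken from any real library.

```java
/** Confers the ability to reverse the most recent change. */
interface Undoable {
    /** Reverts the most recent change; no effect if there is nothing to undo. */
    void undo();
}

/** Non-negative tally with a single level of undo. */
class Counter implements Undoable {
    /** Current tally, zero or greater. */
    private int value;

    /** Tally before the most recent increment. */
    private int previous;

    /** Adds one to the tally. */
    void increment() {
        previous = value;
        value++;
    }

    @Override
    public void undo() {
        value = previous;
    }

    /** @return Current tally, zero or greater. */
    int getValue() {
        return value;
    }
}
```

The adjective tells the reader what any implementer can do, without saying anything about what the implementer is.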

4. Use Commonly Accepted Domain Terminology

Specialist domains come ready-made with their own vernacular. This can be both a curse and a blessing. The down side is that newcomers to the domain have a lot of new terminology to master. The up side is that, once mastered, that terminology makes for efficient and precise communication with others fluent in the domain’s jargon. Incorporating domain terminology in your class names is a good idea, as it succinctly communicates a lot of information to the reader. But you must be careful to use only terminology that is commonly known and has a precise definition, and ensure that your usage of the term is consistent with that definition. Avoid region-specific slang and colloquialisms. Examples:

  • DichotomousItem
  • CorrigendaSection
  • DeweyDecimalNumber
  • AspectRatio
  • OrganicCompound

5. Use Design Pattern Names

Incorporating design pattern names like Factory, Proxy and Singleton into your class names is a good idea, for the same reasons that it is useful to use terminology from the application domain – because a lot of information is communicated succinctly. Just be careful not to get pattern-happy, and start thinking “everything is an instance of some pattern.” Only refer to design pattern names if they have direct relevance to the intrinsic nature of the class. Examples:

  • ConnectionFactory
  • ClientProxy
  • AccountObserver
  • DocumentBuilder
  • TableDecorator
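As a rough sketch of the first example, a ConnectionFactory earns its name only if creating Connection objects is its whole reason for existing. Both classes below are hypothetical stand-ins and have nothing to do with any real networking or database API.

```java
/** Link to a named host. Stand-in for a real connection type. */
class Connection {
    /** Host name. */
    private final String host;

    /**
     * @param host Host name.
     */
    Connection(String host) {
        this.host = host;
    }

    /** @return Host name. */
    String getHost() {
        return host;
    }
}

/** Factory for Connections to a fixed default host. */
class ConnectionFactory {
    /** Host used for every Connection created. */
    private final String defaultHost;

    /**
     * @param defaultHost Host used for every Connection created.
     */
    ConnectionFactory(String defaultHost) {
        this.defaultHost = defaultHost;
    }

    /** @return New Connection to the default host. */
    Connection create() {
        return new Connection(defaultHost);
    }
}
```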

6. Aim For Clarity Over Brevity

Many developers demonstrate a form of scarcity thinking when it comes to naming classes – as if there were a shortage of characters in the world and they should be conserved. The days when we needed to constrain identifiers to particular length restrictions are long gone. Today we should be focused upon selecting class names that communicate effectively, even if at the expense of a little extra length. With many developers using IDEs that support auto-completion, the traditional arguments in favor of abbreviation (typographical error and typing effort) are no longer applicable. The one case where abbreviation is warranted is specialist acronyms that are commonly used in the application domain, e.g. CMOSChip is clearer than ComplementaryMetalOxideSemiconductorChip. Examples:

  • ProductionSchedule is clearer than ProdSched
  • LaunchCommand is clearer than LaunchCmd
  • ThirdParty is clearer than ThrdPrty
  • ApplicationNumber is clearer than AppNum
  • SystemCorrespondence is clearer than SysCorro

7. Qualify Singular Nouns Rather Than Pluralize

When a class represents a collection of some type, it can be tempting to name it as the plural of the collected type e.g. a collection of Part classes might be called Parts. Although correct, you can communicate more about the nature of the collection by using qualifying nouns such as Set, List, Iterator and Map. Examples:

Class Name   Group Semantics
-----------  ---------------------------------------------------------------
PartList     Parts are ordered
PartSet      Parts are unordered and each Part can not appear more than once
PartPool     Parts are interchangeable
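The difference between the first two names can be sketched in code. Part, PartList and PartSet here are hypothetical wrappers over the standard java.util collections, and I assume a Part is identified by object identity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Collected type; two Parts are the same only if they are the same object. */
class Part {
}

/** Ordered Parts; the same Part may appear more than once. */
class PartList {
    /** Parts in insertion order. */
    private final List<Part> parts = new ArrayList<>();

    /** @param part Part appended to the end. */
    void add(Part part) {
        parts.add(part);
    }

    /** @return Number of entries, counting repeats. */
    int size() {
        return parts.size();
    }
}

/** Unordered Parts; each Part appears at most once. */
class PartSet {
    /** Distinct Parts, in no particular order. */
    private final Set<Part> parts = new HashSet<>();

    /**
     * @param part Part to include.
     * @return True if the Part was not already present.
     */
    boolean add(Part part) {
        return parts.add(part);
    }

    /** @return Number of distinct Parts. */
    int size() {
        return parts.size();
    }
}
```

A class simply called Parts would leave the reader guessing which of these behaviors to expect.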

8. Find Meaningful Alternatives To Generic Terms

Terms like Item, Entry, Element, Component and Field are very common and rather vague. If these terms really are the standard terminology in your application domain then you should use them. But if you are free to use class names of your own invention then search for something more specific and meaningful.

9. Imply Relationships With Other Classes

Naming a class provides you with the opportunity to communicate something about that class’s relationship with other classes in the application. This will help other developers understand that class’s place in a broader application context.

Some techniques that may be helpful in this regard:

  • Use the name of a super-class or interface as a suffix e.g. call implementations of the Task interface PrintTask, ExecuteTask and LayoutTask.
  • Prefix the name of abstract classes with the word Abstract.
  • Name association classes by placing the names of the associated classes on either side of a verb describing the association e.g. the association between Student and Test could be called StudentTakesTest.
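The first two techniques might be combined as in the sketch below. The Task interface and PrintTask come from the example above; AbstractTask, the label field and the return value of run are my own assumptions, added only so there is something to execute.

```java
/** Unit of work that can be run. */
interface Task {
    /** @return Short report of what was done. */
    String run();
}

/** Common state for Tasks that carry a label. */
abstract class AbstractTask implements Task {
    /** Short name used in reports. */
    private final String label;

    /**
     * @param label Short name used in reports.
     */
    AbstractTask(String label) {
        this.label = label;
    }

    /** @return Short name used in reports. */
    String getLabel() {
        return label;
    }
}

/** Task that stands in for printing a document. */
class PrintTask extends AbstractTask {
    PrintTask() {
        super("print");
    }

    @Override
    public String run() {
        return getLabel() + " done";
    }
}
```

From the names alone, a reader can tell that PrintTask is a kind of Task and that AbstractTask is not meant to be instantiated.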

In Praise of Code Reviews [6]

I have a woeful sense of direction — the navigational abilities of a lemming combined with the homing instinct of a drunk. But like much of my gender, I continue to entertain the fantasy that I possess an instinctive ability to find my way, an evolutionary artifact of the male’s traditional role as the hunter; an unerring inner compass that will guide me safely through the hunt of everyday life, despite voluminous evidence to the contrary. It is a fantasy that gets me in trouble on a regular basis.

Whenever I am driving to somewhere new, the scenario generally plays out like this: I begin the journey looking through my street directory, tracing out the path I need to follow. After memorizing the first few turns I set the directory down and depart, resolving to stop and consult the directory again once I’ve completed those turns. Within a few minutes I have traveled over the first part of the journey that I’ve already memorized, and have reached a decision point. Will I pull over to the side of the road and reacquire my bearings as planned, or will I just follow my nose? Invariably, I choose the latter.

“I’m bound to see a relevant sign before too much longer,” I think. And so I drive on, keeping an eye out for the anticipated sign. If it doesn’t shortly appear, I begin to make speculative turns based on my own “gut feeling” about which way to head. If I’m heading to a popular destination, I might simply follow the path I perceive most of the traffic is taking, figuring that they’re all probably headed to the same place as I am. Through a combination of guess-work, dubious reasoning and random turns I eventually reach the point where I have to admit to myself that I’m lost. Only then will I pull over to the side of the road, get the street directory out of the glove compartment to find out where I am and how to get to my original destination from here.

This insane behavior has been a characteristic of my driving for many years. It usually manifests when I am driving home alone from some event which has left me feeling tired and distracted. I slip into a worn out fugue, adopt a “she’ll be right” attitude and head off to goodness-knows-where. About a year ago, driving home from a job interview in a distant city, I strayed off course by over 100 kilometers – all the while resolutely refusing to pull over and consult my directory, which I could have done at any time.

Thanks to these unexpected excursions, I have seen parts of the countryside that I might otherwise have missed, but I have no idea where they were or how to get back there.

So why do I do it? Why not spend five minutes by the side of the road working out where I’ve been and where I’m going, rather than just keep driving aimlessly in hope of finding some visible prompt to get me on course? As strange as the habit is, I think it’s exactly the same behavior that many people exhibit when they make self-defeating decisions. It stems in part from short-term thinking.

Driving along in my pleasant reverie, I am faced with a choice. Stopping to consult my street directory will require some mental energy. I’ll have to break the flow of my journey, find a significant landmark or intersection, locate it in the directory, and re-plot a path to my destination. The alternative is just to keep drifting along and hope for the best. If your scope of consideration is only the next few minutes, then it’s very easy to decide to avoid the short-term inconvenience of pulling over in favor of continuing to do what you’re already doing – even though it isn’t working out and has already got you into difficulty.

A smoker indulges in similar thinking every time they light up. They know full well that they’re killing themselves by having that next cigarette, but considering only the next five minutes, what is easier: Resisting the craving for a cigarette, or giving in?

This desire to minimize small, short-term pain even at the expense of significantly more pain in the long term is at the core of much self-defeating behavior.

We’ll return to this theme in a moment. But first, a short digression on code reviews.

Code Reviews

For many types of work it is standard practice to have one’s work checked by another before the work product is put into service. Authors have editors; engineers have inspectors and so on. But in software development it is common for code to flow directly from the programmer’s fingertips into the hands of the end users without ever having been seen by another pair of eyes.

This is despite there being a large body of empirical evidence establishing the effectiveness of code review techniques as a device for defect prevention. Since the early history of programming, a number of different techniques for reviewing code have been identified and assessed. A code walkthrough is any meeting in which two or more developers review a body of code for errors. A code walkthrough can find anywhere between 30 and 70 percent of the errors in a program [7]. Code reading is a more formal process in which printed copies of a body of code are distributed to two or more reviewers for independent review. Code reading has been found to detect about twice as many defects as testing [8]. Most formal of all is the code inspection, which is like a code walkthrough where participants play pre-defined roles such as moderator, scribe or reviewer. Participants receive training prior to the inspection. Code inspections are extremely effective, having been found to detect between 60 and 90 percent of defects [9]. Defect prevention leads to measurably shorter project schedules. For instance, code inspections have been found to give schedule savings of between 10 and 30 percent.

I estimate that about 25 percent of the projects I have worked on conducted code reviews, even though 100 percent of them were working against tight schedules. If we can save time and improve quality with code reviews, why weren’t the other 75 percent of projects doing them?

I believe the answer is mostly psychological, and the basic mechanism is the same one that I engage in every time I go on one of my unplanned excursions in my car. The essential problems are short-term thinking, force of habit and hubris.

Suppose you have just finished coding a unit of work and are about to check it into your project’s version control system. You’re faced with a decision – should you have your code subjected to some review procedure, or should you just carry on to the next task? Thinking about just the next five minutes, which option is easier? On the one hand you’ll have to organize the review, put up with criticism from the reviewers, and probably make modifications to your code based upon their responses. On the other hand, you can declare the task “finished”, get the feeling of accomplishment that comes along with that, and be an apparent step closer to achieving your deadlines. So you make the decision which minimizes discomfort in the short term, the same way I decide to just keep on driving in search of a road sign rather than pull over and consult my street directory.

But then, you’ve got to rationalize this laziness to yourself in some way. So you reflect on previous experience and think “I’ve gotten away with not having my code reviewed in the past, so I’ll almost certainly get away with it again”. Similarly, I’m driving along thinking “I’ve never failed to eventually get where I’m going in the past, so I’ll almost certainly get there this time as well.” Complacency breeds complacency.

Finally, although it is difficult to admit, there is some comfort in not having your code reviewed by others. We would like to think that we can write good code all by ourselves, without the help of others, so avoiding code reviews enables us to avoid confronting our own weaknesses. In the same way, by following my nose rather than following my street directory, I can avoid having to confront the geographically exact evidence of my hopeless sense of direction that it will provide. Ignorance is bliss.

Even when you quote the empirical evidence to programmers, many will still find a way to excuse themselves from performing code reviews, by assuming that the touted reductions in schedule and improvements in quality were derived through experimentation upon lesser developers than themselves. The thinking goes something like “Sure, code reviews might catch a large percentage of the defects in the average programmer’s work, but I’m way above average, don’t write as many defects, and so won’t get the same return on investment that others might.” Unfortunately it is very difficult to tell simply by introspection whether you really are an above average programmer, or whether you just think you are. Most people consider that they are “above average” in ability with respect to a given skill, even though they have little or no evidence to support that view. For example, most of us consider ourselves “better than average drivers”. The effect is sometimes referred to as self-serving bias or simply the above average effect.

Those that have bought into the Agile propaganda (can we call it “agile-prop”?) may have been deceived into thinking that pair programming is a substitute for code reviews. To the best of my knowledge, there is no credible empirical evidence that this is the case. In fact, there are good reasons to be highly skeptical of any such assertions – in particular, that a pair programmer does not have the independent view of the code that a reviewer uninvolved with its production can have. Much of the benefit of reviews comes from the reviewer’s different psychological perspective on the product under review, the fact that they have no ego investment in it, and that they have not gone through the same (potentially erroneous) thought processes that the original author or authors went through in writing it. A pair programmer is not so divorced from the work product or the process by which it was generated, and so one would expect a corresponding decrease in ability to detect faults.

So we sustain self-defeating work practices the same way we sustain many other sorts of self-defeating behavior – by lying to ourselves and putting long term considerations aside.

Do Code Reviews Have A Bad Reputation?

There is perhaps another factor contributing to a hesitance to perform code reviews, which is the reputation they have as being confrontational and ego-bruising experiences. This reputation probably springs from consideration of the more formal review processes such as code inspections, in which the reviewing parties can be perceived as “ganging up” on the solitary author of the code, subjecting them to a famously unexpected Spanish Inquisition.

This is a legitimate concern, and it is certainly easy for a review of code to turn into a review of the coder, if a distinct separation is not encouraged and enforced. I therefore recommend that code reviews be conducted by individual reviewers in the absence of the code’s author. This tends to depersonalize the process somewhat, and remove some of the intimidatory effect that a group process can have. There is in fact some evidence to suggest that an individual reviewer is no less effective than a group of reviewers in detecting faults in code.

The code can be printed out and written comments attached to it, or comments can be made in the source file itself, perhaps as “TODO” items that can be automatically flagged by an IDE. Personally, I prefer paper-based reviews because a paper-based review system is quick and easy to institute, and equally applicable to reviews of written artifacts such as design and requirements documents.

Conclusion

There is much to recommend the practice of conducting code reviews on a regular basis, and few negatives associated with them, provided they are conducted sensitively and with regard for the feelings of the code’s author. All it takes is for one other programmer on your team to be willing to undertake the task, and you can establish a simple code review process that will likely produce noticeable benefits in improved code quality and reduced defect counts. Not everyone is good at reviewing code, so if you have the option, have your code reviewed by someone who demonstrates an eye for detail and is known for their thoroughness. If you have the authority to do so, it is well worth incorporating code reviews into your team’s development practice, perhaps as a mandatory activity to be undertaken before new code is committed to the code base, or perhaps on a random basis. It may also be worthwhile to have junior staff review the code written by their more experienced counterparts, as a way of spreading knowledge of good coding techniques and habits.

When introducing code reviews, you will likely encounter some initial resistance, simply because the short-term thinking which has so far justified their absence is a superficially attractive habit that takes a certain determination to break. However, once they have had the opportunity to participate in code reviews, many programmers will concede that reviewing is a habit worth forming.


  1. First published 6 Aug 2003 at http://www.hacknot.info/hacknot/action/showEntry?eid=14 

  2. http://docutils.sourceforge.net/rst.html 

  3. First published 9 Mar 2004 at http://www.hacknot.info/hacknot/action/showEntry?eid=48 

  4. Bloomsbury Grammar Guide, Gordon Jarvie 

  5. Object-Oriented Software Construction, 2nd Edition, Bertrand Meyer

  6. First published 27 Feb 2006 at http://www.hacknot.info/hacknot/action/showEntry?eid=83 

  7. Rapid Development, Steve McConnell, pg 70, citing Myers 1979, Boehm 1987b, Yourdon 1989b 

  8. Ibid, pg 71, citing Card 1987 

  9. Ibid, pg 71