
Lecture 8: Code Reviews, Code Walks, Code Inspections

Today

Lecture

Software comes with two sets of expensive flaws, even after we have two minds working on a piece of code:
  • The first set of really costly software mistakes concerns (unintentional) violations of complex but unwritten logical invariants.

  • The second set of costly mistakes concerns bad designs, i.e., code that cannot be understood by the next person who reads it.
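As a small, hypothetical illustration of the first kind of mistake, consider the Python sketch below. The class `Roster` and its methods are invented for this example. The lookup relies on the list being sorted, an invariant that in real code is usually written down nowhere; a later "shortcut" breaks it without any visible error.

```python
import bisect


class Roster:
    """A collection of names (hypothetical example).

    Invariant, stated here only for the reader: `names` is always sorted,
    because `contains` uses binary search.
    """

    def __init__(self):
        self.names = []

    def add(self, name):
        # Inserts at the right position, so the list stays sorted.
        bisect.insort(self.names, name)

    def contains(self, name):
        # Binary search: correct only if `names` is sorted.
        i = bisect.bisect_left(self.names, name)
        return i < len(self.names) and self.names[i] == name

    def add_fast(self, name):
        # A later "optimization" that looks harmless in isolation
        # but silently violates the sorted-order invariant.
        self.names.append(name)


roster = Roster()
roster.add("bob")
roster.add("dan")
roster.add_fast("al")          # the list is now ["bob", "dan", "al"]
found = roster.contains("al")  # binary search looks in the wrong place
```

Here `found` is false even though "al" was added. Nothing crashes, which is why a fresh reader who asks "what does `contains` assume?" is more likely to catch the problem than the original authors.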

What kind of "cost" is involved?

The reason we have those problems is that two people can easily fall into the “group think” trap, especially after they have worked together for a substantial amount of time. Rotating people through pair-programming partnerships often reduces this “group think” attitude, though sometimes it just increases it. Every so often a piece of code needs a fresh pair of eyes.

Code Reviews, Code Inspections, Code Walks

My impression is that these devices lead to occasional flare-ups because people don’t see each other and don’t look each other in the face when they make seriously negative statements. But this seems to be extremely rare.

Over the past decade, software developers have accepted this idea of third-party reviews, though a rather shallow form of it. Technically, they ask colleagues to read the code in repositories that allow mark-ups, say GitHub. These people leave a few comments, though conversations with software developers suggest that these comments are rarely deep and almost always “nice.” The problem is that such automated intermediaries eliminate constructive face-to-face confrontation, the kind of conversation that truly reveals flaws before they snowball into serious problems.

At the other end of the spectrum, people developed code inspections. In the 1980s, both IBM’s and AT&T’s labs studied how to inspect code deeply to find bugs. During a code inspection, a code reader reads the code aloud to a panel consisting of its creator, an inspector, and a secretary. The reader, inspector, and secretary discuss what they see and what is read aloud, while the developer quietly listens. While this can be a painful experience for the developer, researchers have demonstrated repeatedly in a variety of settings that code inspections uncover many more mistakes than any other technique. The downside is that an inspection doubles the development time and increases the production cost; it is difficult to estimate how much the removal of these bugs saves in future costs.

Here are some of the questions that the researchers posed and answered in a systematic way with experiments:

The studies also discuss various “mechanics” for an inspection, but these details don’t matter. They used a "confrontational" approach, placing the reviewers opposite the presenters. Nowadays people use (occasionally anonymous) GitHub mark-ups and online discussion boards.

In this course, we mix and match techniques from across this spectrum. A pair of programmers jointly presents their code. Only one of them speaks and converses with the panelists; each member of a pair must be in complete command of the code. A three-person panel “reads” the presented code:
  • The first is the head reader whose task it is to direct the code walk, to make decisions when conflicts arise, or to intervene when too many people talk at once.

  • The second is the assistant reader, because two pairs of eyes are better at reading code than one. The assistant has no other tasks than reading code.

  • The third one is a secretary whose task it is to take notes. The result of these notes is a list of design flaws and bugs; there is no need to describe solutions, though if some were discussed, a side note may be appropriate.

The panelists do not have to read the code beforehand. They must have a complete understanding of the problem, and they may have thought about solutions though perhaps in a different setting than the presenters. Both of these prerequisites apply in this course.

The goal of the code walk is to uncover design flaws and bugs. Hence it is important that the presenters focus on the complicated, not the boring, parts of their code. They can start with a purpose statement (what is being reviewed) and an overview (what the major pieces are and which ones are the focus of the review).

For the presentation of individual data representations and methods/functions, the presenters can follow the design recipe. It is a good guide to bring across a structure—unless it was ignored during development.

The presentation turns into a conversation when panelists spot flaws. A presenter need not accept every criticism but must recognize when criticism is appropriate. This back-and-forth calls for a delicate balance. Like the presenters, the panelists can use the design recipe to question code systematically.

Pair Programming and Code Walks

We will downgrade both partners of a pair if we discover that one of them is not up to speed with the code.

Homework E: Show and Tell

  1. Explain the homework problem.

    So far our exploratory programs consumed input from STDIN and sent output to STDOUT. Both of these are stream devices, meaning we can think of them as (strange) lists.

    TCP input/output is another kind of stream device. When web front ends interact with web backends, they often use TCP (or a protocol atop it) for the communication layer.

    This homework does it too. You will write a server, re-direct the TCP input stream to the code of assignment C, and point the result to the TCP output stream.

    You will not write a client. Instead you will use a program called netcat to connect to your server, running on the same machine or some other machine.

  2. Explain Racket solution.
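The server described in step 1 can be sketched as follows. This is a Python illustration under stated assumptions, not the Racket solution presented in class. The function `process` is a hypothetical stand-in for the assignment C program (here it merely upper-cases each line), and `handle` and `serve` are names invented for this sketch. The point is that the TCP connection is wrapped in two text streams, so the existing code sees the same interface it had with STDIN and STDOUT.

```python
import socket


def process(inp, out):
    # HYPOTHETICAL stand-in for the assignment C code: it reads lines from
    # an input stream and writes results to an output stream.
    for line in inp:          # a text stream behaves like a lazy list of lines
        out.write(line.upper())
        out.flush()           # send each answer as soon as it is ready


def handle(conn):
    """Run `process` with the TCP connection in place of STDIN/STDOUT."""
    with conn:
        # makefile turns the socket into file-like text streams.
        with conn.makefile("r", encoding="utf-8") as inp:
            with conn.makefile("w", encoding="utf-8") as out:
                process(inp, out)


def serve(listener):
    """Accept clients one at a time on an already-listening socket."""
    while True:
        conn, _addr = listener.accept()
        handle(conn)
```

Assuming the listener is created with `socket.create_server(("localhost", 45678))` (the port number is arbitrary) and passed to `serve`, running `nc localhost 45678` in another terminal connects to the server. Because `process` never mentions sockets, it can also be exercised directly on in-memory streams.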