1 Some Racket, First Design

1.1 Racket Mechanics

Racket is an untyped functional programming/scripting language. DrRacket is its primary IDE but if you don’t like it, you can find EMACS and VI modes for Racket on the web.

In Racket you define and use functions. Period.

Here is how you define a polynomial:

(define (P x)
(+ (* 3 (expt x 5)) (* -7 (expt x 2)) 42))

The define keyword introduces a function definition, (P x) is the function header.

Like all languages, Racket comes with many forms of conditionals. We rely on just one form for now: cond. If you recall the sign function from mathematics,

you can easily see how convenient cond is in Racket. The same function in Racket looks like this:

(define (sign x)
  (cond
    [(> x 0) 1]
    [(= x 0)  0]
    [(< x 0) -1]))

Each "line" in cond consists of a condition and an answer. The condition is a booleanWell, not really. Anything that isn’t false (#f) is considered true. expression over the parameters; the answer is any kind of expression. To formulate boolean expressions, you use predicates (say zero? or number?) and relations (< or <=).

Racket has many kinds of data that you know from other programming languages: a complete number hierarchy, including complex numbers; booleans (#t and #f); strings ("hello world"); chars; etc.

Naturally a Racket programmer can also create new forms of data. To do so, it suffices to define a structure, like this:

(struct s (f g h) #:transparent)

The keyword struct introduces a structure definition; its name follows immediately (s here); this is followed by a parenthesized sequence of field names (f, g, and h in this example). A structure definition comes with many options; the option #:transparent enables collaboration with the unit testing library.

The point of a struct definition is to define functions. What else? As far as you are concerned, a struct definition creates these functions every time it is evaluated:

a constructor, s in the running example, which consumes as many values as there are fields and creates a structure instance that belongs to this specific evaluation of the structure definition;
a predicate, s? in the example, which consumes one value and determines whether it is an instance of this particular evaluation of a structure definition; and
as many structure selectors, s-f, s-g, and s-h here, which consume one value, test whether it is an instance of the structure, and – if so – extract the value in the specified field.

That’s all.

DrRacket provides an interactive program development environment. It comes with fewer and more amenities than some other conventional IDEs. It certainly lacks code wizards that turn programming into a guessing game. The most prominent features for you are:

the control panel;
the definitions area;
the interaction area.

The latter provides a so-called read-eval-print loop.Many people call this REPL an interpreter. This is wrong. It is really just a mechanism for evaluating expressions and statements one at a time; the evaluation may exploit interpretation or compilation. See, you already learned something. A read-eval-print loop "reads" an expression at a time, evaluates it, and prints the result. That’s all.

The interaction area is quite convenient for experimentation. Here are some experiments:

> (+ 1 2 3 4 5 6 7 8 9 0)
45
> (- 1 2 3 4)
-8
> (sin 0)
0
> (sqrt -1)
0+1i
> (string-length "hello world")
11

If your definitions area contains some definitions and you click the RUN button, your expressions are evaluated in the context of these definitions. Thus, if you defined sign (see above) in your definitions area, you could conduct the following interactions in DrRacket after clicking RUN:

> (sign 55)
1
> (sign -9)
-1
> (sign 0)
0

1.2 Design Recipe

As far as you are concerned, design has two dimensions: process and the organization of data definitions. Let’s look at the design process first. We call the process a design recipe, and here are the six mandatory steps:

Analyze the problem. The analysis should identify the information that you need to represent as data in your chosen language. Since Racket is an untyped language, you can use structured English to document this decision. The result is called a data definition.
A data definition names and describes a set of elements from the ever-expanding universe of data in Racket.
The problem analysis should also produce a concise description of the program’s task. One line of precise English suffices; keep in mind that you should describe what the function computes not how it computes it.
In addition, formulate a function signature using the data definitions from step 1 and the built-in sets of data.
A concise purpose statement as in step 2 is too abstract to provide much insight into the concrete workings of the function. It is therefore important to illustrate the purpose statement with examples. To be precise, the examples should illustrate what output functions should produce for which inputs.
To this end you need to generate input data. Use the data definition. It tells you whether some piece of data belongs to a chosen set, and it also tells you how to generate input examples. (If it doesn’t, it isn’t a data definition.) The organization of the data definition implicitly suggests which kinds of inputs you should consider.
Also use the problem statement. It may point out special cases and you need to understand what they should do.
If you cover a good part of the input space, you will get a good understanding of the capabilities and limits of the function. When anything is ambiguous, ask the "customer" for a specification of the problem statement now.
You can now turn the data definition for the inputs into a template, that is an outline of the program. This is not some informal pseudo-code essay, but a transformation of all the knowledge in the data definition into a code layout.
Note that all template for functions with the same input signature are identical. The construction of the template does not exploit any knowledge about the output or the functionality of the program.
Technically you can create the template right after step 2. Creating input examples may help, however.
Now it is time to code, that is, to define the function.
The template lays out what data the function consumes. Any information that it computes must be the result of combining these pieces of data. The combination can exploit Racket functions, functions you have defined, functions you intend to define—create a stub for those—and nothing else.
Last but not least, you turn the examples into unit tests so that the Racket unit testing framework can run the tests and report problems (if any).
Add the import specification (require rackunit) to the top of the file to load the unit testing framework.

#lang racket

(require rackunit)

#| —————————————————————————–
   PROBLEM STATEMENT

   Design a function that determines whether a Cartesian point
   is within a rectangle in the Cartesian plain. Assume that
   the rectangle’s sides are parallel to the two axes. |#

;; DATA DEFINITION

(struct point (x y) #:transparent)
;; A Point is (point Number Number)

(struct rectangle (xmin xmax ymin ymax) #:transparent)
;; A Rectangle is (rectangle Number Number Number Number)
;; Assume: (rectangle xmin xmax ymin ymax) implies
;; – (< xmin xmax)
;; – (< ymin ymax)

;; is the Cartesian point p within the rectangle r?
;; Point Rectangle -> Boolean
;; given (point 3 4) and (rectangle 0 10 0 10), the point is in
;; given (point -3 4) and (rectangle 0 10 0 10), the point is out
(define (in? p r)
  (and (<= (rectangle-xmin r) (point-x p) (rectangle-xmax r))
       (<= (rectangle-ymin r) (point-y p) (rectangle-ymax r))))

(check-equal? (in? (point 3 4) (rectangle 0 10 0 10)) #t)
(check-equal? (in? (point -3 4) (rectangle 0 10 0 10)) #f)
Figure 1: Is the point inside the rectangle?

Figure 1 displays a sample problem and the complete code to solve it. The template for a function that consumes a Point and a Rectangle is this:

(define (in? p r)
  ... (point-x p) ... (point-y p) ...
  ... (rectangle-xmin r) (rectangle-xmax r) ...
  ... (rectangle-ymin r) (rectangle-ymax r) ...)

The idea is that a function must produce the answer from the data inside the given structure instances (or perhaps from the entire structs if there is a function that performs the work). So the template lists all these pieces of data. When you go from templates to functions then, your creative task is to pick functions that combine the available pieces of data.

#lang racket

(require rackunit)

#| —————————————————————————–
   PROBLEM STATEMENT

   Design a function that determines whether a Cartesian point
   is within some given shape. A shape can be either a
   rectangle whose sides are parallel to the two axes or a
   circle whose extent is determine by a center and a radius.

|#

;; DATA DEFINITION

(struct point (x y) #:transparent)
;; A Point is (point Number Number)

(struct rectangle (xmin xmax ymin ymax) #:transparent)
(struct circle (x y r))
;; A Shape is one of:
;; – (rectangle Number Number Number Number)
;; – (circle Number Number Number)

;; is the Cartesian point p within the shape s?
;; Point Shape -> Boolean
;; (circle 0 0 5) and (point 3 4) is in
;; (circle 0 0 5) and (point 10 10) is out
(define (in? p s)
  (cond
    [(rectangle? s) (rectangle-in? p s)]
    [(circle? s) (circle-in? p s)]))

;; Point Rectangle -> Boolean
;; is the Cartesian point p within the rectangle r?
(define (rectangle-in? p r)
  #f)

;; Point Circle -> Boolean
;; is the Cartesian point p within the circle s?
(define (circle-in? p s)
  #f)

(check-equal? (in? (point 3 4) (circle 0 0 5)) #t)
Figure 2: Is the point inside a circle or rectangle?

Figure 2 is about a data definition that unions two disjoint sets: the set of circle representations and the set of rectangle representations. Every language has some way of representing unions; in an untyped language such as Racket the programmer uses English to describe the union.

The template for a union-based data definition demands a case distinction. To be precise, it demands a case distinction that has as many branches as there are distinct subsets in the unions:

(define (in? p s)
  (cond
    [(rectangle? s)
     ... (point-x p) ... (point-y p) ...
     ... (rectangle-xmin s) (rectangle-xmax s) ...
     ... (rectangle-ymin s) (rectangle-ymax s) ...]
    [(circle? s)
     ... (point-x p) ... (point-y p) ...
     ... (circle-x s) (circle-y s) (circle-r s) ...]))

Since both cases deal with structs, both answer portions of the cond list all possible fields in the two structs.

As figure 2 shows, however, the definition itself may ignore this knowledge about the program. When functions or cond lines get large—say three to five lines—it is much better to formulate a "wish list" of auxiliary (helper) functions and to hand (some of) the data to these helper functions. The figure illustrates how such "wishes" are formulated as stubs that give a constant answer and are thus guaranteed to fail some of the unit tests.

#lang racket

(require rackunit)

#| —————————————————————————–
   PROBLEM STATEMENT

   Design a function that determines whether a Cartesian point
   is within some given shape. A shape can be either a
   rectangle (whose sides ...); a circle (whose extent ...) or
   a combination of two shapes.
|#

;; DATA DEFINITION

(struct point (x y) #:transparent)
;; A Point is (point Number Number)

(struct rectangle (xmin xmax ymin ymax) #:transparent)
(struct circle (x y r) #:transparent)
(struct union (top bot) #:transparent)
;; A Shape is one of:
;; – (rectangle Number Number Number Number)
;; – (circle Number Number Number)
;; – (union Shape Shape)

;; Point Shape -> Boolean
;; is the Cartesian point p within the shape s?
;; examples:
;; ...
;; (point 9 9) NOT in (union (circle 0 0 5)(rectangle 0 1 0 1))
;; (point 8 8) is in (union (circle 0 0 5) (rectangle 8 9 7 8))
;; (point 3 4) is in (union (circle 0 0 5) (rectangle 8 9 7 8))

;; TEMPLATE
(define (in? p s)
  (cond
    [(rectangle? s)
     ... (point-x p) ... (point-y p) ...
     ... (rectangle-xmin s) (rectangle-xmax s) ...
     ... (rectangle-ymin s) (rectangle-ymax s) ...]
    [(circle? s)
     ... (point-x p) ... (point-y p) ...
     ... (circle-x s) (circle-y s) (circle-r s) ...]
    [(union? s) ... p ..
                (in? p (union-top s))
                (in? p (union-bot s)) ...]))
Figure 3: Is the point inside the shape?

Figure 3 shows the next logical step: a self-referential data definition. This kind of definition is also called inductive. Unlike any definition of plain English terms, this kind of definition is circular and yet it makes sense. For a self-referential definition to make sense, it is necessary to generate examples and to ensure that the examples represent the desired kind of information.

Here are two straightforward examples of Shape:

(rectangle 0 1 0 1)
(circle 0 0 5)

The first is generated from the first line in the data definition, and the second one is generated from the second line. Now that there are two shapes, it becomes possible to use the third clause of the data definition to create four more examples of Shapes:

(union (rectangle 0 1 0 1) (circle 0 0 5))
(union (circle 0 0 5) (rectangle 0 1 0 1))
(union (circle 0 0 5) (circle 0 0 5))
(union (rectangle 0 1 0 1) (rectangle 0 1 0 1))

These four union shapes just combine the two existing shapes in all possible ways. But now there are four more shapes and it becomes possible to create even larger union shapes, e.g.,

(union
(union (rectangle 0 1 0 1) (rectangle 0 1 0 1))
(rectangle 0 1 0 1))

It is easy to see that self-referential data definitions allow the creation of arbitrarily large pieces of data—and that is what makes programming interesting. All interesting programs that deal with data unbounded in size rely on the existence of self-referential data definitions, whether the programmers who created them know and understand this fact or not. Programmers who do understand this fact are better off than those who don’t.

The template in the figure reflects one-to-one this data definition:

it consists of three cond lines, because the data definition deals with three kinds of sets;
each cond line uses a predicate for the corresponding structure to distinguish it from other cases;
each cond line contains selectors for the two structs; and
the last cond line contains recursive function calls for exactly those places where the data definition contain anchors for self-references.

The last point in figure 3 is the key innovation over figure 2. And it makes programming interesting!

1.3 Template

It is possible to formulate the actions that lead from a data definition to a template in a collection of questions and suggestive answers.

Does the data definition distinguish among different subclasses of data?
Your template needs as many cond clauses as subclasses that the data definition distinguishes.
How do the subclasses differ from each other?
Use the differences to formulate a condition per clause.
Do any of the clauses deal with structured values?
If so, add appropriate selector expressions to the clause.
Does the data definition use self-references?
Formulate ``natural recursions'' for the template to represent the self-references of the data definition.
Some data definitions refer to others. For those, use this advice:
Does the data definition refer to another data definition?
Contemplate the development of a separate template. See HtDP/2e.
Figure 4: How to translate a data definition into a template

← prev up next →

1	Some Racket, First Design
2	Some More Design, Some More Racket
3	Design with Accumulators

Does the data definition distinguish among different subclasses of data?	Your template needs as many cond clauses as subclasses that the data definition distinguishes.
How do the subclasses differ from each other?	Use the differences to formulate a condition per clause.
Do any of the clauses deal with structured values?	If so, add appropriate selector expressions to the clause.
Does the data definition use self-references?	Formulate ``natural recursions'' for the template to represent the self-references of the data definition.