
17 — CPS

Tuesday, 10 March 2020

Presenters (1) Erin Burba & Joshua Goldman, (2) Emmanuel Manolios & Benjamin Quiring

Status Report

Thus far, we have developed a small language for modeling several key aspects of conventional programming languages. Here again is the abstract grammar of the language:
AssExpr  =
;  every language has basic constants and operations on them:
  Int ||
  (node O AssExpr AssExpr) ||
  (if-0 AssExpr AssExpr AssExpr) || ; includes bounded local control
;  languages have lexically scoped variables:
  Var ||
  (decl Var AssExpr AssExpr) ||
;  languages have higher-order functions and objects (multi-entry functions)
  (fun Var AssExpr) || ; also models unbounded local control
  (call AssExpr AssExpr) ||
;  and most of the languages you use have mutation by default
  (set  Var AssExpr) ||
  (sequ AssExpr AssExpr)
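
Here is a sketch of a small program in this language that exercises several of these constructs at once; it assumes the usual store semantics for set and that sequ yields the value of its second sub-expression:
; bind "x" to 10 and "bump" to a function that adds its argument to "x";
; call "bump" on 32; then look up the new value of "x"
(decl "x" 10
  (decl "bump" (fun "n" (set "x" (node + "x" "n")))
    (sequ (call "bump" 32)
          "x")))
; the whole expression evaluates to 42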

Our model can thus explain the following notions of computational language constructs:

visible                       meaning

literal constants and ops     arithmetic
expressions                   value
scope of a variable binding   environment
assignable variable           location
mutation effects              store

By “explain” we mean more than merely reusing the same concept from the implementation language.

The “convex hull” of these concepts includes many you might not immediately recognize. Java has array constants, and Python has list constants and operations on them; some of these operations use mutation. You can model these ideas with the concepts introduced thus far: combine “arithmetic” with store-allocating constants that come with several different slots. Similarly, the notion of a store includes input and output streams. An input stream is a cell that contains a sequence of things from which we can remove the front; an output stream is a cell that we mutate when we append a new output to the end.
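
For instance, here is one way to sketch an output stream in the model language: a cell "out" holds everything written so far, and every write mutates it. The operation append is a hypothetical primitive chosen only for this illustration, and 0 stands in for the empty sequence:
(decl "out" 0
  (sequ (set "out" (node append "out" 1))
        (sequ (set "out" (node append "out" 2))
              "out")))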

Types are orthogonal to the above space of ideas. They represent claims about computational language constructs, and they impose restrictions. They do not add to the space per se.

Still, if you think about your homework and the languages you use, one missing idea should immediately come to mind.

The above completely excludes non-local control.

In your homework you have used at least two such computational constructs: signalling a fatal error and raising/catching exceptions. When our interpreter discovers an erroneous situation, say the application of a number to some arguments, the interpreter uses the error signaling mechanism of the implementation language to stop the computation and signal a problem.

Concretely, we don’t even have an explanation for the simplest non-local control operation: stop (the program). You may also wish to understand constructs such as Java’s try/catch exception mechanism, JavaScript’s async/await, a Python-style generator, and so on. This is what we mean by non-local control.

Approach

In principle, we could change the interpreter. We won’t. See the note at the end. Instead, we will tackle this problem with two different approaches.

The first approach will focus on a source-to-source transformation. That is, over the next couple of lectures, we will introduce a function that consumes an AssExpr and produces one—and this resulting one can express non-local control.

The second approach will require a switch from interpreters to abstract machines, a slightly different way of modeling and understanding the meaning of computational language constructs.

Why? There are three reasons for using a source-level approach.
  • To this day, JavaScript programmers must employ a manual version of this transformation to implement certain control behaviors. So you should understand the idea well enough that you can implement it yourself.

  • Every programmer must understand the very idea of a source translation and its limitations.

  • In conventional PL courses of this kind, we would apply the transformation to the interpreter, just like the store-passing one. But some of the programming languages you have chosen would make a transformation at the interpreter level rather awkward. So I’d rather not impose this trouble on you.

The goal of the source transformation is to identify the basic (or atomic) instructions in the order in which they are to be run and to re-arrange the term in such a way that the basic instructions are run in this order.

Continuation

The first step is to identify what “instruction” means and how it relates to the rest of a program.

Consider this expression in our model language:

(node + (node + 1 2) (node + 3 4))

Which sub-expression gets evaluated first?

Since our language is supposed to evaluate sequences of sub-expressions from right to left, the second addition is the instruction to be run next (and we designed the interpreter to do just that):

(node + (node + 1 2) (node + 3 4))
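
By the way, this right-to-left order becomes observable as soon as effects enter the picture. Here is a small sketch; it assumes that set mutates the variable’s location and that sequ yields the value of its second sub-expression:
(decl "x" 0
  (node + "x"                      ; evaluated second: "x" is 1 by now
          (sequ (set "x" 1) 2)))   ; evaluated first: sets "x" to 1, yields 2
; right-to-left evaluation yields 1 + 2 = 3; left-to-right would yield 0 + 2 = 2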

Let’s split the expression into its two pieces: the one to be computed next and the rest:
to be computed next:  (node + 3 4)
the rest:             (node + (node + 1 2) [ ****])

The remainder isn’t quite an expression but an expression with a hole, as indicated by the bracketed sequence of stars. A better way of representing the rest of the computation is with a function:
to be computed next:  (node + 3 4)
the rest:             (fun of-right
                        (node +
                              (node + 1 2)
                              of-right))
We refer to such a function as a continuation. In this class, we use “instruction” for the expression to be evaluated next.

Now that we have separated the two pieces we can ask how we put them back together. And obviously the point is that we call the continuation on the result of the instruction:
[call (fun of-right    ; the continuation
       (node +
        (node + 1 2)
        of-right))
 
      (node + 3 4)]    ; the instruction

What this points out, though, is that writing down the instruction per se is unsatisfying. It misses the point that there is more left to do—the rest of the computation or the continuation—once we have the result for this one instruction. Here is how we express this idea:
the instruction:    (fun k [call k (node + 3 4)])
its continuation:   (fun of-right
                      (node +
                            (node + 1 2)
                            of-right))
That is, we parameterize the basic instruction over the representation of the continuation—a plain function—and call it:
[call (fun k [call k (node + 3 4)]) ; the first instruction
      (fun of-right                 ; its continuation
        (node +
          (node + 1 2)
          of-right))]
If you use the informal laws of high school calculation with functions, you can see that you get the old expression back.
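
Concretely, the informal calculation substitutes the argument for the function parameter twice; this is a sketch of the textual reasoning, not the interpreter’s actual call-by-value order:
  [call (fun k [call k (node + 3 4)])
        (fun of-right (node + (node + 1 2) of-right))]
; substitute the continuation for k in the body of the first function:
= [call (fun of-right (node + (node + 1 2) of-right))
        (node + 3 4)]
; substitute (node + 3 4) for of-right in the continuation’s body:
= (node + (node + 1 2) (node + 3 4))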

Let’s see how we can continue to apply this transformation. Our next focus has to be the complex expression within the continuation:
(node +
      (node + 1 2)
      of-right)
It is impossible to run the outer addition instruction, so clearly the inner one is the next to be separated out. The separation of the two pieces works like this:
the instruction:    (fun k [call k (node + 1 2)])
its continuation:   (fun of-left
                      (node +
                            of-left
                            of-right))

Putting it all together in one place yields this expression:
[call (fun k [call k [node + 3 4]])
      (fun of-right
           [call (fun k [call k [node + 1 2]])
                 (fun of-left
                      [node + of-left of-right])])]
This expression looks complicated, roughly because it is like a machine-level program. The first two highlighted expressions are the basic expressions contained in the original program. The third one is the one that results from pushing the process all the way through.

A second look also tells us that it now does not matter in what order our interpreter evaluates these expressions. Every sub-expression is now either a value—a number, a variable, or a function—an addition of such values, or a call of one such basic sub-expression on another one. From a high-level perspective, this kind of arrangement determines the full ordering of evaluation.

Before we move on, let’s re-consider the original proposition, namely, teasing apart the program into basic instructions and connections between them (continuations). What we ignored is that this nested addition expression may be situated in a program context that is awaiting the expression’s result. But we know how to represent this context:

It is a continuation K0.

What we do have to figure out is where to use this function.

One moment’s reflection tells us that K0 is waiting for the final result so we must insert a call to this function from the “bottom” of our expression:
[call (fun k [call k [node + 3 4]])
      (fun of-right
           [call (fun k [call k [node + 1 2]])
                 (fun of-left
                      [call K0 [node + of-left of-right]])])]
And now we have truly considered all possibilities.
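
For illustration only, suppose the nested addition had appeared as the argument of a doubling operation. Then the surrounding program and the continuation K0 would look like this:
; a hypothetical surrounding program:
(node * 2 (node + (node + 1 2) (node + 3 4)))
; the continuation K0 of the nested addition within it:
(fun of-sum (node * 2 of-sum))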

Values and Basic Instructions

The example motivates the need to clarify which sub-expressions are “atomic” within programs. Here we use this term for two different kinds of terms:
  • A value is a term that we can immediately send to a continuation. There is nothing left to compute. Obviously literal constants such as integers are values. So are functions, a point we made with their very introduction into our vocabulary. And we treat variables as values because they stand for values—even if an assignment statement can change which value a variable stands for.

  • A basic instruction is the use of a primitive operation or a function call. Thus, for example, we have these basic instructions:
    [node + 1 2]
    [node * 3 4]
    [call "f" "a"]
    (where f and a are variables).

One form of expression is clearly missing from the above consideration: if-0. So let’s look at two concrete if-0 expressions:
[if-0 [call "f" "a"] "thursday" [call "g" "b"]]
 
[if-0 0 [call "f" "a"] [call "g" "b"]]
In the first expression the next instruction to be evaluated is in the test position, because we need its value. In the second one we have an actual value in the test position, and because of this, we know we must evaluate the first alternative (the then-branch) and discard the second one. Generalizing from here, this suggests that we need to separate the test instruction from the rest of the conditional and then consider each branch separately in the context of the entire expression’s continuation.
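
Schematically, the first example thus separates into the test instruction and a continuation that dispatches on the test’s result; the dots mark sub-terms that still need the same treatment, and k stands for the continuation of the entire if-0 expression:
; the instruction:  [call "f" "a"]
; its continuation:
(fun of-tst
  (if-0 of-tst
        (call k "thursday")
        .. (call "g" "b") .. k))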

So let’s look at all expression types and figure out how to separate them into an instruction and a continuation: Convention: a lowercase letter stands for a variable, an uppercase letter for something at least as complex as a basic instruction.

expression          what is the instruction     what next, k is the continuation

n : Int             none (a value)              (call k n)
x : Var             none (a value)              (call k x)
(node o L R)        R                           (fun r .. k .. L .. r)
(node o L r)        L                           (fun l [call k (node o l r)])
(fun x B)           none (a value)              (call k (fun x .. B ..))
(call F A)          A                           (fun a .. F .. a .. k)
(call F a)          F                           (fun f .. f .. a .. k)
(sequ E1 R)         E1                          (fun e1 .. R .. k)
(if-0 T TH EL)      T                           (fun t (if-0 t .. TH .. k .. EL .. k))

The table suggests two observations.

First, to make this systematic, we clearly need to recursively deal with sub-expressions inside of continuations, especially those surrounded by dots.

Second, we still have a problem case: functions. More precisely, we don’t know how to convert the inside of functions.

Continuation Passing

While functions are values, their (only) use is to be called eventually, at which point their body sub-expression is evaluated, and we would like to evaluate a version of the body like the one above. The problem is that functions can be called from many different places. Each place represents a different rest of the computation, that is, a different continuation. All we know is that when the function has completed its computation, the result must be sent back to the correct continuation.

So assume the function

[fun x [node + [node + 1 2] [node + 3 4]]]

is called from two sites: K0 and K1. Based on what we already know from above, the transformation of the body of this function must then look as follows:
[call (fun k [call k [node + 3 4]])
      (fun of-R
           [call (fun k
                   [call k
                     [node + 1 2]])
                 (fun of-L
                      [call K0
                            [node + of-L of-R]])])]

 

[call (fun k [call k [node + 3 4]])
      (fun of-R
           [call (fun k
                   [call k
                     [node + 1 2]])
                 (fun of-L
                      [call K1
                            [node + of-L of-R]])])]
The highlight points out that these terms differ in only one place: K0 versus K1.

Van Wijngaarden suggested this idea as early as September 1964. The fully worked-out continuation-passing idea didn’t get developed until the early 1970s though, and then independently by Fischer, Reynolds, Strachey, and Wadsworth. Steele and Sussman truly put it on the map with implementations in Scheme and for the first Scheme compiler in the late 1970s. This is the same Steele who first specified Java.

Additionally, the highlight should remind you of the abstraction of two similar expressions that differ in only one place, where distinct values show up:
[fun call_site
     [call (fun k [call k [node + 3 4]])
           (fun of-R
                [call (fun k [call k [node + 1 2]])
                      (fun of-L
                           [call call_site
                                 [node + of-L of-R]])])]]
In our model language, functions take only one argument and here we have decided to pass in a representation of the call site first.

When it comes to calling this function, we must pass the continuation first, in a curried manner:

original call site     call site after separation of instructions & continuation

(call f 42)            [call (call f K0) 42]
(call f 21)            [call (call f K1) 21]
And because of this treatment of functions we speak of continuation passing and call the transformation the continuation-passing transformation.
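
One last practical point, as a sketch: a transformed program is a function that waits for a continuation, so to run it we must hand it an initial, top-level continuation. The identity function will do, because it simply returns whatever final result is sent to it. Writing CPS-P for the transformation of a program P:
[call CPS-P (fun "x" "x")]
; this computes the same result as P itself; for example, the transformation
; of (node + 1 2) applied to the identity function computes 3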

The CPS Transformation

Our investigation by example suggests two mutually recursive functions for converting an entire program into this continuation-passing style:
  • The first function consumes an expression and produces a function that accepts a continuation. This continuation is the place to which the converted expression must send its result.

  • The second function consumes an expression and a variable k that represents a continuation. It splits the expression into a series of atomic instructions that pass their results to continuations and eventually to k.

Figure 70 displays these two functions; Figure 71 shows some basic unit tests, together with hand-transformed terms that serve as the expected values.

Lectures/17/cps.rkt

  #lang racket
   
  ;; a function that converts the given AssExpr to continuation-passing style
   
  (require "../8/ass-as-data.rkt")
  (require "../define-names.rkt")
  (define-names x y k f a)
  (define-names result-of-a of-f of-right of-left)
  (define-names of-rhs of-fst of-rst of-tst)
  ;; read 'of-xyz' as 'result-of-xyz'
   
  ;; Value is one of:
  ;; -- Integer
  ;; -- Variable because it stands for a value in a CBV PL
  ;; -- [fun Variable AssExpr] because functions are values
   
  ;; AtomicExpr is one of:
  ;; -- Integer
  ;; -- Variable
  ;; -- [node + Variable Variable]
   
  #; {AssExpr -> AssExpr}
  (define (cps expr)
    (receive-k expr))
   
  #; {AssExpr -> AssExpr}
  ;; create a function that consumes a continuation
  ;; and delivers its result to this continuation 
  (define (receive-k ae)
    (match ae
      [(? integer?)        [fun k (split-expr ae k)]]
      [(? string?)         [fun k (split-expr ae k)]]
      [(node o left right) [fun k (split-expr ae k)]]
      [(fun para body)     [fun k (split-expr ae k)]]
      [(call f a)          [fun k (split-expr ae k)]]
      [(if-0 t thn els)    [fun k (split-expr ae k)]]
      [(set lhs rhs)       [fun k (split-expr ae k)]]
      [(sequ fst rst)      [fun k (split-expr ae k)]]
      [(decl x v body)     [decl x (cps/value v) (receive-k body)]]))
   
  #; {AssExpr Var -> AssExpr}
  ;; split the expression into `atomic' computations until
  ;; a value can be delivered to the continuation k 
  (define (split-expr ae k)
    (match ae
      [(? integer?) [call k ae]]
      [(? string?)  [call k ae]]
      [(fun para body) (call k (cps/value ae))]
      [(decl x v body) [decl x (cps/value v) (split-expr body k)]]
      [(node o left right)
       [call (receive-k right)
             [fun of-right
                  [call (receive-k left)
                        [fun of-left
                             (call k
                                   (node o of-left of-right))]]]]]
      [(call f a)
       [call (receive-k a)
             [fun result-of-a
                  [call (receive-k f)
                        [fun of-f
                             (call (call of-f k) result-of-a)]]]]]
      [(if-0 tst thn els)
       (call (receive-k tst)
             (fun of-tst
                  (if-0 of-tst
                        (split-expr thn k)
                        (split-expr els k))))]
      [(set x rhs)
       (call (receive-k rhs) [fun of-rhs [call k [set x of-rhs]]])]
      [(sequ fst rst)
       (call (receive-k fst)
             (fun of-fst
                  (call (receive-k rst) (fun of-rst (call k of-rst)))))]))
   
  #; {ValueExpr -> AssExpr}
  ;; convert a value to direct cps 
  (define (cps/value v)
    (match v
      [(? integer?) v]
      [(? string?)  v] ; a variable also counts as a value in this CBV language
      [(fun x body) (fun k (fun x (split-expr body k)))]))
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   
  (provide cps)
   
  (provide x y k f a)
  (provide result-of-a of-f of-right of-left)
  (provide of-rhs of-fst of-rst of-tst)
   

Figure 70: The Continuation-Passing-Style Transformation

Lectures/17/cps-tests.rkt

  #lang racket
   
  (require "cps.rkt")
  (require "../8/ass-as-data.rkt")
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  (module+ test ;; sample translations by hand 
    (require rackunit)
   
    (define (cps-tail-call-f a)
      (call (fun k (call k a))
            (fun result-of-a
                 (call (fun k (call k f))
                       (fun of-f
                            (call (call of-f k)
                                  result-of-a))))))
   
    (define cps-call-f-a (fun k (cps-tail-call-f a)))
   
    (define cps-tail-id (fun k (fun x (call k x))))
    (define cps-id (fun k (call k cps-tail-id)))
   
    (define (cps-set nnn)
      (fun k
           [call (fun k (call k nnn))
                 (fun of-rhs
                      (call k (set x of-rhs)))]))
   
    (define cps-set-42 (cps-set 42)))
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  (module+ test  ;; tests 
    (check-equal? (cps (decl f (fun x x) [call f 42]))
                  (decl f cps-tail-id (fun k (cps-tail-call-f 42))))
   
    (check-equal? (cps (if-0 1 2 3))
                  (fun k
                       [call [fun k [call k 1]]
                             [fun of-tst
                                  [if-0 of-tst
                                        [call k 2]
                                        [call k 3]]]]))
    
    (check-equal? (cps (sequ (set x 42) (set x 21)))
                  (fun k
                       (call cps-set-42
                             (fun of-fst
                                  (call (cps-set 21)
                                        (fun of-rst
                                             (call k of-rst)))))))
   
    (check-equal? (cps (set x 42)) cps-set-42)
    
    (check-equal? (cps (call f a)) cps-call-f-a)
                  
    (check-equal? (cps (node + f a))
                  (fun k
                       (call (fun k (call k a))
                             (fun of-right
                                  (call (fun k (call k f))
                                        (fun of-left
                                             (call k (node + of-left of-right))))))))
   
    (check-equal? (cps (fun x (fun y x)))
                  (fun k (call k (fun k (fun x (call k (fun k (fun y (call k x)))))))))
    
    (check-equal? (cps (fun x x)) cps-id)
    (check-equal? (cps (fun x 42)) (fun k (call k (fun k (fun x (call k 42)))))))
   

Figure 71: Tests
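
If you wish to run these tests yourself, the standard Racket test runner works, assuming the relative require paths resolve in your copy of the course repository:
raco test Lectures/17/cps-tests.rkt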