17 — CPS
Tuesday, 10 March 2020
Presenters (1) Erin Burba & Joshua Goldman, (2) Emmanuel Manolios & Benjamin Quiring
Status Report
AssExpr =
  ; every language has basic constants and operations on them:
  Int ||
  (node O AssExpr AssExpr) ||
  (if-0 AssExpr AssExpr AssExpr) ||  ; includes bounded local control

  ; languages have lexically scoped variables:
  Var ||
  (decl Var AssExpr AssExpr) ||

  ; languages have higher-order functions and objects (multi-entry functions):
  (fun Var AssExpr) ||               ; also models unbounded local control
  (call AssExpr AssExpr) ||

  ; and most of the languages you use have mutation by default:
  (set Var AssExpr) ||
  (sequ AssExpr AssExpr)
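The grammar can be mirrored directly as data. The following is a minimal sketch, not the course's actual "../8/ass-as-data.rkt" (which is not shown here): the struct and field names are assumptions, variables are taken to be strings, and operators are assumed to be Racket procedures such as +.

```racket
#lang racket
;; Hypothetical stand-in for "../8/ass-as-data.rkt"; all names are assumptions.
;; Int is a Racket integer, Var is a Racket string.
(struct node (op left right) #:transparent) ; primitive operation
(struct if-0 (tst thn els) #:transparent)   ; bounded local control
(struct decl (var rhs body) #:transparent)  ; lexically scoped variable
(struct fun (para body) #:transparent)      ; one-argument function
(struct call (fn arg) #:transparent)        ; function application
(struct set (var rhs) #:transparent)        ; assignment
(struct sequ (fst rst) #:transparent)       ; sequencing

;; "declare x to be 5, then add 1 to x", written as data
(define example (decl "x" 5 (node + "x" 1)))
```

Because the structs are transparent, two terms built the same way compare as equal, which is what makes whole-term unit tests on a translator practical.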
visible | meaning
literal constants and ops | arithmetic
expressions | value
scope of a variable binding | environment
assignable variable | location
mutation effects | store
The “convex hull” of these concepts includes many you might not immediately recognize. Java has array constants; Python has list constants and operations on them. Some of these operations use mutation. You can model these ideas with the concepts introduced thus far. Combine “arithmetic” and store-allocating constants such as @ with several different slots. Similarly, the notion of a store includes input and output streams. An input stream is a cell that contains a sequence of things from which we can remove the front; an output stream is a cell that we mutate when we append a new output to the end.
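The stream remark can be made concrete with Racket boxes playing the role of cells. This is only a sketch under that assumption; the names in, out, read-in!, and write-out! are invented for the illustration.

```racket
#lang racket
;; Streams modeled as mutable cells (Racket boxes); all names are illustrative.
(define in (box '(1 2 3))) ; input stream: a cell holding a sequence
(define out (box '()))     ; output stream: a cell holding the outputs so far

;; remove and return the front of the input stream
(define (read-in!)
  (define front (first (unbox in)))
  (set-box! in (rest (unbox in)))
  front)

;; append one new output to the end of the output stream
(define (write-out! v)
  (set-box! out (append (unbox out) (list v))))

;; read two inputs, write their doubles
(write-out! (* 2 (read-in!)))
(write-out! (* 2 (read-in!)))
```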
Types are orthogonal to the above space of ideas. They represent claims about computational language constructs, and they impose restrictions. They do not add to the space per se.
The above completely excludes non-local control.
In your homework you have used at least two such computational constructs: signalling a fatal error and raising/catching exceptions. When our interpreter discovers an erroneous situation, say the application of a number to some arguments, the interpreter uses the error signaling mechanism of the implementation language to stop the computation and signal a problem.
Concretely, we don’t even have an explanation for the simplest non-local control operation: stop (the program). You may also wish to understand constructs such as Java’s try/catch exception mechanism, JavaScript’s async/await, a Python-style generator, etc. This is what we mean by non-local control.
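To illustrate what "using the error signaling mechanism of the implementation language" looks like, here is a small sketch in Racket. The helper apply-value is hypothetical and stands in for the part of an interpreter that applies a function value; it is not taken from the homework.

```racket
#lang racket
;; Hypothetical fragment of an interpreter: applying a non-function
;; stops the computation via Racket's own error mechanism.
(define (apply-value f a)
  (if (procedure? f)
      (f a)
      (error 'interpreter "cannot apply a number: ~e" f)))

;; The pending (+ 1 ...) is abandoned: control jumps straight to the
;; handler, which is exactly the non-local behavior described above.
(define result
  (with-handlers ([exn:fail? (lambda (e) 'stopped)])
    (+ 1 (apply-value 42 0))))
```

The interpreter itself never explains the jump; it borrows it from Racket, which is why the model so far has no account of non-local control.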
Approach
In principle, we could change the interpreter. We won’t. See the note at the end. Instead, we will tackle this problem with two different approaches.
The first approach will focus on a source-to-source transformation. That is, over the next couple of lectures, we will introduce a function that consumes an AssExpr and produces one.
The second approach will require a switch from interpreters to abstract machines, a slightly different way of modeling and understanding the meaning of computational language constructs.
To this day, JavaScript programmers must employ a manual version of this transformation to implement certain control behaviors. So you should understand the idea well enough that you can implement it yourself.
Every programmer must understand the very idea of a source translation and its limitations.
In conventional PL courses of this kind, we would apply the transformation to the interpreter, just like the store-passing one. But, some of the programming languages you have chosen would make a transformation at the interpreter level rather awkward. So I’d rather not impose this trouble on you.
The goal of the source transformation is to identify the basic (or atomic) instructions in the order in which they are to be run and to re-arrange the term in such a way that the basic instructions are run in this order.
Continuation
The first step is to identify what “instruction” means and how it relates to the rest of a program.
[call (fun of-right                      ; the continuation
        (node + (node + 1 2) of-right))
      (node + 3 4)]                      ; the instruction

[call (fun k [call k (node + 3 4)])      ; the first instruction
      (fun of-right                      ; its continuation
        (node + (node + 1 2) of-right))]

[call (fun k [call k [node + 3 4]])
      (fun of-right
        [call (fun k [call k [node + 1 2]])
              (fun of-left
                [node + of-left of-right])])]
A second look also tells us that it now does not matter in what
order our interpreter evaluates these expressions. Every
sub-expression is now either a value—
It is a continuation K0.
[call (fun k [call k [node + 3 4]])
      (fun of-right
        [call (fun k [call k [node + 1 2]])
              (fun of-left
                [call K0 [node + of-left of-right]])])]
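The rewriting steps can be checked by transliterating them into plain Racket, with lambda for fun and ordinary application for call. This is a sketch for intuition only; the names direct, step-1, step-3, and with-k0 are made up for the example.

```racket
#lang racket
;; direct style: (1 + 2) + (3 + 4)
(define direct (+ (+ 1 2) (+ 3 4)))

;; the rightmost addition is the instruction; the function is its continuation
(define step-1
  ((lambda (of-right) (+ (+ 1 2) of-right))
   (+ 3 4)))

;; both additions are separated from their continuations
(define step-3
  ((lambda (k) (k (+ 3 4)))
   (lambda (of-right)
     ((lambda (k) (k (+ 1 2)))
      (lambda (of-left)
        (+ of-left of-right))))))

;; the final result is sent to an explicit continuation K0
(define (with-k0 K0)
  ((lambda (k) (k (+ 3 4)))
   (lambda (of-right)
     ((lambda (k) (k (+ 1 2)))
      (lambda (of-left)
        (K0 (+ of-left of-right)))))))
```

All three expressions compute the same number; with-k0 additionally shows that the caller decides what happens to that number.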
Values and Basic Instructions
- A value is a term that we can immediately send to a continuation. There is nothing left to compute. Obviously literal constants such as integers are values. So are functions, a point we made with their very introduction into our vocabulary. And we treat variables as values because they stand for values, even if an assignment statement can change which value a variable stands for.
- A basic instruction is the use of a primitive operation or a function call. Thus, for example, [call f a] is a basic instruction (where f and a are variables).
[if-0 [call "f" "a"] "thursday" [call "g" "b"]]
[if-0 0 [call "f" "a"] [call "g" "b"]]
expression | what is the instruction | what next, k is the continuation
n : Int | none | (call k n)
x : Var | none | (call k x)
(node o L R) | R | (fun r .. k .. L .. r)
(node o L r) | L | (fun l [call k (node o l r)])
(fun x B) | none | (call k (fun x .. B ...))
(call F A) | A | (fun a (call k .. a))
(call F a) | F | (fun f .. f .. a .. k)
(sequ E1 R) | E1 | (fun e1 .. R .. k)
(if-0 T TH EL) | T | (fun t (if-0 t .. TH .. k .. EL .. k))
The table suggests two observations.
First, to make this systematic, we clearly need to recursively deal with sub-expressions inside of continuations, especially those surrounded by dots.
Second, we still have a problem case: functions. More precisely, we don’t know how to convert the inside of functions.
Continuation Passing
While functions are values, their only use is to be called eventually, at which point their body is evaluated, and we would like to evaluate a version of the body like the one above. The problem is that functions can be called from many different places. Each place represents a different rest of the computation, that is, a different continuation. All we know is that when the function has completed its computation, the result must be sent back to the correct continuation.
[fun x [node + [node + 1 2] [node + 3 4]]]
Van Wijngaarden suggested this idea as early as September 1964. The fully worked-out continuation-passing idea didn’t get developed until the early 1970s though, and then independently by Fischer, Reynolds, Strachey, and Wadsworth. Steele and Sussman truly put it on the map with implementations in Scheme and for the first Scheme compiler in the late 1970s. This is the same Steele who first specified Java.
[fun call_site
  [call (fun k [call k [node + 3 4]])
        (fun of-R
          [call (fun k [call k [node + 1 2]])
                (fun of-L
                  [call call_site [node + of-L of-R]])])]]
original call site | call site after separation of instructions & continuation
(call f 42) | [call (call f K0) 42]
(call f 21) | [call (call f K1) 21]
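The same protocol can be tried out in plain Racket: the converted function first receives the continuation of its call site and only then its argument. This is an illustrative sketch; K0 and K1 are stand-in continuations that merely tag the value they receive.

```racket
#lang racket
;; the converted function: first the call site's continuation, then x
(define f
  (lambda (call-site)
    (lambda (x)
      ((lambda (k) (k (+ 3 4)))
       (lambda (of-R)
         ((lambda (k) (k (+ 1 2)))
          (lambda (of-L)
            (call-site (+ of-L of-R)))))))))

;; two different call sites, hence two different continuations
(define K0 (lambda (v) (list 'K0 v)))
(define K1 (lambda (v) (list 'K1 v)))

(define r0 ((f K0) 42)) ; corresponds to [call (call f K0) 42]
(define r1 ((f K1) 21)) ; corresponds to [call (call f K1) 21]
```

The one function body serves both call sites because the place to return to is now an ordinary argument.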
The CPS Transformation
The first function consumes an expression and produces a function that accepts a continuation. This continuation is the place to which the converted expression must send its result.
The second function consumes an expression and a variable k that represents a continuation. It splits the expression into a series of atomic instructions that pass their results to continuations and eventually to k.
Figure 70 displays these two functions; figure 71 shows some basic unit tests, with additional hand-transformed terms.
Lectures/17/cps.rkt
#lang racket

;; a function that converts the given AssExpr to

(require "../8/ass-as-data.rkt")
(require "../define-names.rkt")

(define-names x y k f a)
(define-names result-of-a of-f of-right of-left)
(define-names of-rhs of-fst of-rst of-tst)
;; read 'of-xyz' as 'result-of-xyz'

;; Value is one of:
;; -- Integer
;; -- Variable because it stands for a value in a CBV PL
;; -- [fun Variable AssExpr] because functions are values

;; AtomicExpr is one of:
;; -- Integer
;; -- Variable
;; -- [node + Variable Variable]

#; {AssExpr -> AssExpr}
(define (cps expr)
  (receive-k expr))

#; {AssExpr -> AssExpr}
;; create a function that consumes a continuation
;; and delivers its result to this continuation
(define (receive-k ae)
  (match ae
    [(? integer?)        [fun k (split-expr ae k)]]
    [(? string?)         [fun k (split-expr ae k)]]
    [(node o left right) [fun k (split-expr ae k)]]
    [(fun para body)     [fun k (split-expr ae k)]]
    [(call f a)          [fun k (split-expr ae k)]]
    [(if-0 t thn els)    [fun k (split-expr ae k)]]
    [(set lhs rhs)       [fun k (split-expr ae k)]]
    [(sequ fst rst)      [fun k (split-expr ae k)]]
    [(decl x v body)     [decl x (cps/value v) (receive-k body)]]))

#; {AssExpr Var -> AssExpr}
;; split the expression into `atomic' computations until
;; a value can be delivered to the continuation k
(define (split-expr ae k)
  (match ae
    [(? integer?) [call k ae]]
    [(? string?) [call k ae]]
    [(fun para body) (call k (cps/value ae))]
    [(decl x v body) [decl x (cps/value v) (split-expr body k)]]
    [(node o left right)
     [call (receive-k right)
           [fun of-right
                [call (receive-k left)
                      [fun of-left (call k (node o of-left of-right))]]]]]
    [(call f a)
     [call (receive-k a)
           [fun result-of-a
                [call (receive-k f)
                      [fun of-f (call (call of-f k) result-of-a)]]]]]
    [(if-0 tst thn els)
     (call (receive-k tst)
           (fun of-tst (if-0 of-tst (split-expr thn k) (split-expr els k))))]
    [(set x rhs)
     (call (receive-k rhs) [fun of-rhs [call k [set x of-rhs]]])]
    [(sequ fst rst)
     (call (receive-k fst)
           (fun of-fst
                (call (receive-k rst)
                      (fun of-rst (call k of-rst)))))]))

#; {ValueExpr -> AssExpr}
;; convert a value to direct cps
(define (cps/value v)
  (match v
    [(? integer?) v]
    [(? string?) v] ; a variable is a Value (see above) and needs no conversion
    [(fun x body) (fun k (fun x (split-expr body k)))]))

;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(provide cps)
(provide x y k f a)
(provide result-of-a of-f of-right of-left)
(provide of-rhs of-fst of-rst of-tst)
Lectures/17/cps-tests.rkt
#lang racket

(require "cps.rkt")
(require "../8/ass-as-data.rkt")

;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(module+ test ;; sample translations by hand
  (require rackunit)

  (define (cps-tail-call-f a)
    (call (fun k (call k a))
          (fun result-of-a
               (call (fun k (call k f))
                     (fun of-f (call (call of-f k) result-of-a))))))

  (define cps-call-f-a (fun k (cps-tail-call-f a)))

  (define cps-tail-id (fun k (fun x (call k x))))
  (define cps-id (fun k (call k cps-tail-id)))

  (define (cps-set nnn)
    (fun k [call (fun k (call k nnn))
                 (fun of-rhs (call k (set x of-rhs)))]))
  (define cps-set-42 (cps-set 42)))

;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(module+ test ;; tests
  (check-equal? (cps (decl f (fun x x) [call f 42]))
                (decl f cps-tail-id (fun k (cps-tail-call-f 42))))

  (check-equal? (cps (if-0 1 2 3))
                (fun k [call [fun k [call k 1]]
                             [fun of-tst [if-0 of-tst [call k 2] [call k 3]]]]))

  (check-equal? (cps (sequ (set x 42) (set x 21)))
                (fun k (call cps-set-42
                             (fun of-fst
                                  (call (cps-set 21)
                                        (fun of-rst (call k of-rst)))))))

  (check-equal? (cps (set x 42)) cps-set-42)

  (check-equal? (cps (call f a)) cps-call-f-a)

  (check-equal? (cps (node + f a))
                (fun k (call (fun k (call k a))
                             (fun of-right
                                  (call (fun k (call k f))
                                        (fun of-left
                                             (call k (node + of-left of-right))))))))

  (check-equal? (cps (fun x (fun y x)))
                (fun k (call k (fun k (fun x (call k (fun k (fun y (call k x)))))))))

  (check-equal? (cps (fun x x)) cps-id)

  (check-equal? (cps (fun x 42))
                (fun k (call k (fun k (fun x (call k 42)))))))
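The unit tests compare terms structurally. One further way to gain confidence is to run a converted term and compare its result with that of the original. The sketch below is an illustration built on assumptions, not the course's code: it re-declares a hypothetical subset of the AssExpr structs (the real ones live in "../8/ass-as-data.rkt", which is not shown), restates receive-k and split-expr for the mutation-free fragment only, and adds a small environment-based interpreter interp and a helper run-cps that do not appear in the original. It also assumes the source program does not itself use the reserved names k, of-left, of-right, and so on.

```racket
#lang racket
;; Hypothetical stand-ins for the AssExpr structs; variables are strings,
;; operators are Racket procedures such as + and *.
(struct node (op left right) #:transparent)
(struct if-0 (tst thn els) #:transparent)
(struct fun (para body) #:transparent)
(struct call (fn arg) #:transparent)
(struct closure (para body env)) ; run-time value of a fun

;; the conversion, restricted to integers, variables, node, fun, call, if-0
(define (receive-k ae) (fun "k" (split-expr ae "k")))

(define (split-expr ae k)
  (match ae
    [(? integer?) (call k ae)]
    [(? string?) (call k ae)]
    [(fun x body) (call k (fun "k" (fun x (split-expr body "k"))))]
    [(node o l r)
     (call (receive-k r)
           (fun "of-right"
                (call (receive-k l)
                      (fun "of-left" (call k (node o "of-left" "of-right"))))))]
    [(call f a)
     (call (receive-k a)
           (fun "result-of-a"
                (call (receive-k f)
                      (fun "of-f" (call (call "of-f" k) "result-of-a")))))]
    [(if-0 t thn els)
     (call (receive-k t)
           (fun "of-tst" (if-0 "of-tst" (split-expr thn k) (split-expr els k))))]))

;; a small environment-passing interpreter (an assumption for this sketch)
(define (interp e env)
  (match e
    [(? integer?) e]
    [(? string?) (hash-ref env e)]
    [(node o l r) (o (interp l env) (interp r env))]
    [(if-0 t thn els)
     (if (equal? (interp t env) 0) (interp thn env) (interp els env))]
    [(fun x body) (closure x body env)]
    [(call f a)
     (let ([c (interp f env)]
           [v (interp a env)])
       (interp (closure-body c)
               (hash-set (closure-env c) (closure-para c) v)))]))

;; run a converted program by handing it the identity continuation
(define (run-cps e)
  (interp (call (receive-k e) (fun "v" "v")) (hash)))

;; apply "if x is 0 then 0 else x + (2 * 3)" to n
(define (program n)
  (call (fun "x" (if-0 "x" 0 (node + "x" (node * 2 3)))) n))
```

For the sample program, evaluating the original term with interp and evaluating the converted term with run-cps should agree, both for the zero branch and for the non-zero branch.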