22 — C, CC, CK
Friday, 27 March 2020
Presenters (1) Mason Eldridge & Kia Zafar, (2) Mitchell Gamburg & Da-Jin Chu
The single assignable variable in our stepper state machine contains the code of the program, that is, the “machine instructions.” These instructions control every machine transition. It is therefore known as the control-code.
; print each step of the calculation that reduces `expr` to a number
(define (driver initial)
  (display " ")
  (displayln initial)
  (define *control-code initial)
  (while (not (final? *control-code)) do
    (set! *control-code (transition *control-code))
    (display " = ")
    (displayln *control-code)))
Here I use the convention of starting the names of registers with *.
The best way of looking at this machine—
Over the course of this lecture we will tease apart how the machine searches for the next instruction by adding another register and then changing the data representation of what we put into this register. In the next lecture, we will add two registers to deal with variable declarations and assignments to variables.
The CC Machine
The CC, CK, CEK, and CESK machines are from my dissertation, where they played the role of proof tools. Since then they have become ubiquitous elements in the PL literature, which is why I teach them here.
1. stop, if the control code is just a number;
2. partition the expression into an instruction proper and the evaluation context;
3. produce the number from a proper instruction;
4. place the number back into the evaluation context;
5. go back to step 1.
By separating the arithmetic expression in which the instruction proper can be found from its evaluation context, we can gain insight into a machine’s workings. We don’t actually need new definitions to write down a machine that keeps these two pieces separate and makes the search explicit.
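Steps 2 and 4 of the list above can be sketched as a pair of functions. This is only an illustration under assumptions: an ArithmeticExpr is taken to be a number or a three-element list, the hole is represented by the constant HOLE, and the names decompose and plug are chosen for this sketch rather than taken from the lecture code.

```racket
#lang racket

;; Assumption: ArithmeticExpr = Number | (list ArithmeticExpr Op ArithmeticExpr)
;; Assumption: HOLE marks the place where the instruction proper was found
(define HOLE '[---])

;; ArithmeticExpr -> (values ArithmeticExpr E)
;; step 2: split a non-number expression into instruction proper and context
(define (decompose ae)
  (match ae
    ;; both operands are numbers: ae itself is the instruction proper
    [(list (? number? l) o (? number? r)) (values ae HOLE)]
    ;; the right side is already a number: search on the left
    [(list ae_1 o (? number? r))
     (define-values (instr E) (decompose ae_1))
     (values instr (list E o r))]
    ;; otherwise: search on the right
    [(list ae_1 o ae_2)
     (define-values (instr E) (decompose ae_2))
     (values instr (list ae_1 o E))]))

;; E ArithmeticExpr -> ArithmeticExpr
;; step 4: place `inside` into the hole of `context`
(define (plug context inside)
  (match context
    [(== HOLE) inside]
    [(list E o (? number? n)) (list (plug E inside) o n)]
    [(list ae_1 o E) (list ae_1 o (plug E inside))]))

;; usage: the instruction proper of the running example is (84 / 21)
(define-values (instr E) (decompose '((1 + 2) * (3 - (84 / 21)))))
(plug E 4) ; the expression with 4 in place of (84 / 21)
```

Note that decompose performs the entire search in one recursive call. The CC machine described next breaks this search into individual transitions.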
Figure 82 displays its complete tabular description. Each state consists of two pieces of data and is therefore represented by two columns. The condition in the fifth column specifies additional constraints on the control code or how to evaluate it. As you read this table, keep in mind that n, n1, and n2 stand for numbers while e1 and e2 stand for arbitrary ArithmeticExpr.
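| current Control | current Context | next Control | next Context   | if           |
|-----------------|-----------------|--------------|----------------|--------------|
| (e1 op n2)      | E               | e1           | E [(-) op n2]  | !(e1 ∈ N)    |
| (e1 op e2)      | E               | e2           | E [e1 op (-)]  | !(e2 ∈ N)    |
| (n1 op n2)      | E               | n            | E              | n = n1 op n2 |
| n               | E [(-) op n1]   | (n op n1)    | E              |              |
| n               | E [e1 op (-)]   | (e1 op n)    | E              |              |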
- states: a state consists of two pieces of data: an ArithmeticExpr and an evaluation context E
- transitions: the five transitions fall into three categories:
  - with the first two rules, the machine searches for a proper instruction. If the control register contains an arithmetic expression whose right side is already a number, it searches on the left; otherwise it searches on the right. A search step fills the existing evaluation context with the remainder of the arithmetic expression, that is, one with a hole at the place where the machine searches next.
  - with the third rule, it executes a proper instruction. Here we know how to execute a +, -, *, or / instruction on two numbers. When the set of proper instructions becomes more complicated, we may have to add machine instructions.
  - with the last two rules, the machine “returns” the result of a proper instruction to the context. It removes the partial arithmetic operation that surrounds the hole. This operation is itself a context, so it can be filled with the number to obtain another expression, and this expression becomes the new control code.
- initial state: an initial state consists of an expression and an empty context
- final state: a final state consists of a number and an empty context
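[Trace of the CC machine in two columns, *control and *context. The surviving rows show the control codes (84 / 21), (3 - 4), (1 + 2), and (3 * -1) being reduced to 4, -1, 3, and -3, each in a “return state”; the run ends with *control = -3 and *context = (-).]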
Implementing the CC Machine
Figure 83 displays the implementation of the CC machine. Its driver uses two registers: *control and *context. As before, the implementation represents the set of transitions with a function. The five match* clauses precisely correspond to the five rules in the table.
Racket Note Racket’s match* matches two values to two patterns. So the two columns are matched as two patterns. The match* expression returns two values, which go into the two registers next.
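To see match* and multiple return values in isolation, here is a small sketch that is unrelated to the machine itself. The function step and the registers *n and *acc are hypothetical and exist only for this illustration.

```racket
#lang racket

;; Hypothetical two-register step: counts `n` down while summing into `acc`
;; Number Number -> (values Number Number)
(define (step n acc)
  (match* (n acc)
    ;; one pattern per column, just like one pattern per register
    [{0 acc}                           (values 0 acc)]
    [{(? positive? n) (? number? acc)} (values (sub1 n) (+ acc n))]))

;; two registers, initialized and then updated simultaneously
(define-values (*n *acc) (values 3 0))
(set!-values (*n *acc) (step *n *acc)) ; *n is now 2 and *acc is now 3
```

The set!-values form plays the same role here as in the driver: it receives both results of the transition and assigns them to the two registers in one step.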
The (app split pat) pattern applies split to the context and
then matches the result with pat—
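The following sketch shows an app pattern on a simpler problem. The helper split-last and the function describe are hypothetical; they only illustrate that the function is applied first and its result is matched second.

```racket
#lang racket

;; Hypothetical helper: a non-empty list -> (list last-item all-other-items)
(define (split-last l)
  (list (last l) (drop-right l 1)))

;; [NEListof Any] -> String
;; (app split-last pat) calls split-last on the matched value,
;; then matches the resulting two-element list against pat
(define (describe l)
  (match l
    [(app split-last (list (? number? n) others))
     (format "ends in the number ~a after ~a other items" n (length others))]
    [(app split-last (list x others))
     (format "ends in ~a" x)]))

(describe '(a b 3))
(describe '(a b c))
```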
Since splitting an evaluation context for the hole and the surrounding
arithmetic operation is not a simple step—
Lectures/22/cc.rkt
#lang racket

(require "../21/while.rkt" "show-state.rkt")

;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#; {ArithmeticExpr -> Number}
;; print each step of the calculation that reduces `expr` to a number
(define (driver initial)
  (define-values (*control *context) (load initial))
  (show-state *control *context)
  (while (not (final? *control *context)) do
    (set!-values (*control *context) (transition *control *context))
    (show-state *control *context))
  (unload *control *context))

#; {ArithmeticExpr E -> ArithmeticExpr E}
;; one clause per row of the transition table
(define (transition control context)
  (match* ( control context )
    ;; search on the left
    [{ (list (? not-number? ae_1) o (? number? ae_2)) E}
     (values ae_1 (fill context (list HOLE o ae_2)))]
    ;; search on the right
    [{ (list ae_1 o (? not-number? ae_2)) E}
     (values ae_2 (fill context (list ae_1 o HOLE)))]
    ;; execute a proper instruction
    [{ (list (? number? l) o (? number? r)) E}
     (define n (reduce control))
     (show-return-state n context)
     (values n context)]
    ;; return a number to a context whose hole is on the left
    [{ (? number? n) (app split `((,(? (is? HOLE)) ,o ,ae_2) ,E))}
     (values (list n o ae_2) E)]
    ;; return a number to a context whose hole is on the right
    [{ (? number? n) (app split `((,ae_1 ,o ,(? (is? HOLE))) ,E))}
     (values (list ae_1 o n) E)]))

#; {E -> (list E E)}
;; separate the innermost operation around the hole from the rest of the context
(define (split context)
  (define-values (inside E)
    (let return ([context context])
      (match context
        [(list (? (is? HOLE)) o ae_2) (values context HOLE)]
        [(list ae_1 o (? (is? HOLE))) (values context HOLE)]
        [(list ae_1 o (? number? k))
         (define-values (ae E) (return ae_1))
         (values ae (list E o k))]
        [(list ae_1 o ae_2)
         (define-values (ae E) (return ae_2))
         (values ae (list ae_1 o E))])))
  (list inside E))

#; {E ArithmeticExpr -> ArithmeticExpr}
;; put `inside` into the hole of `context`
(define (fill context inside)
  (match context
    [(? (is? HOLE)) inside]
    [(list E o (? number? n)) (list (fill E inside) o n)]
    [(list ae_1 o E) (list ae_1 o (fill E inside))]))

#; {ArithmeticExpr -> ArithmeticExpr E}
(define (load ae)
  (values ae HOLE))

#; {ArithmeticExpr E -> Number}
(define (unload n _)
  n)

#; {ArithmeticExpr E -> Boolean}
(define (final? control context)
  (and (number? control) (equal? HOLE context)))

#; {(list Number o Number) -> Number}
;; okay this is a trick, but almost every language has this trick
(define ns (make-base-namespace))
(define (reduce ae)
  ( (eval (second ae) ns) (first ae) (third ae)))

(define not-number? (compose not number?))

(define HOLE '[---])

;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(module+ test
  (require rackunit)
  (check-equal? (driver '((1 + 2) * (3 - (84 / 21)))) -3))
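The reduce function above looks up the operator with eval. If you prefer to avoid eval, one alternative is an explicit table of operations. The following is a sketch under the assumption that the only operators are the symbols +, -, *, and /; the names OPERATIONS and reduce/table are made up for this illustration and are not part of the lecture code.

```racket
#lang racket

;; Assumption: the operators are exactly the symbols + - * and /
(define OPERATIONS (hash '+ + '- - '* * '/ /))

;; (list Number Op Number) -> Number
;; looks up the operator in the table and applies it to the two numbers
(define (reduce/table ae)
  (match ae
    [(list (? number? l) o (? number? r))
     ((hash-ref OPERATIONS o) l r)]))

(reduce/table '(84 / 21))
(reduce/table '(3 - 4))
```

The table makes the set of proper instructions explicit, at the cost of one more definition to maintain when the language grows.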
One other small detail distinguishes this implementation from the C machine. The machine itself accepts an ArithmeticExpr but runs a two-register configuration. To bridge the gap, we use a load function. When the machine reaches a final state, we unload the registers to obtain the final result for the driver.
Note When you step through the machine’s execution, you notice that it
is in a “return” state twice in a row. The first produces the value, the
second one actually returns it to the surrounding context. An implementation
could merge these two steps—
The CK Machine
When the CC machine goes into a return state, it must split the evaluation
context into the innermost arithmetic operation—
In short, the CC machine treats the evaluation context as a “last-in,
first-out” data structure—
The representation as a context—
(L op n), which denotes the context ((-) op n)
(R op e), which stands for the context (e op (-))
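To connect the two representations, here is a sketch that converts a CC-style context into a list of such frames, innermost frame first. It is an illustration under assumptions: HOLE is the hole constant of the CC implementation, and the tags 'L and 'R as well as the name context->frames are chosen for this sketch only.

```racket
#lang racket

;; Assumption: the hole is represented as in the CC machine
(define HOLE '[---])

;; E -> [Listof Frame]
;; lists the frames of `context`, with the innermost frame first
(define (context->frames context)
  (let loop ([context context] [frames '()])
    (match context
      [(== HOLE) frames]
      ;; the hole sits to the left of a number: an (L op n) frame
      [(list E o (? number? n)) (loop E (cons (list 'L o n) frames))]
      ;; the hole sits to the right of an expression: an (R op e) frame
      [(list ae_1 o E) (loop E (cons (list 'R o ae_1) frames))])))

;; the context around (84 / 21) in the running example
(context->frames '((1 + 2) * (3 - [---])))
```

The first frame of the resulting list is the one that surrounds the hole directly, which is exactly the frame a return transition needs.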
Figure 84 presents the tabular description of the revised machine.
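| current Control | current Stack | next Control | next Stack   | if          |
|-----------------|---------------|--------------|--------------|-------------|
| (e1 o n2)       | K             | e1           | K, (L o n2)  | !(e1 ∈ N)   |
| (e1 o e2)       | K             | e2           | K, (R o e1)  | !(e2 ∈ N)   |
| (n1 o n2)       | K             | n            | K            | n = n1 o n2 |
| n               | K, (L o n1)   | (n o n1)     | K            |             |
| n               | K, (R o e1)   | (e1 o n)     | K            |             |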
Implementing the CK Machine
The implementation of the CK machine relies on the existence of a stack data structure. Sadly, most existing programming languages commingle the idea of creating and using a stack with mutating a data structure directly. When we use them for a state machine, however, we wish to separate the register assignment from the operations on stacks as data. In short, we follow Josh Bloch’s advice for Java programmers and “favor immutability.” This permits thorough testing of the stack data structure independently of changing a variable that contains one.
; Frame = [list 'right op ArithmeticExpr]
;      || [list 'left op Number]
;
; type Stack[Frame]
(provide
  ; Stack
  mt
  ; Stack[X] X -> Stack[X]
  push
  ; Stack[X] -> (list X Stack[X])
  pop)
mt is the empty stack constant
push adds an X to a K
pop simultaneously returns the last X added and the remainder of the stack.
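The lecture's stack.rkt is not shown here. As a minimal sketch of an immutable stack that satisfies the interface above, one could represent a stack as a plain list; this representation is an assumption of the sketch, not necessarily the one the course uses.

```racket
#lang racket

;; A sketch of one possible stack module; the list representation is an assumption
(provide mt push pop)

;; Stack[X] = [Listof X], with the most recently pushed element first
(define mt '())

;; Stack[X] X -> Stack[X]
;; creates a new stack; the given stack remains unchanged
(define (push stack x)
  (cons x stack))

;; Stack[X] -> (list X Stack[X])
;; assumes `stack` is not mt
(define (pop stack)
  (list (first stack) (rest stack)))
```

Because push and pop return new values instead of mutating their argument, they can be tested on their own, and the machine remains the only place where a register is assigned.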
With this kind of stack, the CC machine can easily be turned into a CK machine, and the result can be seen in figure 86.
Lectures/22/ck.rkt
#lang racket

(require "../21/while.rkt" "stack.rkt" "show-stack-state.rkt")

;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(define LEFT "left")
(define RGHT "right")

#; {ArithmeticExpr -> Number}
;; print each step of the calculation that reduces `expr` to a number
(define (driver initial)
  (define-values (*control *stack) (load initial))
  (show-state *control *stack)
  (while (not (final? *control *stack)) do
    (set!-values (*control *stack) (transition *control *stack))
    (show-state *control *stack))
  (unload *control *stack))

#; {ArithmeticExpr Stack -> ArithmeticExpr Stack}
(define (transition control stack)
  (match* { control stack }
    [{ (list (? not-number? ae_1) o (? number? ae_2)) stack }
     (values ae_1 (push stack (list LEFT o ae_2)))]
    [{ (list ae_1 o (? not-number? ae_2)) stack }
     (values ae_2 (push stack (list RGHT o ae_1)))]
    [{ (list (? number? l) o (? number? r)) stack }
     (define n (reduce control))
     (show-return-state n stack)
     (values n stack)]
    [{ (? number? n) (app pop `((,(? (is? LEFT)) ,o ,ae_2) ,S)) }
     (values (list n o ae_2) S)]
    [{ (? number? n) (app pop `((,(? (is? RGHT)) ,o ,ae_1) ,S)) }
     (values (list ae_1 o n) S)]))

#; {ArithmeticExpr -> ArithmeticExpr Stack}
(define (load ae)
  (values ae mt))

#; {ArithmeticExpr Stack -> Number}
(define (unload n _)
  n)

#; {ArithmeticExpr Stack -> Boolean}
(define (final? control stack)
  (and (number? control) (equal? mt stack)))

#; {(list Number o Number) -> Number}
;; okay this is a trick, but almost every language has this trick
(define ns (make-base-namespace))
(define (reduce ae)
  ( (eval (second ae) ns) (first ae) (third ae)))

(define not-number? (compose not number?))

;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(module+ test
  (require rackunit)
  (check-equal? (driver '((1 + 2) * (3 - (84 / 21)))) -3))
The transition function still distinguishes five cases with a match* and returns the two new values: a Control and a K. The key difference concerns the “return state”. Instead of using the recursive traversal function of the CC machine, the CK machine uses just pop to extract the top frame, which it then uses to construct the next control code. As you can see from figure 86, the CK implementation is significantly more concise than the CC implementation.