7 — Errors and Ordering
Tuesday, 28 January 2020
Presenters (1) Art Bacon, Daniel Lawrence, (2) S. Wallace, Z. Wofle
This analysis of interpreters is due to Reynolds, dating back to 1972.
Interpreters can play different roles:
They may exist to run programs.
They may play the role of an executable specification so that compiler writers for the language can explore the behavior of programs.
The creator of the interpreter is then an engineer, who wishes to supply a blueprint for implementation workers. The Scheme language, a dialect of Lisp, came with a defining interpreter in several of its standard documents.
They may be a model of a complex programming language that we find in “nature.”
Here the creator is a scientist who is interested in understanding an existing language and putting it into the simple terms of an interpreter. Think of someone who wishes to understand JavaScript via a simple interpreter so that others can analyze security properties of JavaScript programs.
Especially in the last two cases, an interpreter should bring across the entire meaning of the object language. Otherwise its users might not understand what is going on or might fail to program properly in the object language (case 1).
Leaving anything implicit is thus dangerous. People may overlook the implicit behavior when they study the interpreter. When implementors use a different language to implement an interpreter or a compiler, they may accidentally change the meaning of programs.
One potential leakage we encountered concerns the idea of numbers, specifically integers. The interpreter language may supply big integers by default but the users of the implemented language may wish to use only machine integers, or vice versa.
Lectures/7/int-test.rkt
#lang racket
(require "../6/rec-as-data.rkt")
(require "interpreter-basic.rkt")
(module+ test
  (require rackunit)
  (define (! n) (if (zero? n) 1 (* n (! (sub1 n)))))
  (check-equal? (value-of prog1) (! 10)))
An interpreter may also rely on the underlying language to catch errors. After all, it’s just errors.
Our interpreter returns two kinds of values (and one error message): integers and function values. Either one of these two may show up in the “wrong” place:
(value-of (decl "f" 10 (call "f" 20)))
(value-of (decl "f" (fun "x" "x") (node + "f" 20)))
Stop! What would your interpreter produce for these expressions?
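One way to explore the question is to watch what an unchecked interpreter reports. The following is a minimal sketch, not the lecture's code: the structs node and fun are hypothetical stand-ins for the definitions in rec-as-data.rkt, and naive-value-of covers only integers, functions, and arithmetic nodes. It hands a function value directly to Racket's +, so the error message that comes back is phrased by Racket rather than by our model.

```racket
#lang racket
;; Hypothetical stand-ins for the lecture's data representation;
;; the real definitions live in rec-as-data.rkt and may differ.
(struct node (op left right))
(struct fun (para body))

;; A deliberately naive evaluator for a fragment of the language:
;; it passes whatever it computes to the Racket primitive, unchecked.
(define (naive-value-of e)
  (match e
    [(? integer?) e]
    [(fun _ _) e] ; in this sketch a function evaluates to itself
    [(node o l r) (o (naive-value-of l) (naive-value-of r))]))

;; Evaluate e and return the message of the exception it raises.
(define (error-message-of e)
  (with-handlers ([exn:fail? exn-message])
    (naive-value-of e)
    "no error"))

;; Adding a function to a number: the complaint comes from Racket's +.
(define leaked (error-message-of (node + (fun "x" "x") 20)))
```

Inspecting leaked shows a contract violation reported by +, which says nothing about integers or function values of the object language. That is the leak the rest of this lecture repairs.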
Here is a personal lesson that I had to learn via tons of scarring. When systems work, everybody is just happy and nobody cares. When errors show up, people scream. In short, errors matter.
Note: From a mathematical perspective, explicitly articulating all error situations leads to a well-defined function (in a non-conventional mathematical sense). It might look like our language of function expressions does not have types, but appearances are deceiving here. We will get back to this when we discuss types. From a programming language perspective, this idea is often referred to as type safety.
When an object language signals errors and evaluates sequences of expressions, we can use the errors to find out the order in which the expressions are evaluated.
Stop! Can you think of a programming language that does not signal errors?
For example, function applications typically evaluate several arguments, and with errors we can typically find out the order in which the interpreter language performs evaluation. If we have two kinds of errors, we can easily figure out in which order the implementation language deals with function arguments.
Stop! What would your interpreter do for an application whose two sub-expressions signal different errors? Why could ordering possibly matter?
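The probing technique can be tried on the implementation language itself. The sketch below is an illustration under assumed names (first-error, left-operand, right-operand are all hypothetical): each operand of an addition raises a distinguishable error, and whichever message surfaces tells us which operand Racket evaluated first.

```racket
#lang racket
;; Run a thunk and report the message of the first error it raises.
(define (first-error thunk)
  (with-handlers ([exn:fail? exn-message])
    (thunk)
    "no error"))

;; Two operands that fail with distinguishable messages.
(define (left-operand)  (error 'left "evaluated first"))
(define (right-operand) (error 'right "evaluated first"))

;; Only the operand that is evaluated first gets to raise its error.
(define observed
  (first-error (λ () (+ (left-operand) (right-operand)))))
```

Racket documents left-to-right evaluation of the sub-expressions of an application, so observed carries the message of the left operand. An interpreter that simply writes (o (value-of a1 env) (value-of a2 env)) inherits exactly this order, whether or not the object language is meant to have it.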
Let’s repair these leakage problems.
Tests
A programmer who wishes to modify the behavior of some code starts with the formulation of test cases.
Lectures/7/error-tests.rkt
#lang racket
(require "../6/rec-as-data.rkt")
(require "interpreter-basic.rkt")
(module+ test
  (require rackunit)
  ;; new tests, making sure errors are signaled
  (define bad-fun-decl (decl "f" 10 (call "f" 20)))
  ;; the pattern matches the "closure expected" message of figure 29
  (check-exn #px"closure" (λ () (value-of bad-fun-decl)))
  (define bad-addition (decl "f" (fun "x" "x") (node + "f" 20)))
  (check-exn #px"integer" (λ () (value-of bad-addition))))
Figure 27 shows that we wish to receive particular strings when a number shows up in function position or a function value shows up in a numeric operation.
Lectures/7/order-tests.rkt
#lang racket
(require "../6/rec-as-data.rkt")
(require "interpreter-basic.rkt")
(module+ test
  ;; receive error from right-most expression first:
  (require rackunit)
  ;; making sure the order of evaluation is from right to left
  (define fun-err (call 1 20))
  (define one-fun (fun "x" "x"))
  (define ari-err (node + one-fun 1))
  (check-exn #px"closure" (λ () (value-of (node + ari-err fun-err))))
  (check-exn #px"closure" (λ () (value-of (call ari-err fun-err)))))
Figure 28 displays two unit tests that ensure the interpreter evaluates sub-expressions from right to left for both primitive arithmetic operations and function calls.
Errors
We eliminate these leakage problems:
trying to add a function to numbers
trying to apply a number in function position
trying to reference an undeclared variable
Lectures/7/interpreter-errors.rkt
#lang racket
;; an interpreter that explicitly checks for error conditions but
;; inherits order of evaluation from the implementation language
(require "../6/environment.rkt")
(require "../6/rec-as-data.rkt")
(require "../4/possible-values.rkt")
#; {Value = Number || (function-value parameter FExpr Env)}
(define UNDECLARED "undeclared variable ~e")
(define ARITHMETIC "integer expected, given ~e ")
(define CLOSURE    "closure expected, given ~e ")
#; {FExpr -> Value}
;; determine the value of ae via a substitution semantics
(define (value-of ae0)
  #; {FExpr Env -> Value}
  ;; ACCUMULATOR env tracks all declarations between ae and ae0
  (define (value-of ae env)
    (match ae
      [(? integer?) ae]
      [(node o a1 a2)
       (o (number> (value-of a1 env)) (number> (value-of a2 env)))]
      [(decl x a1 a2)
       (value-of a2 (add-rec x (λ (env) (value-of a1 env)) env))]
      [(? string?)
       (if (defined? ae env)
           (lookup ae env)
           (error 'value-of UNDECLARED ae))]
      [(call ae1 ae2)
       (fun-apply (function> (value-of ae1 env)) (value-of ae2 env))]
      [(fun para body)
       (function-value para body env)]
      [(if-0 t thn els)
       (define test-value (value-of t env))
       (if (and (number? test-value) (= test-value 0))
           (value-of thn env)
           (value-of els env))]))
  #; {Value Value -> Value}
  (define (fun-apply function-representation argument-value)
    (match function-representation
      [(function-value fpara fbody env)
       (value-of fbody (add fpara argument-value env))]))
  (value-of ae0 empty))
#; {Any -> Number}
(define (number> x)
  (if (integer? x)
      x
      (error 'interpreter ARITHMETIC x)))
#; {Any -> Function}
(define (function> x)
  (if (function-value? x)
      x
      (error 'interpreter CLOSURE x)))
;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(provide value-of)
Figure 29: An Interpreter with Explicit Error Situation Checks
The revised interpreter in figure 29 detects these error situations and raises error messages tailored to our model.
Ordering
The idea of taking apart expressions (in your interpreter or compiler) and naming each sub-expression is known as rendering your program in A-normal form. The idea is due to Sabry and yours truly.
Where does our language/interpreter rely on the order of evaluation as determined by the implementation language?
function application
primitive application
To eliminate this dependence, we take apart all expressions and name the pieces. Then we can use an ordinary sequencing construct (e.g. a block) to pick the desired order.
Lectures/7/interpreter-order.rkt
#lang racket
;; an interpreter that explicitly orders the evaluation of
;; every subexpression where the ordering wasn't specified
;; RIGHT to LEFT
(require "../6/environment.rkt")
(require "../6/rec-as-data.rkt")
(require "../4/possible-values.rkt")
#; {Value = Number || (function-value parameter FExpr Env)}
#; {FExpr -> Value}
;; determine the value of ae via a substitution semantics
(define (value-of ae0)
  #; {FExpr Env -> Value}
  ;; ACCUMULATOR env tracks all declarations between ae and ae0
  (define (value-of ae env)
    (match ae
      [(? integer?) ae]
      [(node o a1 a2)
       (define right (number> (value-of a2 env)))
       (define left  (number> (value-of a1 env)))
       (o left right)]
      [(decl x a1 a2)
       (value-of a2 (add-rec x (λ (env) (value-of a1 env)) env))]
      [(? string?)
       (if (defined? ae env)
           (lookup ae env)
           (error 'value-of "undeclared variable ~e" ae))]
      [(call ae1 ae2)
       (define right (value-of ae2 env))
       (define left  (function> (value-of ae1 env)))
       (fun-apply left right)]
      [(fun para body)
       (function-value para body env)]
      [(if-0 t thn els)
       (define test-value (value-of t env))
       (if (and (number? test-value) (= test-value 0))
           (value-of thn env)
           (value-of els env))]))
  #; {Value Value -> Value}
  (define (fun-apply function-representation argument-value)
    (match function-representation
      [(function-value fpara fbody env)
       (value-of fbody (add fpara argument-value env))]))
  (value-of ae0 empty))
#; {Any -> Number}
(define (number> x)
  (if (integer? x)
      x
      (error 'interpreter "integer expected, given ~e " x)))
#; {Any -> Function}
(define (function> x)
  (if (function-value? x)
      x
      (error 'interpreter "closure expected, given ~e " x)))
;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(provide value-of)
Figure 30: An Interpreter That Explicitly Orders Evaluations
Figure 30 shows the result of this transformation. This interpreter now passes all the old tests and the new ones.
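The effect of naming the pieces can also be observed in isolation. The following sketch is an illustration only: note!, implicit, explicit, and order-of are hypothetical names, and a mutable trace stands in for the error messages used in the tests. It contrasts an addition whose operand order is left to Racket with one whose operands are named and sequenced by hand.

```racket
#lang racket
;; A log of the operands that have been evaluated, most recent first.
(define trace '())

;; Record tag in the trace, then return v unchanged.
(define (note! tag v)
  (set! trace (cons tag trace))
  v)

;; Implicit order: the operands are evaluated in whatever order
;; the implementation language (here Racket) chooses.
(define (implicit)
  (+ (note! 'left 1) (note! 'right 2)))

;; Explicit order: name each piece and sequence the definitions,
;; right operand first, as the interpreter of figure 30 does.
(define (explicit)
  (define right (note! 'right 2))
  (define left  (note! 'left 1))
  (+ left right))

;; Run thunk with a fresh trace and return the tags in evaluation order.
(define (order-of thunk)
  (set! trace '())
  (thunk)
  (reverse trace))
```

Comparing (order-of implicit) with (order-of explicit) shows the two orders side by side. Both functions compute the same sum, so only an observable effect such as an error or a log entry distinguishes them.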
Machine Limitations
When we design model interpreters such as ours, we ignore hardware limitations. In our case this choice has two consequences. First, we act as if our language could calculate with all integers, not just those that fit into our machine’s memory. Second, we act as if recursive function calls had an infinitely deep stack; in other words, if a recursive computation needs a deep stack to finish, so be it.
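The first consequence can be made concrete with a small sketch. It assumes, purely for illustration, that the object language were meant to have signed 64-bit integers; wrap64 is a hypothetical helper, not part of the lecture code. Racket's exact integers grow as needed, so a model interpreter that reuses them silently computes values that such a machine could not represent.

```racket
#lang racket
;; Factorial over Racket's exact integers, which grow as needed.
(define (! n) (if (zero? n) 1 (* n (! (sub1 n)))))

;; Hypothetical helper: reduce an exact integer to the value that a
;; signed 64-bit two's-complement register would hold.
(define (wrap64 n)
  (define m (modulo n (expt 2 64)))
  (if (>= m (expt 2 63)) (- m (expt 2 64)) m))

;; 20! still fits into 64 bits, so wrapping leaves it unchanged.
(define fits? (= (wrap64 (! 20)) (! 20)))

;; 21! does not fit, so the wrapped value differs from the true one.
(define overflow? (not (= (wrap64 (! 21)) (! 21))))
```

An interpreter that wanted to model machine integers would have to apply something like wrap64 after every arithmetic operation, or signal an error on overflow. Either choice is a decision about the object language that should be written down rather than inherited.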