7.7.0.3

7 — Errors and Ordering

Tuesday, 28 January 2020

Presenters (1) Art Bacon, Daniel Lawrence, (2) S. Wallace, Z. Wofle

This analysis of interpreters is due to Reynolds, dating back to 1972.

Interpreters can play different roles:

Especially in the last two cases, an interpreter should bring across the entire meaning of the object language. Otherwise its users might not understand what is going on or might fail to program properly in the object language (case 1).

Leaving anything implicit is thus dangerous. People may overlook the implicit behavior when they study the interpreter. When implementors use a different language to implement an interpreter or a compiler, they may accidentally change the meaning of programs.

Interpreters may accidentally expose their underlying language in many ways:
  • One potential leakage we encountered concerns the idea of numbers, specifically integers. The interpreter language may supply big integers by default but the users of the implemented language may wish to use only machine integers—or vice versa.

    Lectures/7/int-test.rkt

      #lang racket
       
      (require "../6/rec-as-data.rkt")
      (require "interpreter-basic.rkt")
       
      (module+ test
        (require rackunit)
       
        (define (! n) (if (zero? n) 1 (* n (! (sub1 n)))))
        (check-equal? (value-of prog1) (! 10)))
       

    Figure 26: Which Integers

  • An interpreter may also rely on the underlying language to catch errors. After all, they are just errors.

    Our interpreter returns two kinds of values (and one error message): integers and function values. Either one of these two may show up in the “wrong” place:
    (value-of (decl "f" 10 (call "f" 20)))
     
    (value-of (decl "f" (fun "x" "x") (node + "f" 20)))

    Stop! What would your interpreter produce for these expressions?

    Here is a personal lesson that I had to learn via tons of scarring. When systems work, everybody is happy and nobody cares. When errors show up, people scream. In short, errors matter.

    Note: From a mathematical perspective, explicitly articulating all error situations leads to a well-defined function (in a non-conventional mathematical sense). It might look like our language of function expressions does not have types, but appearances are deceiving here. We will get back to this when we discuss types. From a programming language perspective, this idea is often referred to as type safety.

  • When an object language signals errors and evaluates several sub-expressions in sequence (Stop! Can you think of a programming language that does not signal errors?), we can use the errors to find out the order in which the expressions are evaluated. For example, a function application typically evaluates several arguments, and by placing errors in them we can usually find out the order in which the interpreter language performs the evaluation.

    If we do have two kinds of errors, we can easily figure out in which order the implementation language deals with function arguments:
    (value-of
     (node *
           (decl "f" (fun "x" "x") (node + "f" 20))
           (decl "f" 10 (call "f" 20))))
    Stop! What would your interpreter do here?

    Why could ordering possibly matter?
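
One way to see that the order is observable is to probe the implementation language directly. The following is a minimal sketch, independent of the lecture's interpreter files; the names `first-error`, `fail-left`, and `fail-right` are made up for this illustration. It relies on Racket evaluating the arguments of an application from left to right.

```racket
#lang racket

;; Hypothetical helper: run a thunk and return the message of the
;; first error it raises, or #f if it raises none.
(define (first-error thunk)
  (with-handlers ([exn:fail? exn-message])
    (thunk)
    #f))

;; Two argument expressions that fail with distinguishable messages.
(define (fail-left)  (error 'demo "left argument failed"))
(define (fail-right) (error 'demo "right argument failed"))

;; Both arguments would fail, but only the one evaluated first gets
;; the chance to do so. Racket evaluates arguments left to right.
(define observed
  (first-error (λ () (* (fail-left) (fail-right)))))
```

Here `observed` mentions the left argument. An interpreter that simply delegates to Racket's application inherits this order, whether or not the object language is supposed to have it.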

Let’s repair these leakage problems.

Tests

A programmer who wishes to modify the behavior of some code starts with the formulation of test cases.

Lectures/7/error-tests.rkt

  #lang racket
   
  (require "../6/rec-as-data.rkt")
  (require "interpreter-basic.rkt")
   
  (module+ test
    (require rackunit)
   
    ;; new tests, making sure errors are signaled 
    (define bad-fun-decl (decl "f" 10 (call "f" 20)))
    (check-exn #px"function value" (λ () (value-of bad-fun-decl)))
   
    (define bad-addition (decl "f" (fun "x" "x") (node + "f" 20)))
    (check-exn #px"integer" (λ () (value-of bad-addition))))
   
   
   
   

Figure 27: Run-Time Errors

Figure 27 shows that we wish to receive particular strings when a number shows up in function position or a function value shows up in a numeric operation.
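
For readers unfamiliar with `check-exn`, here is a small stand-alone sketch of the testing pattern. It assumes nothing about the lecture's interpreter; `apply-value` is a hypothetical stand-in that signals an error of the desired shape. The first argument of `check-exn` may be a regular expression, which is matched against the message of the raised exception, and the second argument must be a thunk so that the error occurs inside the check rather than before it.

```racket
#lang racket
(require rackunit)

;; Hypothetical stand-in for an interpreter's function application:
;; it insists on a procedure in function position.
(define (apply-value f arg)
  (if (procedure? f)
      (f arg)
      (error 'apply-value "function value expected, given ~e" f)))

;; The thunk delays the faulty call until check-exn is ready to
;; catch the exception and match its message against the regexp.
(check-exn #px"function value" (λ () (apply-value 10 20)))

;; A well-formed application still produces an ordinary value.
(check-equal? (apply-value add1 20) 21)
```

Matching on a fragment of the message, rather than on the whole string, keeps the test stable if the wording around the fragment changes.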

Lectures/7/order-tests.rkt

  #lang racket
   
  (require "../6/rec-as-data.rkt")
  (require "interpreter-basic.rkt")
   
  (module+ test ;; receive error from right-most expression first:
    (require rackunit)
   
    ;; making sure the order of evaluation is from right to left
    (define fun-err (call 1 20))
    (define one-fun (fun "x" "x"))
    (define ari-err (node + one-fun 1))
   
    (check-exn #px"closure" (λ () (value-of (node + ari-err fun-err))))
    (check-exn #px"closure" (λ () (value-of (call ari-err fun-err)))))
   

Figure 28: Which Order

Figure 28 displays two unit tests that ensure the interpreter evaluates sub-expressions from right to left for both primitive arithmetic operations and function calls.

Errors

We now eliminate these leakage problems, starting with errors.

Recall that our interpreter may encounter three kinds of errors:
  • trying to use a function value in an arithmetic operation

  • trying to apply a number in function position

  • trying to reference an undeclared variable

It already checks for the presence of undeclared variables.

Lectures/7/interpreter-errors.rkt

  #lang racket
   
  ;; an interpreter that explicitly checks for error conditions but 
  ;; inherits order of evaluation from the implementation language 
   
  (require "../6/environment.rkt")
  (require "../6/rec-as-data.rkt")
  (require "../4/possible-values.rkt")
   
  #; {Value = Number || (function-value parameter FExpr Env)}
   
  (define UNDECLARED  "undeclared variable ~e")
  (define ARITHMETIC  "integer expected, given ~e ")
  (define CLOSURE     "closure expected, given ~e ")
   
  #; {FExpr -> Value}
  ;; determine the value of ae via a substitution semantics 
  (define (value-of ae0)
   
    #; {FExpr Env -> Value}
    ;; ACCUMULATOR env tracks all declarations between ae and ae0
    (define (value-of ae env)
      (match ae
        [(? integer?)
         ae]
        [(node o a1 a2)
         (o (number> (value-of a1 env)) (number> (value-of a2 env)))]
        [(decl x a1 a2)
         (value-of a2 (add-rec x (λ (env) (value-of a1 env)) env))]
        [(? string?)
         (if (defined? ae env)
             (lookup ae env)
             (error 'value-of UNDECLARED ae))]
        [(call ae1 ae2)
         (fun-apply (function> (value-of ae1 env)) (value-of ae2 env))]
        [(fun para body)
         (function-value para body env)]
        [(if-0 t thn els)
         (define test-value (value-of t env))
         (if (and (number? test-value) (= test-value 0))
             (value-of thn env)
             (value-of els env))]))
   
    #; {Value Value -> Value}
    (define (fun-apply function-representation argument-value)
      (match function-representation
        [(function-value fpara fbody env)
         (value-of fbody (add fpara argument-value env))]))
   
    (value-of ae0 empty))
   
  #; {Any -> Number}
  (define (number> x)
    (if (integer? x)
        x
        (error 'interpreter ARITHMETIC x)))
   
  #; {Any -> Function}
  (define (function> x)
    (if (function-value? x)
        x
        (error 'interpreter CLOSURE x)))
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   
  (provide value-of)
   

Figure 29: An Interpreter with Explicit Error Situation Checks

The revised interpreter in figure 29 detects these error situations and raises error messages tailored to our model.
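
The difference between a leaked error and a tailored one can be seen in isolation. The sketch below is self-contained and only imitates the `number>` check from figure 29; `message-of` and `a-function` are hypothetical names introduced for the illustration.

```racket
#lang racket

;; Hypothetical helper: run a thunk and return the message of the
;; error it raises, or "no error" if it finishes normally.
(define (message-of thunk)
  (with-handlers ([exn:fail? exn-message])
    (thunk)
    "no error"))

;; A checked coercion in the style of figure 29.
(define (number> x)
  (if (integer? x)
      x
      (error 'interpreter "integer expected, given ~e" x)))

(define a-function (λ (x) x))

;; Unchecked: Racket's own + complains, in Racket's terms.
(define leaked   (message-of (λ () (+ a-function 20))))

;; Checked: the complaint is phrased in terms of the model.
(define tailored (message-of (λ () (+ (number> a-function) 20))))
```

In this sketch `leaked` is whatever Racket's `+` chooses to say about its arguments, while `tailored` starts with `interpreter: integer expected`. Only the second message is under the control of the interpreter's author.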

Ordering

The idea of taking apart expressions (in your interpreter or compiler) and naming each sub-expression is known as rendering your program in A-normal form. The idea is due to Sabry and yours truly.

Where does our language/interpreter rely on the order of evaluation as determined by the implementation language?

To eliminate this dependence, we take apart all expressions and name the pieces. Then we can use an ordinary sequencing construct (e.g. a block) to pick the desired order.
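
The following stand-alone sketch illustrates the idea on a single addition, assuming only plain Racket. The tracing helper `note` and the functions `inherited`, `explicit`, and `order-of` are hypothetical names used for this illustration, not part of the lecture's code.

```racket
#lang racket

;; Hypothetical tracing helper: records a label, then returns v.
(define trace '())
(define (note label v)
  (set! trace (cons label trace))
  v)

;; Inherited order: the argument order of + is whatever Racket
;; picks, which is left to right.
(define (inherited)
  (+ (note 'left 1) (note 'right 2)))

;; Explicit order: name each piece and sequence the definitions so
;; that the right operand is evaluated first.
(define (explicit)
  (define right (note 'right 2))
  (define left  (note 'left 1))
  (+ left right))

;; Run a thunk and report the order in which the pieces were seen.
(define (order-of thunk)
  (set! trace '())
  (thunk)
  (reverse trace))
```

Calling `(order-of inherited)` reports the left operand first, while `(order-of explicit)` reports the right operand first. Both functions compute the same sum; only the second one states its evaluation order in its own text.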

Lectures/7/interpreter-order.rkt

  #lang racket
   
  ;; an interpreter that explicitly orders the evaluation of
  ;; every subexpression where the ordering wasn't specified
  ;; RIGHT to LEFT 
   
  (require "../6/environment.rkt")
  (require "../6/rec-as-data.rkt")
  (require "../4/possible-values.rkt")
   
  #; {Value = Number || (function-value parameter FExpr Env)}
   
  #; {FExpr -> Value}
  ;; determine the value of ae via a substitution semantics 
  (define (value-of ae0)
   
    #; {FExpr Env -> Value}
    ;; ACCUMULATOR env tracks all declarations between ae and ae0
    (define (value-of ae env)
      (match ae
        [(? integer?)
         ae]
        [(node o a1 a2)
         (define right (number> (value-of a2 env)))
         (define left  (number> (value-of a1 env)))
         (o left right)]
        [(decl x a1 a2)
         (value-of a2 (add-rec x (λ (env) (value-of a1 env)) env))]
        [(? string?)
         (if (defined? ae env)
             (lookup ae env)
             (error 'value-of "undeclared variable ~e" ae))]
        [(call ae1 ae2)
         (define right (value-of ae2 env))
         (define left  (function> (value-of ae1 env)))
         (fun-apply left right)]
        [(fun para body)
         (function-value para body env)]
        [(if-0 t thn els)
         (define test-value (value-of t env))
         (if (and (number? test-value) (= test-value 0))
             (value-of thn env)
             (value-of els env))]))
   
    #; {Value Value -> Value}
    (define (fun-apply function-representation argument-value)
      (match function-representation
        [(function-value fpara fbody env)
         (value-of fbody (add fpara argument-value env))]))
   
    (value-of ae0 empty))
   
  #; {Any -> Number}
  (define (number> x)
    (if (integer? x)
        x
        (error 'interpreter "integer expected, given ~e " x)))
   
  #; {Any -> Function}
  (define (function> x)
    (if (function-value? x)
        x
        (error 'interpreter "closure expected, given ~e " x)))
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   
  (provide value-of)
   

Figure 30: An Interpreter with Explicitly Ordered Evaluations

Figure 30 shows the result of this transformation. This interpreter now passes all the old tests and the new ones.

Machine Limitations

When we design model interpreters such as ours, we ignore hardware limitations. In our case this choice has two consequences. First, we act as if our language could calculate with all integers, not just those that fit into our machine’s memory. Second, we act as if recursive function calls had an infinitely deep stack; in other words, if a recursive computation needs a deep stack to finish, so be it.
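
The first consequence can be made concrete with a small experiment. The sketch below assumes, purely for illustration, an object language whose integers are signed 64-bit machine words that wrap around on overflow. The helper `wrap64` and the function `!64` are hypothetical and do not appear in the lecture's code.

```racket
#lang racket

;; Hypothetical helper: reinterpret n as a signed 64-bit integer,
;; the way wrap-around machine arithmetic would.
(define (wrap64 n)
  (define m (modulo n (expt 2 64)))
  (if (>= m (expt 2 63))
      (- m (expt 2 64))
      m))

;; Factorial over Racket's unbounded integers.
(define (! n)
  (if (zero? n) 1 (* n (! (sub1 n)))))

;; Factorial where every product is wrapped to 64 bits.
(define (!64 n)
  (if (zero? n) 1 (wrap64 (* n (!64 (sub1 n))))))

;; The two notions of integer agree while results fit into 64 bits
;; and part ways as soon as they do not.
(define agree-at-20?    (= (! 20) (!64 20)))
(define disagree-at-21? (not (= (! 21) (!64 21))))
```

Both flags are true: 20! still fits into a signed 64-bit word, 21! does not. A test such as the one in figure 26 therefore passes under either notion of integer for small inputs and distinguishes the two only for larger ones.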