12 — The Truth

Friday, 14 February 2020

Presenters (1) Nola Chen, Spencer Pozder (2) William Epstein, Lucas McCanna

Type Soundness

For a programmer, the truth is what the program does, that is, what it computes. Now that you understand some basics of PL, you know that this means "what the interpreter does." And you also know about the idea that the interpreter has a dual existence: as a real piece of software that manipulates bits and as a mathematical function, a Platonic idea.

The idea of type soundness is due to Robin Milner.

In this spirit, we can state two major mathematical theorems, and they apply to the real world.

Theorem The interpretation of every TypedISL expression terminates.

Stop! What does this mean? Didn’t we say that this language of functions and applications is as powerful as any language? This theorem has major implications for the working developer.

Theorem (v1) The interpretation of every TypedISL expression of type (int)
  • evaluates to an integer

Stop! Does this mean we never have to check whether the result of a sub-expression of tnode is an integer? We can always apply + and * without fear? Does this also mean that in [tcall f a] the expression f always evaluates to a closure?

The common proof of type soundness is due to yours truly in collaboration with Wright.

Indeed, the known ways to prove this theorem show that all expressions of type [-> t s] for types t and s evaluate to a closure.

Hence, this theorem has major implications for the working programming language implementer.

Primitives: A Type Prelude, Partiality

Our language design, especially the one for your homework, treats functions and built-in primitive operations uniformly.

Lectures/12/example-env0.rkt

  #lang racket
   
  (require "rec-type-check.rkt")
  (require "isl-as-data.rkt")
  (require "../define-names.rkt")
   
  (define-names f x)
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - -
  (define +-example
    (tcall (tcall "+" 1) 1))
   
  (type-check +-example)
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - -
  (define ^-example
    (tcall (tcall "^" 1) -1))
   
  (type-check ^-example)
   
   

Figure 52: Typing Primitives as Functions

So, while we may wish to write + expressions in infix notation, a well-trained developer should understand that + is a function and can be applied to arguments just like any other function. The examples in figure 52 show what this may look like.
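To make the point concrete in plain Racket, here is a minimal sketch of primitives packaged as curried one-argument functions, which is the shape that the nested tcall expressions in figure 52 rely on. The names plus, times, and add-ten are made up for this illustration and are not part of the lecture's code.

```racket
#lang racket

;; hypothetical names: `plus` and `times` wrap Racket's + and * so that
;; each consumes one argument at a time, like any one-argument function
(define plus  (λ (x) (λ (y) (+ x y))))
(define times (λ (x) (λ (y) (* x y))))

;; mirrors (tcall (tcall "+" 1) 1): apply to 1, then apply the result to 1
(define two ((plus 1) 1))

;; a partially applied primitive is an ordinary function value
(define add-ten (plus 10))
(define fifteen (add-ten 5))
(define twelve ((times 3) 4))
```

Because every primitive takes exactly one argument, the call clause of a type checker or interpreter does not need a separate case for built-in operations.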

Lectures/12/type-env0.rkt

  #lang racket
   
  (require "../11/types.rkt")
  (require "../6/environment.rkt")
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - -
  (define population
    (list (list "+" (-> (int) (-> (int) (int))))
          (list "*" (-> (int) (-> (int) (int))))
          (list "^" (-> (int) (-> (int) (int))))))
   
  (define TYPE-ENV0
    (for/fold ([env empty]) ([name-type population])
      (add (first name-type) (second name-type) env)))
  ;; - - - - - - - - - - - - - - - - - - - - - - - - -
   
  (provide TYPE-ENV0)
   

Figure 53: A Base Type Environment

They also indicate that we should change the type checker so that it uses a pre-populated type environment. Naturally the environment should describe the run-time environment properly. See figure 53 for how we accomplish this here.
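As a rough illustration of how a curried type from figure 53 is consumed one argument at a time, consider the following self-contained sketch. The structures int-type and arrow-type and the function apply-type are hypothetical stand-ins; the course's own (int) and (-> t s) representations live in files that are not shown here.

```racket
#lang racket

;; hypothetical stand-ins for the course's (int) and (-> t s) types
(struct int-type () #:transparent)
(struct arrow-type (domain range) #:transparent)

;; the type that figure 53 assigns to "+", "*", and "^"
(define BINARY-INT
  (arrow-type (int-type) (arrow-type (int-type) (int-type))))

;; result type of applying a function of type tf to an argument of type ta
(define (apply-type tf ta)
  (cond
    [(not (arrow-type? tf))
     (error 'tc "function type expected")]
    [(not (equal? (arrow-type-domain tf) ta))
     (error 'tc "domain type doesn't match arg type")]
    [else (arrow-type-range tf)]))

;; (tcall "+" 1) has type (-> (int) (int)) ...
(define after-one-argument (apply-type BINARY-INT (int-type)))
;; ... and (tcall (tcall "+" 1) 1) has type (int)
(define after-two-arguments (apply-type after-one-argument (int-type)))
```

Applying the result a third time would signal "function type expected", because an (int) is not a function type.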

Note that population in figure 53 includes a type for the exponentiation operator from 4 — Interpreting Functions. Although we know that the operator cannot accept negative exponent arguments, our type system cannot express this idea. So population uses (int) as the type for the second argument.

Stop! But now bad things can happen at run-time, no?

If the type system cannot express an idea, our type-soundness theorem must state the exception and our interpreter must deal with it.

Theorem (v2) The interpretation of every TypedISL expression of type (int) in a run-time base environment that agrees with the type-phase base environment (including exponentiation)
  • evaluates to an integer

  • signals the error
    "integer exponentiation is defined for non-negative exponents only"

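Take a look at the two additions in this theorem: the premise about a run-time base environment that agrees with the type-phase base environment, and the clause about signaling an error. These additions to the previous theorem are necessary to bring the type system closer to the ones you know from working in soundly typed languages such as C#, Java, or Rust.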
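The two possible outcomes can be illustrated with a small sketch in plain Racket. The name power is hypothetical, and the sketch only assumes that the primitive reports the error message listed in the theorem.

```racket
#lang racket

(define EXPT-ERROR
  "integer exponentiation is defined for non-negative exponents only")

;; hypothetical name: a curried exponentiation primitive of type
;; (-> (int) (-> (int) (int))) that is partial in its second argument
(define (power base)
  (λ (exponent)
    (if (>= exponent 0)
        (expt base exponent)
        (error 'interpreter EXPT-ERROR))))

;; first outcome: the application evaluates to an integer
(define ok ((power 2) 10))

;; second outcome: the application signals the listed error;
;; the handler turns the exception into its message for inspection
(define bad
  (with-handlers ([exn:fail? exn-message])
    ((power 1) -1)))
```

Both applications type check against (-> (int) (-> (int) (int))), since -1 is an integer; only the run-time check can tell them apart.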

Recursive Declarations

Let’s restore some basic power to our programming language: the ability to go into an infinite loop. This may not sound like some truly useful power, but the point is, right now we can’t even write down the factorial function because the language lacks recursion.

Here is how we deal with recursive declarations, in terms of rules:

  

   TEnv + (f,t) |- rhs : t,   TEnv + (f,t) |- bdy : s

  ----------------------------------------------------

              TEnv |- [decl f t rhs bdy] : s

  

In words, the rule says that a decl is type checked in three steps:
  • the right-hand side (rhs) of the declaration is checked in an environment that already contains the type for the newly declared name

  • if the typing rules validate that the right-hand side has a type, it must be the same type as the one that is declared

  • the body (bdy) of the declaration is also checked in an environment that already contains the type for the newly declared name

Lectures/12/example-rec-fun.rkt

  #lang racket
   
  (require "rec-type-check.rkt")
  (require "isl-as-data.rkt")
  (require "../11/types.rkt")
  (require "../define-names.rkt")
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - -
  (define-names f x _)
   
  (define rec-example
    (tdecl f (-> (int) (int))
           (tfun* x (int) (tcall f x))
           (tcall f 42)))
   
  (type-check rec-example)
   
  (define rec-example-2
    (tdecl f (-> (int) (int))
           (tcall (tfun* _ (int) (tfun* x (int) (tcall f x)))
                  (tcall f 0)) ;; <<< call f before eval'ed
           (tcall f 42)))
   
  (type-check rec-example-2)
   

Figure 54: Recursive Functions
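For readers who want to see the declaration rule at work, here is a sketch of a derivation for rec-example from figure 54. It assumes the usual rules for variables, functions, and applications, read off from the corresponding clauses of the type checker in figure 55; the abbreviations \tau and \Gamma are introduced only for this sketch.

```latex
% Abbreviations (assumed for this sketch):
%   \tau   stands for the declared type (-> (int) (int))
%   \Gamma stands for TEnv0 + (f, \tau)
\begin{align*}
\Gamma + (x, \mathtt{int}) &\vdash f : \tau
  && \text{lookup} \\
\Gamma + (x, \mathtt{int}) &\vdash x : \mathtt{int}
  && \text{lookup} \\
\Gamma + (x, \mathtt{int}) &\vdash [\mathtt{tcall}\ f\ x] : \mathtt{int}
  && \text{application} \\
\Gamma &\vdash [\mathtt{tfun*}\ x\ \mathtt{int}\ [\mathtt{tcall}\ f\ x]] : \tau
  && \text{function; agrees with the declared type} \\
\Gamma &\vdash [\mathtt{tcall}\ f\ 42] : \mathtt{int}
  && \text{application} \\
\mathit{TEnv}_0 &\vdash [\mathtt{tdecl}\ f\ \tau\ \ldots] : \mathtt{int}
  && \text{declaration rule}
\end{align*}
```

The first line is the interesting one: the lookup of f succeeds inside its own right-hand side only because the rule checks rhs in the environment that already contains (f, \tau).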

Of course, adding a feature to the language may have implications for how we state the type-soundness theorem. The addition of recursive functions means that the interpretation of a program may not even produce a final value. And that means we need to add one more clause, the final one, to the theorem statement.

Theorem (v3) The interpretation of every TypedISL expression, including tdecl expressions, of type (int) in a run-time base environment that agrees with the type-phase base environment
  • evaluates to an integer

  • signals the error
    "integer exponentiation is defined for non-negative exponents only"

  • goes into an infinite loop

The Revised Type Checker

Figure 55 shows how to translate this rule into code for the revised and extended type checker.

Lectures/12/rec-type-check.rkt

  #lang racket
   
  (require "type-env0.rkt")
  (require "isl-as-data.rkt")
  (require "../11/types.rkt")
  (require "../6/environment.rkt")
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - -
   
  (define UNDECLARED "undeclared variable")
  (define ARITHMETIC "bad types for prim op")
  (define FUNCTION   "function type expected")
  (define DOMAIN     "domain type doesn't match arg type")
  (define DECL       "type of a declaration doesn't match")
  (define IF-TEST    "int expected in if test position")
  (define IF-BRANCH  "same types expected in branches of if")
   
  #; {TypedISL -> Type}
  (define (type-check isl0)
   
    #; {TypedISL TEnv -> Type}
    (define (type-check/accu isl env)
      (match isl
        [(? string?)   (if (defined? isl env)
                           (lookup isl env)
                           (error 'tc UNDECLARED))]
        [(? integer?)  (int)]
        [(tfun* x t b) (define env+ (add x t env))
                       (define tbdy (type-check/accu b env+))
                       (-> t tbdy)]
        [(tcall f a)   (define tf (type-check/accu f env))
                       (define ta (type-check/accu a env))
                       (cond
                         [(not (->? tf))
                          (error 'tc FUNCTION)]
                         [(not (type=? (->-domain tf) ta))
                          (error 'tc DOMAIN)]
                         [else (->-range tf)])]
   
        ;; - - - - - - - - - - - - - - - - - - - - - - - -
        ;; RECURSION 
        [(tdecl f type rhs body)
         (define env+ (add f type env))
         (define trhs (type-check/accu rhs env+))
         (if (type=? trhs type)
             ;; specification ___agrees___ with the computed type:
             (type-check/accu body env+)
             ;; specification ___disagrees___ with the computed type:
             (error 'tc DECL))]
   
        ;; - - - - - - - - - - - - - - - - - - - - - - - -
        ;; HOMEWORK
        [(tif-0 tst thn els)
         (define ttst (type-check/accu tst env))
         (cond
           [(not (int? ttst)) (error 'tc IF-TEST)]
           [else (define tthn (type-check/accu thn env))
                 (define tels (type-check/accu els env))
                 (if (type=? tthn tels)
                     tthn
                     (error 'tc IF-BRANCH))])]))
   
    (type-check/accu isl0 TYPE-ENV0))
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - -
  (provide type-check)
   

Figure 55: Recursion

The Revised Interpreter

What does it mean when the theorems above list the premise “a run-time base environment that agrees with the type-phase base environment”? Let’s call the type-phase base environment TEnv0 and the run-time base environment ENV0. Both associate variables with Racket values: TEnv0 with types, and ENV0 with integers and function-values. Here is what the premise means:
  • if TEnv0 associates variable x with a type T, then ENV0 associates x with a value V, and vice versa.

  • if T is (int), then V is an integer.

  • if T is (-> s u), then V is a primop that is defined on all values of type s or signals an error E. Every such error E must be listed in the soundness theorem.
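The first two conditions, and the shallow part of the third, can be checked mechanically. The following sketch does so for association lists in the style of the population definitions. The structures int-type and arrow-type and the function agree? are hypothetical names for this illustration. The requirement that a primop is defined on all values of its domain type, or signals a listed error, is a proof obligation that this check does not cover.

```racket
#lang racket

;; hypothetical stand-ins for the course's (int) and (-> t s) types
(struct int-type () #:transparent)
(struct arrow-type (domain range) #:transparent)

(define BINARY-INT
  (arrow-type (int-type) (arrow-type (int-type) (int-type))))

;; association lists in the style of the `population` definitions
(define type-population
  (list (list "+" BINARY-INT) (list "*" BINARY-INT)))
(define value-population
  (list (list "+" (λ (x) (λ (y) (+ x y))))
        (list "*" (λ (x) (λ (y) (* x y))))))

;; shallow agreement: same names on both sides, integers for (int),
;; procedures for arrow types
(define (agree? types values)
  (and (equal? (sort (map first types) string<?)
               (sort (map first values) string<?))
       (for/and ([name-type types])
         (let ([value (second (assoc (first name-type) values))])
           (match (second name-type)
             [(int-type)       (exact-integer? value)]
             [(arrow-type _ _) (procedure? value)])))))

(define compatible (agree? type-population value-population))
;; dropping "+" from the run-time side breaks the "vice versa" condition
(define incompatible (agree? type-population (rest value-population)))
```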

Figure 56 displays a run-time environment that is compatible with the type-phase environment of figure 53.

Lectures/12/env0.rkt

  #lang racket
   
  (require "../6/environment.rkt")
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - -
  (define (^ y x)
    (if (and (exact-integer? x) (>= x 0))
        (expt y x)
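        ;; signal the one error that the soundness theorem lists
        (error 'interpreter
               "integer exponentiation is defined for non-negative exponents only")))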
   
  (define population
    (list (list "+" (λ (x) (λ (y) (+ x y))))
          (list "*" (curry *)) 
          (list "^" (curry ^))))
   
  (define ENV0
    (for/fold ([env empty]) ([name-value population])
      (add (first name-value) (second name-value) env)))
  ;; - - - - - - - - - - - - - - - - - - - - - - - - -
   
  (provide ENV0)
   

Figure 56: A Runtime Base Environment, Compatible with Type Base Environment

Stop! Why are the two environments compatible? What is the only error E that one of these primitive operations may signal?

Stop again! Add a function-value to ENV0 that consumes a function on integers and an integer. It applies the function to the given integer and adds 1 to the result. Now develop a compatible type.

And now we can revise the interpreter so that it expresses what the type soundness theorem implies. Figure 57 shows an interpreter without any run-time checks in the inner “loop.” Indeed, with the node clause removed completely, the only check that might remain is the one for “functionness” in the call clause. But the type system guarantees that we always get a value that we can apply, and therefore this check is gone, too.

Lectures/12/interpreter.rkt

  #lang racket
   
  ;; an interpreter that does not check any error conditions
  ;; at run-time
   
  (require "rec-type-check.rkt")
  (require "env0.rkt")
  (require "strip-types.rkt")
  (require "../6/environment.rkt")
  (require "../6/rec-as-data.rkt")
  (require "../4/possible-values.rkt")
   
  #; {Value = Number || (function-value parameter FExpr Env)}
   
  #; {FExpr -> Value}
  ;; determine the value of ae0 via an environment-based semantics
  (define (interpreter ae0)
   
    #; {FExpr Env -> Value}
    ;; ACCUMULATOR env tracks all declarations between ae and ae0
    (define (i/accu ae env)
      (match ae
        [(? integer?)
         ae]
        [(decl x a1 a2)
         (i/accu a2 (add-rec x (λ (env) (i/accu a1 env)) env))]
        [(? string?) (lookup ae env)]
        [(call ae1 ae2)
         (fun-apply (i/accu ae1 env) (i/accu ae2 env))]
        [(fun para body)
         (function-value para body env)]
        [(if-0 t thn els)
         (define test-value (i/accu t env))
         (if (= test-value 0) (i/accu thn env) (i/accu els env))]))
   
    #; {Value Value -> Value}
    (define (fun-apply function-representation argument-value)
      (match function-representation
        [(function-value fpara fbody env)
         (i/accu fbody (add fpara argument-value env))]
        [op (op argument-value)]))
   
    ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    (define untyped-ae0
      (with-handlers ([exn:fail? exn-message])
        (define ___typed (type-check ae0))
        (strip-types ae0)))
   
    (if (string? untyped-ae0) untyped-ae0 (i/accu untyped-ae0 ENV0)))
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   
  (provide interpreter)
   

Figure 57: An Interpreter for a Soundly Typed Language

Figure 56 also benefits from the absence of any checks in the primitive functions, except for the one that articulates the partial nature of our exponentiation operator, which is precisely the only error listed in the theorem.

Stripping Types

The interpreter works on a type-free abstract syntax tree. Figure 58 displays the function that strips types from the source representation.

Lectures/12/strip-types.rkt

  #lang racket
   
  (require "isl-as-data.rkt")
  (require "../6/rec-as-data.rkt")
   
  #; {TISL -> RecExpr}
  (define (strip-types isl)
    (match isl
      [(? string?)   isl]
      [(? integer?)  isl]
      [(tfun* x t b)
       (fun x (strip-types b))]
      [(tcall f a)   
       (call (strip-types f) (strip-types a))]
      [(tdecl f t rhs b)
       (decl f (strip-types rhs) (strip-types b))]
      [(tif-0 tst thn els)
       (if-0 (strip-types tst)
             (strip-types thn)
             (strip-types els))]))
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   
  (provide strip-types)
   

Figure 58: Translating from a Typed to an Untyped Language

In the mid-1990s, Harper, Lee, and Leroy launched a systematic research program to show that the compiler can rely on and exploit logical types if they are preserved across compilation phases. Northeastern’s very own Prof. Ahmed works in this tradition.

The strip-types function represents another deep programming-language idea:

The type-checking phase is completely separate from the interpretation phase.

Or in other words, logical types do not affect the behavior of programs.
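A small sketch can make this tangible. It uses a cut-down strip function and hypothetical stand-ins for the typed and untyped syntax structures (the real ones come from isl-as-data.rkt and rec-as-data.rkt, which are not reproduced here); the quoted symbols merely play the role of types.

```racket
#lang racket

;; hypothetical stand-ins for the typed and untyped function structures
(struct tfun* (para type body) #:transparent)
(struct fun (para body) #:transparent)

;; a cut-down strip function covering variables, integers, and functions
(define (strip isl)
  (match isl
    [(? string?)   isl]
    [(? integer?)  isl]
    [(tfun* x _ b) (fun x (strip b))]))

;; the identity function, annotated at two different types
(define identity-on-ints (tfun* "x" 'int "x"))
(define identity-on-funs (tfun* "x" '(-> int int) "x"))

;; after stripping, the interpreter receives the very same program
(define same-after-stripping?
  (equal? (strip identity-on-ints) (strip identity-on-funs)))
```

Since the interpreter only ever sees the stripped tree, the annotations cannot influence what it computes; they matter to the type checker alone.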