12 — The Truth
Friday, 14 February 2020
Presenters (1) Nola Chen, Spencer Pozder (2) William Epstein, Lucas McCanna
Type Soundness
For a programmer, the truth is what the program does. What it computes. Now that you understand some basics of PL, you know that this means "what the interpreter does." And you also know about the idea that the interpreter has a dual existence: as a real piece of software that manipulates bits and a mathematical function, a Platonic idea.
The idea of type soundness is due to Robin Milner.
In this spirit, we can state two major mathematical theorems, and they apply to the real world.
Theorem The interpretation of every TypedISL expression terminates.
Stop! What does this mean? Didn’t we say that this language of functions and applications is as powerful as any language? This theorem has major implications for the working developer.
Theorem If a TypedISL expression has type (int), then its interpretation evaluates to an integer.
Stop! Does this mean we never have to check whether the result of a sub-expression of tnode is an integer? We can always apply + and * without fear? Does this also mean that in [tcall f a] the expression f always evaluates to a closure?
The common proof of type soundness is due to yours truly in collaboration with Wright.
Indeed, the known ways to prove this theorem show that all expressions of type [-> t s] for types t and s evaluate to a closure.
Hence, this theorem has major implications for the working programming language implementer.
Primitives: A Type Prelude, Partiality
Our language design, especially the one for your homework, treats functions and built-in primitive operations uniformly.
Lectures/12/example-env0.rkt
#lang racket

(require "rec-type-check.rkt")
(require "isl-as-data.rkt")
(require "../define-names.rkt")

(define-names f x)

;; - - - - - - - - - - - - - - - - - - - - - - - - -
(define +-example (tcall (tcall "+" 1) 1))
(type-check +-example)

;; - - - - - - - - - - - - - - - - - - - - - - - - -
(define ^-example (tcall (tcall "^" 1) -1))
(type-check ^-example)
So, while we may wish to write + expressions in infix notation, a well-trained developer should understand that + is a function and can be applied to arguments just like any other function. The examples in figure 52 show what this may look like.
Lectures/12/type-env0.rkt
#lang racket

(require "../11/types.rkt")
(require "../6/environment.rkt")

;; - - - - - - - - - - - - - - - - - - - - - - - - -
(define population
  (list (list "+" (-> (int) (-> (int) (int))))
        (list "*" (-> (int) (-> (int) (int))))
        (list "^" (-> (int) (-> (int) (int))))))

(define TYPE-ENV0
  (for/fold ([env empty]) ([name-type population])
    (add (first name-type) (second name-type) env)))

;; - - - - - - - - - - - - - - - - - - - - - - - - -
(provide TYPE-ENV0)
They also indicate that we should change the type checker so that it uses a pre-populated type environment. Naturally the environment should describe the run-time environment properly. See figure 53 for how we accomplish this here.
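To see what a pre-populated type environment buys us, here is a minimal, self-contained sketch. It is not the course's code: the structs int, -> and tcall are hypothetical stand-ins for the definitions in types.rkt and isl-as-data.rkt, the environment is a plain immutable hash rather than the environment module from lecture 6, and tc covers only names, integers, and applications.

```racket
#lang racket

;; ASSUMPTION: hypothetical stand-ins for the course's type and syntax structs.
(struct int () #:transparent)
(struct -> (domain range) #:transparent)
(struct tcall (fun arg) #:transparent)

;; ASSUMPTION: the base type environment as an immutable hash from names to types.
(define TYPE-ENV0
  (hash "+" (-> (int) (-> (int) (int)))
        "*" (-> (int) (-> (int) (int)))
        "^" (-> (int) (-> (int) (int)))))

;; Type-check names, integers, and applications only.
(define (tc e env)
  (match e
    [(? string?)
     (hash-ref env e (λ () (error 'tc "undeclared variable")))]
    [(? integer?) (int)]
    [(tcall f a)
     (define tf (tc f env))
     (define ta (tc a env))
     (cond
       [(not (->? tf)) (error 'tc "function type expected")]
       [(not (equal? (->-domain tf) ta))
        (error 'tc "domain type doesn't match arg type")]
       [else (->-range tf)])]))

;; A full application of + has type (int) ...
(define plus-type (tc (tcall (tcall "+" 1) 1) TYPE-ENV0))
;; ... and a partial application is itself a function on integers.
(define partial-type (tc (tcall "+" 1) TYPE-ENV0))
```

Because "+" is just a name bound in the environment, the checker needs no special clause for arithmetic; the ordinary application rule does all the work.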
Note that population in figure 53 includes a type for
the exponentiation operator from 4 —
Stop! But now bad things can happen at run-time, no?
If the type system cannot express an idea, our type-soundness theorem must state the exception and our interpreter must deal with it.
Theorem If a TypedISL expression has type (int), then its interpretation either
- evaluates to an integer, or
- signals the error "integer exponentiation is defined for non-negative exponents only".
Take a look at the two italicized phrases in this theorem. These additions to the previous theorem are necessary to bring the type system closer to the ones you know from working in soundly typed languages such as C#, Java, or Rust.
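The following sketch illustrates why the error clause is needed: an application of ^ to a negative exponent type-checks at (int), yet no integer comes out. The names EXPT-ERROR and outcome are hypothetical helpers introduced for this illustration only, and ^ is written here in curried form with the error message taken from the theorem.

```racket
#lang racket

;; The message that the soundness theorem lists as the one permitted error.
(define EXPT-ERROR
  "integer exponentiation is defined for non-negative exponents only")

;; A partial primitive: its type (-> (int) (-> (int) (int))) cannot rule out
;; negative exponents, so the primitive itself must check at run time.
(define ((^ base) exponent)
  (if (>= exponent 0)
      (expt base exponent)
      (error 'interpreter EXPT-ERROR)))

;; ASSUMPTION: hypothetical helper that runs a thunk and returns either its
;; value or the message of the error it signals.
(define (outcome thunk)
  (with-handlers ([exn:fail? exn-message])
    (thunk)))

(define ok  (outcome (λ () ((^ 2) 10))))  ; an integer
(define bad (outcome (λ () ((^ 1) -1))))  ; an error message (a string)
```

Both applications have type (int) according to the base type environment; only the first produces an integer, and the second lands in the error clause of the theorem.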
Recursive Declarations
Let’s restore some basic power to our programming language: the ability to go into an infinite loop. This may not sound like some truly useful power, but the point is, right now we can’t even write down the factorial function because the language lacks recursion.
  TEnv + (f,t) |- rhs : t      TEnv + (f,t) |- bdy : s
  ----------------------------------------------------
            TEnv |- [decl f t rhs bdy] : s
- the right-hand side (rhs) of the declaration is checked in an environment that already contains the type for the newly declared name
- if the typing rules validate that the right-hand side has a type, it must be the same type as the one that is declared
- the body (bdy) of the declaration is also checked in an environment that already contains the type for the newly declared name
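As a worked instance of the rule, take a declaration of f at type (-> (int) (int)) whose right-hand side is a function that calls f on its parameter and whose body is the call (f 42). The notation TEnv' for the extended environment is introduced here only for brevity.

```latex
\[
\frac{
  \mathit{TEnv}' \vdash [\mathit{fun}\ x\ \mathit{int}\ (f\ x)] : \mathit{int} \to \mathit{int}
  \qquad
  \mathit{TEnv}' \vdash (f\ 42) : \mathit{int}
}{
  \mathit{TEnv} \vdash [\mathit{decl}\ f\ (\mathit{int} \to \mathit{int})\ [\mathit{fun}\ x\ \mathit{int}\ (f\ x)]\ (f\ 42)] : \mathit{int}
}
\qquad \text{where } \mathit{TEnv}' = \mathit{TEnv} + (f,\ \mathit{int} \to \mathit{int})
\]
```

The first premise holds only because TEnv' already contains f: inside the function body, f has type int -> int and x has type int, so (f x) has type int.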
Lectures/12/example-rec-fun.rkt
#lang racket

(require "rec-type-check.rkt")
(require "isl-as-data.rkt")
(require "../11/types.rkt")
(require "../define-names.rkt")

;; - - - - - - - - - - - - - - - - - - - - - - - - -
(define-names f x _)

(define rec-example
  (tdecl f (-> (int) (int))
         (tfun* x (int) (tcall f x))
         (tcall f 42)))

(type-check rec-example)

(define rec-example-2
  (tdecl f (-> (int) (int))
         (tcall (tfun* _ (int) (tfun* x (int) (tcall f x)))
                (tcall f 0)) ;; <<< call f before eval'ed
         (tcall f 42)))

(type-check rec-example-2)
Of course, adding a feature to the language may have implications for how we state the type-soundness theorem. The addition of recursive functions means that the interpretation of a program may not even produce a final value. And that means we need to add one more possible outcome to the theorem.
Theorem If a TypedISL expression has type (int), then its interpretation either
- evaluates to an integer,
- signals the error "integer exponentiation is defined for non-negative exponents only", or
- goes into an infinite loop.
The Revised Type Checker
Figure 55 shows how to translate this rule into code for the revised and extended type checker.
Lectures/12/rec-type-check.rkt
#lang racket

(require "type-env0.rkt")
(require "isl-as-data.rkt")
(require "../11/types.rkt")
(require "../6/environment.rkt")

;; - - - - - - - - - - - - - - - - - - - - - - - - -
(define UNDECLARED "undeclared variable")
(define ARITHMETIC "bad types for prim op")
(define FUNCTION   "function type expected")
(define DOMAIN     "domain type doesn't match arg type")
(define DECL       "type of a declaration doesn't match")
(define IF-TEST    "int expected in if test position")
(define IF-BRANCH  "same types expected in branches of if")

#; {TypedISL -> Type}
(define (type-check isl0)

  #; {TypedISL TEnv -> Type}
  (define (type-check/accu isl env)
    (match isl
      [(? string?)
       (if (defined? isl env) (lookup isl env) (error 'tc UNDECLARED))]
      [(? integer?) (int)]
      [(tfun* x t b)
       (define env+ (add x t env))
       (define tbdy (type-check/accu b env+))
       (-> t tbdy)]
      [(tcall f a)
       (define tf (type-check/accu f env))
       (define ta (type-check/accu a env))
       (cond
         [(not (->? tf)) (error 'tc FUNCTION)]
         [(not (type=? (->-domain tf) ta)) (error 'tc DOMAIN)]
         [else (->-range tf)])]
      ;; - - - - - - - - - - - - - - - - - - - - - - - -
      ;; RECURSION
      [(tdecl f type rhs body)
       (define env+ (add f type env))
       (define trhs (type-check/accu rhs env+))
       (if (type=? trhs type)
           ;; specification ___agrees___ with the computed type:
           (type-check/accu body env+)
           ;; specification ___disagrees___ with the computed type:
           (error 'tc DECL))]
      ;; - - - - - - - - - - - - - - - - - - - - - - - -
      ;; HOMEWORK
      [(tif-0 tst thn els)
       (define ttst (type-check/accu tst env))
       (cond
         [(not (int? ttst)) (error 'tc IF-TEST)]
         [else
          (define tthn (type-check/accu thn env))
          (define tels (type-check/accu els env))
          (if (type=? tthn tels) tthn (error 'tc IF-BRANCH))])]))

  (type-check/accu isl0 TYPE-ENV0))

;; - - - - - - - - - - - - - - - - - - - - - - - - -
(provide type-check)
The Revised Interpreter
The run-time base environment ENV0 must be compatible with the type base environment TEnv0:
- if TEnv0 associates variable x with a type T, then ENV0 associates x with a value V, and vice versa.
  - if T is (int), then V is an integer.
  - if T is (-> s u), then V is a primop that is defined on all values of type s or signals an error E. Every such error E must be listed in the soundness theorem.
Figure 56 displays a run-time environment that is compatible with the type-phase environment of figure 53.
Lectures/12/env0.rkt
#lang racket

(require "../6/environment.rkt")

;; - - - - - - - - - - - - - - - - - - - - - - - - -
(define (^ y x)
  (if (and (exact-integer? x) (>= x 0))
      (expt y x)
      (error 'interpreter "^ is partial")))

(define population
  (list (list "+" (λ (x) (λ (y) (+ x y))))
        (list "*" (curry *))
        (list "^" (curry ^))))

(define ENV0
  (for/fold ([env empty]) ([name-type population])
    (add (first name-type) (second name-type) env)))

;; - - - - - - - - - - - - - - - - - - - - - - - - -
(provide ENV0)

Figure 56: A Runtime Base Environment, Compatible with Type Base Environment
Stop! Why are the two environments compatible? What is the only error E that one of these primitive operations may signal?
Stop again! Add a function-value to ENV0 that consumes a function on integers and an integer. It applies the function to the given integer and adds 1 to the result. Now develop a compatible type.
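One possible shape of an answer to the second exercise is sketched below. Everything here is an assumption for illustration: the name add1-after is hypothetical, the structs int and -> stand in for the ones in types.rkt, and the function argument is taken to be a plain Racket procedure. In the interpreter itself, a function written in the object language arrives as a function-value closure, so a real primop would have to apply it through the interpreter's application function rather than directly.

```racket
#lang racket

;; ASSUMPTION: hypothetical stand-ins for the course's type structs.
(struct int () #:transparent)
(struct -> (domain range) #:transparent)

;; Run-time value: curried, because every function in the language is unary.
;; ASSUMPTION: f is a Racket procedure, not a function-value closure.
(define add1-after
  (λ (f) (λ (n) (+ (f n) 1))))

;; A type that is compatible with the value above:
;; it consumes a function on integers, then an integer, and yields an integer.
(define add1-after-type
  (-> (-> (int) (int)) (-> (int) (int))))

;; Usage: double 20, then add 1.
(define result ((add1-after (λ (x) (* 2 x))) 20))
```

The value is total on its argument types, so it adds no new error E to the soundness theorem.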
And now we can revise the interpreter so that it expresses what the type soundness theorem implies. Figure 57 shows an interpreter without any run-time checks in the inner “loop.” Indeed, by removing the node clause completely, the only check that might exist is the one for “functionness” in the call clause. But, the type system guarantees that we always get a value that we can apply and therefore this check is gone.
Lectures/12/interpreter.rkt
#lang racket

;; an interpreter that does not check any error conditions
;; at run-time

(require "rec-type-check.rkt")
(require "env0.rkt")
(require "strip-types.rkt")
(require "../6/environment.rkt")
(require "../6/rec-as-data.rkt")
(require "../4/possible-values.rkt")

#; {Value = Number || (function-value parameter FExpr Env)}

#; {FExpr -> Value}
;; determine the value of ae via a substitution semantics
(define (interpreter ae0)

  #; {FExpr Env -> Value}
  ;; ACCUMULATOR env tracks all declarations between ae and ae0
  (define (i/accu ae env)
    (match ae
      [(? integer?) ae]
      [(decl x a1 a2)
       (i/accu a2 (add-rec x (λ (env) (i/accu a1 env)) env))]
      [(? string?) (lookup ae env)]
      [(call ae1 ae2)
       (fun-apply (i/accu ae1 env) (i/accu ae2 env))]
      [(fun para body) (function-value para body env)]
      [(if-0 t thn els)
       (define test-value (i/accu t env))
       (if (= test-value 0) (i/accu thn env) (i/accu els env))]))

  #; {Value Value -> Value}
  (define (fun-apply function-representation argument-value)
    (match function-representation
      [(function-value fpara fbody env)
       (i/accu fbody (add fpara argument-value env))]
      [op (op argument-value)]))

  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  (define untyped-ae0
    (with-handlers ([exn:fail? exn-message])
      (define ___typed (type-check ae0))
      (strip-types ae0)))

  (if (string? untyped-ae0)
      untyped-ae0
      (i/accu untyped-ae0 ENV0)))

;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(provide interpreter)
Figure 53 also benefits from the absence of any checks in the
primitive functions—
Stripping Types
The interpreter works on a type-free abstract syntax tree. Figure 58 displays the function that strips types from the source representation.
Lectures/12/strip-types.rkt
#lang racket

(require "isl-as-data.rkt")
(require "../6/rec-as-data.rkt")

#; {TISL -> RecExpr}
(define (strip-types isl)
  (match isl
    [(? string?) isl]
    [(? integer?) isl]
    [(tfun* x t b) (fun x (strip-types b))]
    [(tcall f a) (call (strip-types f) (strip-types a))]
    [(tdecl f t rhs b) (decl f (strip-types rhs) (strip-types b))]
    [(tif-0 tst thn els)
     (if-0 (strip-types tst) (strip-types thn) (strip-types els))]))

;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
(provide strip-types)
In the mid 1990s, Harper, Lee, and Leroy launched a systematic research program to show that the compiler can rely on and exploit logical types if they are preserved across compilation phases. Northeastern’s very own Prof. Ahmed works in this tradition.
The type-checking phase is completely separate from the interpretation phase.
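A small sketch makes the separation concrete: once the checker has accepted a program, the type annotations are simply discarded and never consulted again. The structs below are hypothetical stand-ins for those in isl-as-data.rkt and rec-as-data.rkt, and strip handles only names, integers, functions, and calls.

```racket
#lang racket

;; ASSUMPTION: hypothetical stand-ins for the typed and untyped syntax structs.
(struct int () #:transparent)
(struct tfun* (para type body) #:transparent)
(struct tcall (f a) #:transparent)
(struct fun (para body) #:transparent)
(struct call (f a) #:transparent)

;; Erase type annotations; the shape of the program is otherwise unchanged.
(define (strip e)
  (match e
    [(? string?) e]
    [(? integer?) e]
    [(tfun* x _ b) (fun x (strip b))]
    [(tcall f a) (call (strip f) (strip a))]))

;; A typed program: apply a function that adds 1 to the argument 41.
(define typed
  (tcall (tfun* "x" (int) (tcall (tcall "+" "x") 1)) 41))

;; The same program as the interpreter would see it, with no types left.
(define untyped (strip typed))
```

Only the tfun* clause drops information; every other clause rebuilds the same node in the untyped representation.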