3 — Scope, Compilation

Tuesday, 14 January 2020

From now on, we discuss all language ideas in terms of abstract syntax trees, not concrete JSON syntax. Today’s lecture is about variable declarations and uses. The model language is displayed in figure 6.


  #lang racket
  ;; internal representation (AST)
  (struct node [op left right] #:transparent)
  (struct decl [variable value scope] #:transparent)
  #; {VE  = Int || (node + VE VE) || (node * VE VE) ||
          (decl Var VE VE) || Var}
  #; {Var = String}
  (provide (struct-out node) (struct-out decl))

Figure 6: No More JSON

Stop! Make up JSON syntax to your liking for the new decl construct.

Now we can already ask some questions.

Q: What does x mean as a VE?

A: Correct! Nothing. It is an undeclared, also called free, variable.

Q: Which declaration of x does a reference (of x) refer to?

A: This is the first thing you need to figure out whenever you encounter a language construct that introduces a variable, aka, a binding construct.

Where is a variable declaration visible?

Static scope, also known as lexical scope, is the region where a variable declaration is visible. A region is not necessarily contiguous; in particular, a region has a hole wherever a variable of the same name is declared again inside it.
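To illustrate a region with a hole, here is a minimal sketch. It uses Racket's own let instead of the model language, and the name hole-example is made up for this illustration.

```racket
#lang racket
;; The outer declaration of x is visible in the entire body of the outer let,
;; except inside the inner let, which declares x again (the "hole").
(define hole-example
  (let ([x 1])
    (+ x                ; refers to the outer x
       (let ([x 10])    ; the hole in the outer x's region starts here
         (* x x))       ; both references refer to the inner x
       x)))             ; the outer x is visible again
```

Reading the text alone suffices to resolve every reference: hole-example is 1 + 100 + 1, that is, 102.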

The adjective static means that a programmer can resolve where a variable is bound and thus where its value originates from by just reading the program text (and without simulating an execution).

If the region is an entire program (or module or package), we speak of global scope; if it is just a portion of the program (or module or package), the declaration has local scope.

In general, the word static implies that there is no need to run the program to figure out a property of the program. Here it says that a programmer can determine from where variables get their values by just looking at the program text (the words, lex).

Variables without a declaration are undeclared variables, also known as free variables.
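Finding the free variables of a VE is itself a static property. The following is a sketch only; the function free-variables is a hypothetical helper, not part of the lecture's code, and the struct definitions are repeated from figure 6 so that the module runs on its own.

```racket
#lang racket
;; the AST from figure 6, repeated here
(struct node [op left right] #:transparent)
(struct decl [variable value scope] #:transparent)

#; {VE -> [Listof Var]}
;; collect the names of all variables in ve that no enclosing decl declares
(define (free-variables ve)
  (match ve
    [(? integer?) '()]
    [(? string?) (list ve)]
    [(node _ a1 a2)
     (remove-duplicates (append (free-variables a1) (free-variables a2)))]
    [(decl x a1 a2)
     ;; x is declared for the scope a2 only, not for its own definition a1
     (remove-duplicates
      (append (free-variables a1)
              (remove* (list x) (free-variables a2))))]))
```

For example, (free-variables (decl "x" 5 (node + "x" "y"))) yields '("y"), while (free-variables (decl "x" "x" "x")) yields '("x") because the definitional expression is outside the scope of the declaration.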

Note 1 Some programming languages do not demand that variables be declared. Instead, when the implementation finds a new variable name, it declares the corresponding variable and its scope implicitly.

Stop! What problems may implicit variable declarations cause? End

Note 2 Until the late 1980s, some programming languages used dynamic scope for variable declarations. That is, they resolved the variable's declaration at the last possible moment, namely, when the language implementation encountered the variable name at run time.

In general the word dynamic always means “at run time.” End
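The contrast between the two notes can be sketched in Racket itself. Ordinary Racket variables are statically scoped; Racket's parameters (make-parameter, parameterize) behave roughly like dynamically scoped variables and serve here only as an approximation of what the older languages did. All names below are made up for the illustration.

```racket
#lang racket
;; static scope: the x in show-static refers to the module-level declaration
(define x 1)
(define (show-static) x)
(define static-result
  (let ([x 2])        ; a new, unrelated x; show-static cannot see it
    (show-static)))

;; dynamic binding: the value of the parameter y is looked up at run time
(define y (make-parameter 1))
(define (show-dynamic) (y))
(define dynamic-result
  (parameterize ([y 2])   ; rebinds y for the duration of the body
    (show-dynamic)))
```

Here static-result is 1, which a reader can determine from the text alone, while dynamic-result is 2 because the binding in effect at the time of the call determines the outcome.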

When you encounter or create a new variable declaration construct for a programming language, it is imperative that you figure out/specify the scope of the bindings.

Explicit static scope for variable declarations directly corresponds to the idea of “definition” found in mathematics. That is, a defined term can be replaced by its definition, an operation that middle school mathematics calls substitution.

Interpretation via Substitution

There are two possibilities, namely, replace the declared variable by its
  • definitional expression

    Mathematics says a defined name can always be replaced with its definition. This alternative seems right for mathematics then.

  • the value of its definitional expression.

    Most programmers think along these lines.

It turns out, both exist in the world of programming languages.
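The difference between the two alternatives shows up in the term that results from the replacement. The sketch below assumes a hypothetical helper, replace, that accepts an arbitrary VE as the replacement; it also assumes that the replacement contains no free variables, so that no variable can be captured by accident.

```racket
#lang racket
;; the AST from figure 6, repeated here
(struct node [op left right] #:transparent)
(struct decl [variable value scope] #:transparent)

#; {VE Var VE -> VE}
;; replace the free occurrences of x in ve with the term replacement
;; ASSUME replacement contains no free variables
(define (replace replacement x ve)
  (match ve
    [(? integer?) ve]
    [(? string?) (if (equal? ve x) replacement ve)]
    [(node o a1 a2)
     (node o (replace replacement x a1) (replace replacement x a2))]
    [(decl y a1 a2)
     (decl y
           (replace replacement x a1)
           ;; a re-declaration of x shadows the outer x in a2
           (if (equal? x y) a2 (replace replacement x a2)))]))

;; the pieces of (decl "x" (node + 5 5) (node * "x" "x"))
(define definition (node + 5 5))
(define scope (node * "x" "x"))

;; alternative 1: substitute the definitional expression itself
(define by-expression (replace definition "x" scope))
;; alternative 2: substitute the value of the definitional expression
(define by-value (replace 10 "x" scope))
```

With the first alternative the sum 5 + 5 appears twice in the result and is computed twice; with the second it is computed once, before the replacement.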

Figure 7 shows how the second interpretation works.


  #lang racket
  (require "data-rep.rkt")
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
  ;; working through an example 
  (define ve-ex1
    (decl "x" (decl "y" 5 {node + "y" "y"})
          (decl "y" 42
                (decl "x" "x"
                      (node + "x" "y")))))
  (define ve-ex2
    (decl "x" {node + 5 5}
          (decl "y" 42
                (decl "x" "x"
                      (node + "x" "y")))))
  (define ve-ex3
    (decl "x" 10
          (decl "y" 42
                (decl "x" "x"
                      (node + "x" "y")))))
  (define ve-ex4
    (decl "y" 42
          (decl "x" 10
                (node + "x" "y"))))
  (define ve-ex5
    (decl "x" 10
          (node + "x" 42)))
  (define ve-ex6
    (node + 10 42))
  (define ve-val1 52)
  (provide ve-ex1 ve-ex2 ve-ex3 ve-ex4 ve-ex5 ve-ex6 ve-val1)

Figure 7: Working Through an Example

Once the idea is understood, writing down the rest of the function is straightforward; see figure 8.


  #lang racket
  (require "data-rep.rkt")
  (require "examples.rkt")
  ;; VE -> Number 
  ;; determine the value of ae via a substitution semantics 
  (define (value-of-subst ae)
    (match ae
      [(? integer?)   ae]
      [(node o a1 a2) (o (value-of-subst a1) (value-of-subst a2))]
      [(decl x a1 a2) (value-of-subst (subst (value-of-subst a1) x a2))]
      [(? string?)    (error 'value-of-subst "undeclared variable ~e" ae)]))
  ;; Number Var VE -> VE 
  ;; replace all free occurrences of x in ae with val 
  (module+ test
    (check-equal? (subst 21 "x" (decl "x" "x" "x")) (decl "x" 21 "x"))
    (check-equal? (subst 21 "x" (decl "x" 42 "x")) (decl "x" 42 "x")))
  (define (subst val x ae)
    (match ae
      [(? integer?)   ae]
      [(node o a1 a2) (node o (subst val x a1) (subst val x a2))]
      [(decl y a1 a2) (if (equal? x y)
                          (decl y (subst val x a1) a2)
                          (decl y (subst val x a1) (subst val x a2)))]
      [(? string?)    (if (equal? ae x) val ae)]))
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
  ;; how to turn "working thru an example" into a complete test suite 
  (module+ test
    (require rackunit)
    (check-equal? (value-of-subst ve-ex1) ve-val1)
    (check-equal? (value-of-subst ve-ex2) ve-val1)
    (check-equal? (value-of-subst ve-ex3) ve-val1)
    (check-equal? (value-of-subst ve-ex4) ve-val1)
    (check-equal? (value-of-subst ve-ex5) ve-val1)
    (check-equal? (value-of-subst ve-ex6) ve-val1)) 

Figure 8: Implementing a Substitution Interpreter

The interpreter signals an error if it finds an undeclared variable.

Interpret via Environment

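A substitution interpreter is slow. For every decl, it traverses the scope once to substitute and then again to evaluate the result, so it may have to traverse the same term at least twice.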

Instead of substituting, we keep a table of all declared variables and their values. Then, when we encounter a variable reference, we look up the variable’s value. This table is dubbed an environment.


  #lang racket
  (provide
   #; {type Env :
            empty :: Env,
            add :: Var Number Env -> Env,
            defined? :: Var Env -> Any,
            lookup :: Var Env -> Number,
            position-of :: Var Env -> N}
   empty add defined? lookup position-of)
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  ;; here is one way to implement an environment 
  #; (type Env = [Listof [List Var Number]])
  (define empty '[])
  (define (add x val env) (cons (list x val) env))
  (define (defined? x env) (assoc x env))
  (define (lookup x env) (second (defined? x env)))
  ;; ASSUME (defined? x env) holds 
  (define (position-of x env)
    (- (length env) (length (member x (map first env)))))

Figure 9: An Environment

An environment guarantees that looking up the value of a variable x retrieves the value most recently associated with x.
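This guarantee can be observed directly with the list-based representation. The sketch repeats the relevant definitions from figure 9 so that it runs on its own; env-ex is a made-up example environment.

```racket
#lang racket
;; the list-based environment from figure 9, repeated here
(define empty '[])
(define (add x val env) (cons (list x val) env))
(define (defined? x env) (assoc x env))
(define (lookup x env) (second (defined? x env)))

;; "x" is added twice; the later entry sits in front of the earlier one
(define env-ex (add "x" 10 (add "y" 42 (add "x" 5 empty))))
```

Because assoc returns the first matching entry, (lookup "x" env-ex) yields 10 and not 5, (lookup "y" env-ex) yields 42, and (defined? "z" env-ex) yields #f.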


  #lang racket
  (require "data-rep.rkt")
  (require "examples.rkt")
  (require "environment.rkt")
  #; {VE -> Number}
  ;; determine the value of ae via an environment semantics 
  (define (value-of-env ae0)
    #; {VE Env -> Number}
    ;; ACCUMULATOR env tracks all declarations between ae and ae0
    (define (value-of/accu ae env)
      (match ae
        [(? integer?)   ae]
        [(node o a1 a2) (o (value-of/accu a1 env) (value-of/accu a2 env))]
        [(decl x a1 a2) (value-of/accu a2 (add x (value-of/accu a1 env) env))]
        [(? string?)
         (if (defined? ae env)
             (lookup ae env)
             (error 'value-of/accu "undeclared variable ~e" ae))]))
    (value-of/accu ae0 empty))
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
  ;; how to turn "working thru an example" into a complete test suite 
  (module+ test
    (require rackunit)
    (check-equal? (value-of-env ve-ex1) ve-val1)
    (check-equal? (value-of-env ve-ex2) ve-val1)
    (check-equal? (value-of-env ve-ex3) ve-val1)
    (check-equal? (value-of-env ve-ex4) ve-val1)
    (check-equal? (value-of-env ve-ex5) ve-val1)
    (check-equal? (value-of-env ve-ex6) ve-val1)) 

Figure 10: Implementing an Environment Interpreter

The Dual Nature of Programming Languages

If we think of a programming language in terms of mathematics, the interpreters and the compiler are mathematical functions. With this in mind, we can state a theorem that relates the compiler and the two interpreters. Specifically, we can claim and prove that the two interpretations are equal functions.

Theorem value-of-subst == value-of-env

This statement emphasizes extensional considerations: both functions produce the same result for the same program. Intensionally the two ways of running the programs are quite different. Indeed, the point of compilation is to save time, space, energy and otherwise reduce costs.
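A theorem calls for a proof, but random testing can at least build confidence in the claim. The following sketch condenses the two interpreters from figures 8 and 10 into one module and compares them on randomly generated terms without free variables. The generator random-ve and the checker interpreters-agree? are hypothetical helpers invented for this illustration, and passing tests are evidence, not a proof.

```racket
#lang racket
(struct node [op left right] #:transparent)
(struct decl [variable value scope] #:transparent)

;; the two interpreters, condensed from figures 8 and 10
(define (subst val x ve)
  (match ve
    [(? integer?) ve]
    [(? string?) (if (equal? ve x) val ve)]
    [(node o a1 a2) (node o (subst val x a1) (subst val x a2))]
    [(decl y a1 a2)
     (decl y (subst val x a1) (if (equal? x y) a2 (subst val x a2)))]))

(define (value-of-subst ve)
  (match ve
    [(? integer?) ve]
    [(? string?) (error 'value-of-subst "undeclared variable ~e" ve)]
    [(node o a1 a2) (o (value-of-subst a1) (value-of-subst a2))]
    [(decl x a1 a2) (value-of-subst (subst (value-of-subst a1) x a2))]))

(define (value-of-env ve [env '()])
  (match ve
    [(? integer?) ve]
    [(? string?)
     (let ([entry (assoc ve env)])
       (if entry
           (second entry)
           (error 'value-of-env "undeclared variable ~e" ve)))]
    [(node o a1 a2) (o (value-of-env a1 env) (value-of-env a2 env))]
    [(decl x a1 a2)
     (value-of-env a2 (cons (list x (value-of-env a1 env)) env))]))

#; {N [Listof Var] -> VE}
;; generate a random VE of at most the given depth without free variables;
;; declared lists the variables that are in scope at this point
(define (random-ve depth declared)
  (match (if (zero? depth) (random 2) (random 4))
    [0 (random 100)]
    [1 (if (empty? declared)
           (random 100)
           (list-ref declared (random (length declared))))]
    [2 (node (if (zero? (random 2)) + *)
             (random-ve (sub1 depth) declared)
             (random-ve (sub1 depth) declared))]
    [3 (let ([x (list-ref '("x" "y" "z") (random 3))])
         (decl x
               (random-ve (sub1 depth) declared)
               (random-ve (sub1 depth) (cons x declared))))]))

#; {N -> Boolean}
;; do both interpreters agree on n randomly generated terms?
(define (interpreters-agree? n)
  (for/and ([i (in-range n)])
    (let ([ve (random-ve 4 '())])
      (= (value-of-subst ve) (value-of-env ve)))))
```

Calling (interpreters-agree? 200) compares the interpreters on 200 random terms. The check says nothing about running time or memory use, which is exactly the intensional side that the theorem leaves out.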