7.7.0.3

10 — What and When

Friday, 07 February 2020

Presenters (1) Daniel McGann, Nathaniel Rosenbloom (2) Josh Kazan, Dylan Robinson

What Do Function Calls Evaluate Arguments to

Lectures/10/swap-example.rkt

  #lang racket
   
  (require "../9/interpreter.rkt")
  ;; to run the reference interpreter and observe the swap:
  ;; (require "interpreter-by-ref.rkt")
   
  (require "../8/ass-as-data.rkt")
  (require "../define-names.rkt")
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  ;; swap test
  (define-names swap x y tmp a b)
    
  (define swap-example
    (decl swap [fun x
                    [fun y
                         [decl tmp x
                               [sequ [set x y]
                                     [set y tmp]]]]]
          [decl a 10
                [decl b 20
                      [sequ [call [call swap a] b]
                            a]]]))
  (interpret swap-example)
   

Figure 41: swap

Take a close look at the swap program snippet in figure 41.

Stop! What do you expect the result to be?

The question really is whether somehow swap can mutate the variable a. Running the interpreter on this example answers this question in the negative. Stop! Why?

Every time a function [fun x b] is called, you can think of
[call f a]
; as abbreviating
[decl x a b]
That is, a call acts like the introduction of a new decl. The interpreter opens a scope for the declared variable (the function parameter) and maps it to an initial value (the argument value). The mapping goes through both the environment, which associates the name with a fixed location, and the store, which then relates the location to the actual value.

Opening a new scope hence means allocating a new location and sticking the parameter-location combination onto the environment (i.e., scope) from the function definition site. A new location means the interpreter determines the next available location and adds the location-argument-value combination to the store. Hence, a set to a function parameter cannot affect the location of the original variable.

The technical term is pass-by-value, meaning a function call transmits values, not locations. Hence it is impossible in a pass-by-value language for an assignment to a function parameter to modify the actual argument, even if that argument was a variable. Period.
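Racket itself passes arguments by value, so the same experiment can be run directly in the host language. The following is a minimal sketch, not part of the lecture code; the names seen-inside and seen-outside are made up for illustration.

```racket
#lang racket

;; Racket passes arguments by value: x and y get fresh locations
;; that start out holding the argument values.
(define (swap x y)
  (define tmp x)
  (set! x y)
  (set! y tmp)
  (list x y))   ; the parameters as seen inside the function

(define a 10)
(define b 20)

(define seen-inside  (swap a b))   ; the parameters were swapped
(define seen-outside (list a b))   ; a and b are untouched
```

The assignments change only the locations of x and y, which is why a and b keep their original values.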

In a full-fledged programming language, the set of values may include arrays, lists, functions, and objects. No matter what, it’s the value that gets transmitted, not the location of these “things.” This leads to confusion for many people, and you can find this confusion on many on-line fora. The tricky aspect is that changing a field of a vector/array/list/object (or a variable reachable from the body of a function) works just fine. 5 — Simple Mutable Objects demonstrates this point with the introduction of a simplistic object with one mutable field, a getter, and a setter method.

Even in our toy language we can seemingly switch the content of two simulated cell objects (one constructor, one field, a setter, a getter):
(decl cell [fun c
             [decl content c
               [fun get-or-set
                 (if-0 get-or-set
                   content
                   [fun new-value [set content new-value]])]]]
  ...)
This first block of code simulates a simplistic cell object with one field (content) that is initialized to the constructor argument (c). The return value is a function that can play the role of a getter (when called on 0) or setter (when called on 1).

Now that we have a level of indirection we can define a swap-like function:
(decl swap [fun cell-1
             [fun cell-2
               [decl tmp [call cell-1 0]
                 [sequ [call [call cell-1 1] [call cell-2 0]]
                   [call [call cell-2 1] tmp]]]]]
  [decl a [call cell 42]
    [decl b [call cell 21]
      [sequ [call [call swap a] b]
        [call a 0]]]])
Except it doesn’t swap the values of a and b but the values that reside inside of a and b.
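In Racket, the built-in box plays the role of the simulated cell. The sketch below mirrors the toy program under that assumption; swap-contents!, a-before, contents, and same-box? are hypothetical names chosen for this example.

```racket
#lang racket

;; swap the contents of two boxes; the boxes themselves are passed by value
(define (swap-contents! cell-1 cell-2)
  (define tmp (unbox cell-1))
  (set-box! cell-1 (unbox cell-2))
  (set-box! cell-2 tmp))

(define a (box 42))
(define b (box 21))
(define a-before a)            ; remember which box a names

(swap-contents! a b)

(define contents  (list (unbox a) (unbox b)))  ; the contents traded places
(define same-box? (eq? a a-before))            ; a still names the same box
```

The variable a refers to the same box before and after the call; only what is stored inside the box has changed.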

Pass by Reference

In principle, all modern languages offer only by-value argument passing. But, because old languages supported alternatives, old and confused developers have continued to push certain terminologies, mostly because they consider examples such as swap to be important and fondly remember languages in which it was possible to express this swap idea directly.

The most important alternative to know about is pass-by-reference.

This argument-passing technique deals with one special case. If the argument expression is a variable, it hands the variable’s location to the corresponding parameter. Magically, swap works now.

Stop! Why?

Lectures/10/interpreter-by-ref.rkt

  #lang racket
   
  ;; a store-passing interpreter that explains pass-by-ref 
   
  (provide
   #; {FExpr -> Value}
   ;; determine the value of ae via a substitution semantics 
   interpret)
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
  (require "../6/environment.rkt")
  (require "../8/ass-as-data.rkt")
  (require "../4/possible-values.rkt")
  (require "../9/store.rkt")
  (require "tag-values-with.rkt")
  (require SwDev/Debugging/spy)
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
  #; {Value = Number || (function-value parameter FExpr Env)}
   
  (define UNDECLARED "undeclared variable ~e")
  (define ARITHMETIC  "number expected, given ~e ")
  (define CLOSURE     "closure expected, given ~e ")
   
  (define (interpret ae0)
   
    #; {FExpr Env Store ->* Value Store}
    ;; ACCUMULATOR env tracks all declarations between ae and ae0
    (define (interpret ae env store)
      (match ae
        [(? integer?)
         (values ae store)]
        [(node o a1 a2)
         (define-values (right0 store+) (interpret a2 env store))
         (define right (number> right0))
         (define-values (left0 store++) (interpret a1 env store+))
         (define left  (number> left0))
         (values (o left right) store++)]
        [(decl x a1 a2)
         (define-values (loc store+) (alloc store #f))
         (define env++ (add x loc env))
         (define-values (val store++) (interpret a1 env++ store+))
         (interpret a2 env++ (update store++ loc val))]
        [(? string?)
         (if (defined? ae env)
             (values (retrieve store (lookup ae env)) store)
             (error 'value-of UNDECLARED ae))]
        ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
        [(call ae1 ae2)
         (define-values (right store+)
           (if (string? ae2)
               (tag-first-value-with 'ref (lookup ae2 env) store)
               (tag-first-value-with 'val (interpret ae2 env store))))
         (define-values (left store++) (interpret ae1 env store+))
         (fun-apply (function> left) right store++)]
        ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
        [(fun para body)
         (values (function-value para body env) store)]
        [(if-0 t thn els)
         (define-values (test-value store+) (interpret t env store))
         (if (and (number? test-value) (= test-value 0))
             (interpret thn env store+)
             (interpret els env store+))]
        [(set lhs rhs)
         (define loc (lookup lhs env))
         (define old (retrieve store loc))
         (define-values (val store+) (interpret rhs env store))
         (values old (update store+ loc val))]
        [(sequ fst rst)
         (define-values (_ store+) (interpret fst env store))
         (interpret rst env store+)]))
   
    ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
    #; {type ArgValue [U (list 'val Value) (list 'ref Location)]}
    ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
   
    #; {Value ArgValue Store ->* Value Store}
    (define (fun-apply function-representation argument-value store)
      (match function-representation
        [(function-value fpara fbody env)
         ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
         (define-values (loc store+)
           (match argument-value
             [(list 'ref loc)  (values loc store)]
             [(list 'val v)    (alloc store v)]))
         ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
         (interpret fbody (add fpara loc env) store+)]))
    
    (define-values (result store) (interpret ae0 empty plain))
   
    result)
   
  #; {Any -> Number}
  (define (number> x)
    (if (number? x)
        x
        (error 'interpreter ARITHMETIC x)))
   
  #; {Any -> Function}
  (define (function> x)
    (if (function-value? x)
        x
        (error 'interpreter CLOSURE x)))
   

Figure 42: Pass-by-Reference Interpreter
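The difference between the two branches of fun-apply can be isolated in a few lines. The sketch below uses immutable hash tables as hypothetical stand-ins for the course's environment and store modules; alloc here only imitates the shape of the real one, and bind-by-value, bind-by-ref, and set-then-read are illustrative names.

```racket
#lang racket

;; Hypothetical stand-ins: an environment maps names to locations,
;; a store maps locations to values.
(define (alloc store v)
  (define loc (hash-count store))          ; next unused location
  (values loc (hash-set store loc v)))

;; by-value: the parameter gets a fresh location holding the argument's value
(define (bind-by-value para arg-name env store)
  (define arg-value (hash-ref store (hash-ref env arg-name)))
  (define-values (loc store+) (alloc store arg-value))
  (values (hash-set env para loc) store+))

;; by-reference: the parameter becomes a second name for the argument's location
(define (bind-by-ref para arg-name env store)
  (values (hash-set env para (hash-ref env arg-name)) store))

;; declare a, bind parameter x to argument a, run [set x 99], then read a
(define (set-then-read bind)
  (define-values (a-loc store0) (alloc (hash) 10))
  (define env0 (hash "a" a-loc))
  (define-values (env1 store1) (bind "x" "a" env0 store0))
  (define store2 (hash-set store1 (hash-ref env1 "x") 99))
  (hash-ref store2 (hash-ref env1 "a")))

(define a-after-by-value (set-then-read bind-by-value))
(define a-after-by-ref   (set-then-read bind-by-ref))
```

Under by-value binding the assignment lands in a fresh location and a stays 10; under by-reference binding x and a share one location, so a reads 99.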

Downside

Pass-by-reference introduces direct aliasing of variables—two names for one and the same location. Over time developers have recognized the danger of aliasing, and they have invented entire new languages to get around it (Rust).

What people really want is aliasing of values, which achieves a similar effect via one level of indirection. Remember the law of CS that every problem can be solved with one level of indirection. Since creating indirections is a locally solvable problem, developers are mostly happy to use pass-by-value languages.

One last note: old developers often claim that languages use by-reference because of the capability to mutate fields in vectors/arrays/lists/objects. This is inaccurate and incurable terminology. But you now know better.

When Do Function Calls Evaluate Arguments

Now let’s look at an orthogonal dimension of function calling, the point when arguments should be evaluated. Figure 43 displays a simplistic program that makes the point:

Lectures/10/delay-test.rkt

  #lang racket
   
  (require "../9/interpreter.rkt")
  (require "../8/ass-as-data.rkt")
  (require "../define-names.rkt")
   
  (require (prefix-in delay: "interpreter-delay.rkt"))
  (require (prefix-in name: "name-promise.rkt"))
  (require (prefix-in lazy: "lazy-promise.rkt"))
   
  ;; to run the call-by-name vs the call-by-lazy interpreter 
   
  ; (define interpret (delay:interpret lazy:promise lazy:promise->value))
  ; (define interpret (delay:interpret name:promise name:promise->value))
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
  ;; constant test 
   
  (define-names constant x)
    
  (define constant-example
    (decl constant [fun x 42]
          (call constant (node + (fun x x) 1))))
   
  ; should the following error or run: 
  ; (interpret constant-example)
   
  (define-names y)
   
  (define constant-example2
    (decl constant [fun x x]
          (decl y (call constant [fun x (node + (fun x x) 1)])
                42)))
   
  ; this one will run:
  (interpret constant-example2)
   

Figure 43: A Constant Function

Stop! What should the result be?

Should this program report an arithmetic error because addition doesn’t work on functions?

Should it output 42 because the argument to constant doesn’t matter?

Delay the Evaluation

Plotkin was the first to propose a rigorously testable idea of how the λ calculus relates to programming languages. Sadly many PL researchers still don’t quite understand his inspiring ideas.

The central axiom of the λ calculus, the calculus of nameless functions, is β and it is often written as follows:

((λ p. body) a) ⟶ body[with p replaced by a]

It is precisely what teachers tell you about function applications:

“to evaluate a specific function call (haha, math teachers don’t know these words) replace the function’s parameter with the function’s argument in the function’s body”

Note how this rule does not mention the word “value”.

One way to look at all this is that the β axiom and your teacher’s rule generalize the idea of a mathematical definition. Wherever you see a name, you may replace it with its definition. So, if we say

let f be (λ x. 2 * x)

and then demand to know the value of

f[40 + 1]

we know what to do. We replace f with its definition:

(λ x. 2 * x) [40 + 1]

This may have surprised you a little bit, but f is the name defined, so it is f that is replaced with its definition. But what does this replacement do? It essentially creates a new definition and a new question, namely,

let x be [40 + 1]

and

what is the value of 2 * x

And now we know how to proceed. We replace x with its definition:

2 * [40 + 1]

which yields 82.

In short, we always want to replace variables with their definitions and the application of a function creates a new definition.

This way of looking at the world of computation inspired people in the late 1960s to delay the evaluation of arguments because they were considered the definitions of variables used as function parameters.

Lectures/10/interpreter-delay.rkt

  #lang racket
   
  ;; an interpreter that explicitly orders the evaluation of
  ;; every subexpression where the ordering wasn't specified
  ;; RIGHT to LEFT 
   
  (require "../6/environment.rkt")
  (require "../6/rec-as-data.rkt")
  (require "../4/possible-values.rkt")
   
  #; {FExpr -> Value}
  ;; determine the value of ae via a substitution semantics 
  (define ((interpret promise promise->value) ae0)
   
    #; {FExpr Env[Promise] -> Value}
    ;; ACCUMULATOR env tracks all declarations between ae and ae0
    (define (interpret ae env)
      (match ae
        [(? integer?)
         ae]
        [(node o a1 a2)
         (define right (number> (promise->value (interpret a2 env))))
         (define left  (number> (promise->value (interpret a1 env))))
         (o left right)]
        [(decl x a1 a2)
         (interpret a2 (add-rec x (λ (env) (interpret a1 env)) env))]
        [(? string?)
         (if (defined? ae env)
             (lookup ae env)
             (error 'value-of "undeclared variable ~e" ae))]
         ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 
        [(call ae1 ae2)
         (define right (promise (λ () (interpret ae2 env))))
         (define left  (function> (interpret ae1 env)))
         (fun-apply left right)]
        ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
        [(fun para body)
         (function-value para body env)]
        [(if-0 t thn els)
         (define test-value (promise->value (interpret t env)))
         (if (and (number? test-value) (= test-value 0))
             (interpret thn env)
             (interpret els env))]))
   
    #; {Value Value -> Value}
    (define (fun-apply function-representation argument-value)
      (match function-representation
        [(function-value fpara fbody env)
         (interpret fbody (add fpara argument-value env))]))
   
    (promise->value (interpret ae0 empty)))
   
  #; {Any -> Number}
  (define (number> x)
    (if (integer? x)
        x
        (error 'interpreter "integer expected, given ~e " x)))
   
  #; {Any -> Function}
  (define (function> x)
    (if (function-value? x)
        x
        (error 'interpreter "function-value expected, given ~e " x)))
   
  ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   
  (provide interpret (rename-out [interpret value-of]))
   
   

Figure 44: A Delaying Interpreter

A delayed argument is essentially a promise that we can determine the value of the argument. Expressing this idea in an interpreter calls for a new class of values:
(struct promise [p])
; A Value is one of:
;  Int
; ...
;  (promise [-> Value]) also called Promise
In the Racket world, function calls evaluate arguments and to prevent evaluation we know to wrap expressions with (lambda () ...). This is what the revised data definition expresses. Of course, we also need a way to turn a promise into a value:
; Promise -> Value
; run the function inside a promise until a value shows up
(define (promise->value p) ...)

Figure 44 shows where to use promise to delay the evaluation of an argument and where to use promise->value to extract it. All we need now is an implementation of our wish list entry, and it turns out there are several ways to do this, so the interpreter is abstracted over the delaying tactic.

What the interpreter shows is that a promise flows thru the program until it hits a strictness point, that is, a primitive operation such as + or if-0 that truly needs to know the actual value.
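The same behavior can be observed in plain Racket by delaying an argument by hand with a thunk. This is a sketch of the idea only, not the interpreter's mechanism; bad-argument, add-one, ignored, and forced are hypothetical names.

```racket
#lang racket

;; a delayed argument: wrapping the expression in a thunk prevents evaluation
(define bad-argument (lambda () (+ (lambda (x) x) 1)))

;; constant never looks at its parameter, so the thunk is never run
(define (constant x) 42)
(define ignored (constant bad-argument))

;; a strictness point, such as an addition, must run the thunk
(define (add-one x) (+ (x) 1))
(define forced
  (with-handlers ([exn:fail:contract? (lambda (e) 'arithmetic-error)])
    (add-one bad-argument)))
```

The faulty addition inside the thunk only surfaces once a primitive operation demands the argument's value.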

Call-By-Value

Almost all modern programming languages use call-by-value, which means they immediately run the delaying function and determine the value:
(define (promise f) (f))
(define (promise->value p) p)

Call-By-Name

Algol 60 and programming languages in the Algol family used call-by-name. It literally implements the idea of replacing names with their definitions and thus evaluates an argument every single time the function body references the corresponding function parameter. Figure 45 shows what promise and promise->value need to be.

Lectures/10/name-promise.rkt

  #lang racket
   
  (provide
   (struct-out promise)
   promise->value)
   
  (struct promise [p] #:transparent)
  #; {Promise = (U (promise [-> Promise]) Value)}
  #; {Value = Number || (function-value parameter FExpr Env)}
   
  #; {(U Promise Value) -> Value}
  (define (promise->value Promise-or-value)
    (cond
      [(promise? Promise-or-value)
       (promise->value [(promise-p Promise-or-value)])]
      [else Promise-or-value]))
   

Figure 45: Name Promises

Stop! Explain promise->value using its function signature.

Downside

Stop again! Take a look at this program
(decl f (fun x (node + x x))
      (call f [print "hello world\n"]))
Okay, we don’t have print and strings in our model language but imagine for a moment that we had extended the language with such useful things. How many times would you see "hello world" printed out?
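One way to check your answer is to imitate call-by-name in Racket with an explicit thunk and to count evaluations instead of printing. In this sketch the counter evaluations and the function noisy-argument are hypothetical stand-ins for the print expression.

```racket
#lang racket

(define evaluations 0)

;; stand-in for [print "hello world\n"]: count each evaluation and return 1
(define (noisy-argument)
  (set! evaluations (add1 evaluations))
  1)

;; f receives a thunk and runs it at every reference to x, as call-by-name does
(define (f x) (+ (x) (x)))

(define result (f noisy-argument))
```

After the call, evaluations holds the number of times the argument expression ran, once per reference to x in the body of f.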

Call-by-Need aka Laziness

Evaluating a function argument every time it is referenced in the function’s body is expensive and, as the print example shows, counter-intuitive. We usually don’t want to repeat this.

Call-by-need was arguably invented three times within a year by (1) Friedman & Wise (a small interpreter for Lazy Lisp), (2) Henderson & Morris (an idea paper), and (3) Wadsworth (theory).

To address these problems to some extent, functional languages implement call-by-need, a parameter-passing mechanism that evaluates a function argument only if it is ever needed and then only once.

Lectures/10/lazy-promise.rkt

  #lang racket
   
  (provide
   (struct-out promise)
   promise->value)
   
  (struct promise [{p #:mutable}] #:transparent)
  #; {Promise = (promise {U Value [-> (U Promise Value)]})}
  #; {Value   = Number || (function-value parameter FExpr Env)}
   
  #; {(U Promise Value) -> Value}
  (define (promise->value promise-or-value)
    (cond
      [(promise? promise-or-value)
       ;; the field holds either the delayed computation or its cached value
       (define p (promise-p promise-or-value))
       (cond
         [(procedure? p) 
          ;; first use: run the thunk, force its result, and cache the value
          (define result (promise->value (p)))
          (set-promise-p! promise-or-value result)
          result]
         [else p])]
      [else promise-or-value]))
   

Figure 46: Lazy Promises

Figure 46 explains how this is implemented with a cache in the promise structure.

Stop! Why does caching “work”?

Since the idea of a functional language is that every expression always evaluates to the same value, remembering what an expression evaluated to the first time is legitimate.
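Racket ships its own caching promises in the form of delay and force, which makes the effect of the cache easy to observe. The sketch below uses a side effect purely as a measuring device; the names evaluations, before, and after are made up for the example.

```racket
#lang racket

(define evaluations 0)

;; delay builds a promise without running its body
(define p
  (delay (begin (set! evaluations (add1 evaluations))
                (* 6 7))))

(define before evaluations)       ; nothing has run yet
(define first-value  (force p))   ; runs the body
(define second-value (force p))   ; returns the cached value
(define after evaluations)        ; the body ran exactly once
```

Both calls to force produce the same value, yet the counter goes up only once. In a language with side effects this difference is observable, which is exactly why caching is only legitimate when expressions always evaluate to the same value.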

Hughes’s famous article on the advantages of functional programming makes a big deal of lazy or by-need evaluation. I think that the advertised advantages do not pay off in the long run, but you now have a pointer to an alternative view on programming with call-by-need.

Downside

Call-by-need demands a lot more from the compiler than call-by-value. Creating a function and allocating a promise costs memory. It also costs another function call when the argument is needed. Worse, every time we need the value we need some form of test to figure out whether we have the final value. (We could avoid this test with self-modifying code ... but modern machines hate, absolutely hate self-modifying code.)

The standard technique for addressing this inefficiency problem is a so-called strictness analysis, which figures out whether the argument value is needed by the function and then eliminates the promise creation.

Call-by-need demands a lot from performance evaluators, both the tools and the human beings. Nobody knows why a program is slow, whether it’s the algorithm or the language.

The standard solution is for developers to annotate functions as being strict. This forces compilers to evaluate arguments earlier than with laziness—but it also changes the behavior of the program and the laws for reasoning about its behavior.

Lazy Constructors

See Chang’s dissertation on the relationship between laziness and strictness.

What developers truly need—pardon the pun—are lazy data constructors. If a program has to create a very large data structure only to use a small portion of it, the creation should be done lazily so that the program allocates as little memory as necessary for the exploration of the large data structure.
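A lazy constructor can be sketched in a call-by-value language with a struct whose tail is a promise. The following is an illustration under that assumption, not the design from the dissertation; lazy-cons, naturals-from, lazy-ref, and the counter built are hypothetical names.

```racket
#lang racket

(define built 0)   ; counts how many cells were actually constructed

;; a lazy constructor: the first element is ready, the rest is a promise
(struct lazy-cons (first rest))

;; the conceptually infinite list n, n+1, n+2, ...
(define (naturals-from n)
  (set! built (add1 built))
  (lazy-cons n (delay (naturals-from (add1 n)))))

;; walk to the i-th element, forcing only the cells on the way
(define (lazy-ref s i)
  (if (zero? i)
      (lazy-cons-first s)
      (lazy-ref (force (lazy-cons-rest s)) (sub1 i))))

(define fifth (lazy-ref (naturals-from 0) 5))
```

Although the list has no end, reaching index 5 constructs only the six cells on the path to it.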

The Two Dimensions

Ada, like Cobol, is a DoD-commissioned language.

Along the what dimension, we have seen two major variants: pass-by-value vs pass-by-reference. The Ada programming language supported pass-by-copy-in/out.

Along the when dimension, we have seen three major variants: call-by-value, call-by-name, and call-by-need (lazy).

The two dimensions are reasonably orthogonal so you can get a language for almost any combination. The following table fills in some unusual spots.

     

              | pass-by-value | pass-by-reference | pass-by-copy
call-by-value | modern PLs    |                   |
call-by-name  |               | Algol ’60         |
call-by-need  | Haskell       |                   |