10 — What and When
Friday, 07 February 2020
Presenters (1) Daniel McGann, Nathaniel Rosenbloom (2) Josh Kazan, Dylan Robinson
What Do Function Calls Evaluate Arguments to
Lectures/10/swap-example.rkt
    #lang racket

    (require "../9/interpreter.rkt")
    ;; to run the reference interpreter and observe the swap:
    ;; (require "interpreter-by-ref.rkt")
    (require "../8/ass-as-data.rkt")
    (require "../define-names.rkt")

    ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    ;; swap test
    (define-names swap x y tmp a b)

    (define swap-example
      (decl swap [fun x
                   [fun y
                     [decl tmp x
                       [sequ [set x y]
                             [set y tmp]]]]]
        [decl a 10
          [decl b 20
            [sequ [call [call swap a] b]
                  a]]]))

    (interpret swap-example)
Take a close look at the swap program snippet in figure 41.
Stop! What do you expect the result to be?
The question really is whether somehow swap can mutate the variable a. Running the interpreter on this example answers this question in the negative. Stop! Why?
[call f a] ; where f is [fun x b], think of it as abbreviating [decl x a b]
Opening a new scope hence means allocating a new location and sticking the parameter-location combination onto the environment (i.e., scope) from the function definition site. A new location means the interpreter determines the next available location and adds the location-argument-value combination to the store. Hence, a set to a function parameter cannot affect the location of the original variable.
The technical term is pass-by-value, meaning a function call transmits values, not locations. Hence it is impossible in a pass-by-value language for an assignment to a function parameter to modify the actual argument, even if that argument was a variable. Period.
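Racket itself passes arguments by value, so the claim can be checked directly. The following is a minimal sketch, not part of the lecture's interpreters; the names try-to-reset!, a, and result are made up for illustration.

```racket
#lang racket

;; Hypothetical illustration: try-to-reset! assigns to its own parameter.
(define (try-to-reset! x)
  (set! x 0) ; changes only the fresh location allocated for x by this call
  x)

(define a 10)

;; the call transmits the value 10, not the location of a
(define result (try-to-reset! a))

;; the assignment was visible inside the function, but a is untouched
(list result a) ; '(0 10)
```

The assignment takes effect on the parameter's own location, which is why the function returns 0 while a still holds 10.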
In a full-fledged programming language, the set of values may include
arrays, lists, functions and objects. No matter what, it’s the value that
gets transmitted, not the location of these “things.” This
leads to confusion for many people, and you can find this confusion on
many on-line fora. The tricky aspect is that changing a field of a
vector/array/list/object (or a variable reachable from the body of a
function) works just fine.
    (decl cell [fun c
                 [decl content c
                   [fun get-or-set
                     (if-0 get-or-set
                           content
                           [fun new-value [set content new-value]])]]]
      ...)

    (decl swap [fun cell-1
                 [fun cell-2
                   [decl tmp [call cell-1 0]
                     [sequ [call [call cell-1 1] [call cell-2 0]]
                           [call [call cell-2 1] tmp]]]]]
      [decl a [call cell 42]
        [decl b [call cell 21]
          [sequ [call [call swap a] b]
                [call a 0]]]])
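The same idea can be sketched in plain Racket, where a box plays the role of the cell above. This is an illustration under that assumption, not the lecture's code; swap-boxes!, a, and b are hypothetical names.

```racket
#lang racket

;; Hypothetical illustration: boxes stand in for the cells of the toy language.
;; swap-boxes! receives two box VALUES and exchanges their contents.
(define (swap-boxes! b1 b2)
  (define tmp (unbox b1))
  (set-box! b1 (unbox b2))
  (set-box! b2 tmp))

(define a (box 42))
(define b (box 21))

(swap-boxes! a b)

;; a and b still refer to the same boxes; only the contents moved
(list (unbox a) (unbox b)) ; '(21 42)
```

The variables a and b are never assigned to. The function mutates the values they share with the caller, which is all a pass-by-value language needs for swap to work.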
Pass by Reference
In principle, all modern languages offer only by-value argument passing.
But, because old languages supported alternatives, old and confused
developers have continued to push certain terminologies.
The most important alternative to know about is pass-by-reference.
This argument-passing technique deals with one special case. If the argument expression is a variable, it hands the variable’s location to the corresponding parameter. Magically, swap works now.
Stop! Why?
Lectures/10/interpreter-by-ref.rkt
    #lang racket

    ;; a store-passing interpreter that explains pass-by-ref

    (provide
     #; {FExpr -> Value}
     ;; determine the value of ae via a substitution semantics
     interpret)

    ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    (require "../6/environment.rkt")
    (require "../8/ass-as-data.rkt")
    (require "../4/possible-values.rkt")
    (require "../9/store.rkt")
    (require "tag-values-with.rkt")
    (require SwDev/Debugging/spy)

    ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    #; {Value = Number || (function-value parameter FExpr Env)}

    (define UNDECLARED "undeclared variable ~e")
    (define ARITHMETIC "number expected, given ~e ")
    (define CLOSURE    "closure expected, given ~e ")

    (define (interpret ae0)

      #; {FExpr Env Store ->* Value Store}
      ;; ACCUMULATOR env tracks all declarations between ae and ae0
      (define (interpret ae env store)
        (match ae
          [(? integer?)
           (values ae store)]
          [(node o a1 a2)
           (define-values (right0 store+) (interpret a2 env store))
           (define right (number> right0))
           (define-values (left0 store++) (interpret a1 env store+))
           (define left (number> left0))
           (values (o left right) store++)]
          [(decl x a1 a2)
           (define-values (loc store+) (alloc store #f))
           (define env++ (add x loc env))
           (define-values (val store++) (interpret a1 env++ store+))
           (interpret a2 env++ (update store++ loc val))]
          [(? string?)
           (if (defined? ae env)
               (values (retrieve store (lookup ae env)) store)
               (error 'value-of UNDECLARED ae))]
          ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
          [(call ae1 ae2)
           ;; a variable argument hands over its location, anything else its value
           (define-values (right store+)
             (if (string? ae2)
                 (tag-first-value-with 'ref (lookup ae2 env) store)
                 (tag-first-value-with 'val (interpret ae2 env store))))
           (define-values (left store++) (interpret ae1 env store+))
           (fun-apply (function> left) right store++)]
          ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
          [(fun para body)
           (values (function-value para body env) store)]
          [(if-0 t thn els)
           (define-values (test-value store+) (interpret t env store))
           (if (and (number? test-value) (= test-value 0))
               (interpret thn env store+)
               (interpret els env store+))]
          [(set lhs rhs)
           (define loc (lookup lhs env))
           (define old (retrieve store loc))
           (define-values (val store+) (interpret rhs env store))
           (values old (update store+ loc val))]
          [(sequ fst rst)
           (define-values (_ store+) (interpret fst env store))
           (interpret rst env store+)]))

      ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      #; {type ArgValue [U (list 'val Value) (list 'ref Location)]}

      ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      #; {Value ArgValue Store ->* Value Store}
      (define (fun-apply function-representation argument-value store)
        (match function-representation
          [(function-value fpara fbody env)
           ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
           (define-values (loc store+)
             (match argument-value
               [(list 'ref loc) (values loc store)]
               [(list 'val v)   (alloc store v)]))
           ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
           (interpret fbody (add fpara loc env) store+)]))

      (define-values (result store) (interpret ae0 empty plain))
      result)

    #; {Any -> Number}
    (define (number> x)
      (if (number? x)
          x
          (error 'interpreter ARITHMETIC x)))

    #; {Any -> Function}
    (define (function> x)
      (if (function-value? x)
          x
          (error 'interpreter CLOSURE x)))
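The decisive step in fun-apply is the choice between reusing a location and allocating a fresh one. The sketch below isolates that choice in a tiny self-contained model. It is an illustration only: tiny-alloc and bind-parameter are hypothetical names, and the store is assumed to be an immutable hash from numbers to values rather than the representation in ../9/store.rkt.

```racket
#lang racket

;; Hypothetical model: a store is an immutable hash from locations to values.
;; tiny-alloc : Store Value ->* Location Store
(define (tiny-alloc store v)
  (define loc (hash-count store)) ; next unused location
  (values loc (hash-set store loc v)))

;; bind-parameter : ArgValue Store ->* Location Store
;; mirrors the inner match of fun-apply
(define (bind-parameter arg store)
  (match arg
    [(list 'ref loc) (values loc store)]     ; by-reference: share the caller's location
    [(list 'val v)   (tiny-alloc store v)])) ; by-value: fresh location holding a copy

;; the caller's variable a lives at a-loc and holds 10
(define-values (a-loc store0) (tiny-alloc (hash) 10))

(define-values (x-ref store-ref) (bind-parameter (list 'ref a-loc) store0))
(define-values (x-val store-val) (bind-parameter (list 'val 10) store0))

;; simulate [set x 99] in the function body under each regime
(define after-ref (hash-set store-ref x-ref 99))
(define after-val (hash-set store-val x-val 99))

;; what does the caller's a hold afterwards?
(list (hash-ref after-ref a-loc) (hash-ref after-val a-loc)) ; '(99 10)
```

Under 'ref the parameter and the caller's variable are two names for one location, so the assignment is visible to the caller. Under 'val the assignment lands in a location only the function body can reach.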
Downside
Pass-by-reference introduces direct aliasing of variables.
What people really want is aliasing of values, which achieves a similar effect via one level of indirection. Remember the law of CS that every problem can be solved with one level of indirection. Since creating indirections is a locally solvable problem, developers are mostly happy to use pass-by-value languages.
One last note: old developers often claim that languages use by-reference because of the capability to mutate fields in vectors/arrays/lists/objects. This is inaccurate and incurable terminology. But you now know better.
When Do Function Calls Evaluate Arguments
Now let’s look at an orthogonal dimension of function calling, the point when arguments should be evaluated. Figure 43 displays a simplistic program that makes the point:
Lectures/10/delay-test.rkt
    #lang racket

    (require "../9/interpreter.rkt")
    (require "../8/ass-as-data.rkt")
    (require "../define-names.rkt")
    (require (prefix-in delay: "interpreter-delay.rkt"))
    (require (prefix-in name: "name-promise.rkt"))
    (require (prefix-in lazy: "lazy-promise.rkt"))

    ;; to run the call-by-name vs the call-by-lazy interpreter
    ; (define interpret (delay:interpret lazy:promise lazy:promise->value))
    ; (define interpret (delay:interpret name:promise name:promise->value))

    ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    ;; constant test
    (define-names constant x)

    (define constant-example
      (decl constant [fun x 42]
        (call constant (node + (fun x x) 1))))

    ; should the following error or run:
    ; (interpret constant-example)

    (define-names y)

    (define constant-example2
      (decl constant [fun x x]
        (decl y (call constant [fun x (node + (fun x x) 1)])
          42)))

    ; this one will run:
    (interpret constant-example2)
Stop! What should the result be?
Should this program report an arithmetic error because addition doesn’t work on functions?
Should it output 42 because the argument to constant doesn’t matter?
Delay the Evaluation
Plotkin was the first to propose a rigorously testable idea of how the λ calculus relates to programming languages. Sadly many PL researchers still don’t quite understand his inspiring ideas.
((λ p. body) a) ⟶ body[with p replaced by a]
“to evaluate a specific function call (haha, math teachers don’t know these words) replace the function’s parameter with the function’s argument in the function’s body”
let f be (λ x. 2 * x)
f[40 + 1]
(λ x. 2 * x) [40 + 1]
let x be [40 + 1]
what is the value of 2 * x
2 * [40 + 1]
In short, we always want to replace variables with their definitions and the application of a function creates a new definition.
This way of looking at the world of computation inspired people in the late 1960s to delay the evaluation of arguments because they were considered the definitions of variables used as function parameters.
Lectures/10/interpreter-delay.rkt
    #lang racket

    ;; an interpreter that explicitly orders the evaluation of
    ;; every subexpression where the ordering wasn't specified
    ;; RIGHT to LEFT

    (require "../6/environment.rkt")
    (require "../6/rec-as-data.rkt")
    (require "../4/possible-values.rkt")

    #; {FExpr -> Value}
    ;; determine the value of ae via a substitution semantics
    (define ((interpret promise promise->value) ae0)

      #; {FExpr Env[Promise] -> Value}
      ;; ACCUMULATOR env tracks all declarations between ae and ae0
      (define (interpret ae env)
        (match ae
          [(? integer?) ae]
          [(node o a1 a2)
           ;; strictness point: arithmetic needs actual numbers
           (define right (number> (promise->value (interpret a2 env))))
           (define left  (number> (promise->value (interpret a1 env))))
           (o left right)]
          [(decl x a1 a2)
           (interpret a2 (add-rec x (λ (env) (interpret a1 env)) env))]
          [(? string?)
           (if (defined? ae env)
               (lookup ae env)
               (error 'value-of "undeclared variable ~e" ae))]
          ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
          [(call ae1 ae2)
           ;; the argument is wrapped in a promise instead of being evaluated
           (define right (promise (λ () (interpret ae2 env))))
           (define left  (function> (interpret ae1 env)))
           (fun-apply left right)]
          ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
          [(fun para body)
           (function-value para body env)]
          [(if-0 t thn els)
           ;; strictness point: the test must be a value
           (define test-value (promise->value (interpret t env)))
           (if (and (number? test-value) (= test-value 0))
               (interpret thn env)
               (interpret els env))]))

      #; {Value Value -> Value}
      (define (fun-apply function-representation argument-value)
        (match function-representation
          [(function-value fpara fbody env)
           (interpret fbody (add fpara argument-value env))]))

      (promise->value (interpret ae0 empty)))

    #; {Any -> Number}
    (define (number> x)
      (if (integer? x)
          x
          (error 'interpreter "integer expected, given ~e " x)))

    #; {Any -> Function}
    (define (function> x)
      (if (function-value? x)
          x
          (error 'interpreter "function-value expected, given ~e " x)))

    ;; - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    (provide interpret (rename-out [interpret value-of]))
    ; Promise -> Value
    ; run the function inside a promise until a value shows up
    (define (promise->value p) ...)
Figure 44 shows where to use promise to delay the evaluation of an argument and where to use promise->value to extract it. All we need now is an implementation of our wish list entry, and it turns out there are several ways to do this, so the interpreter is abstracted over the delaying tactic.
What the interpreter shows is that a promise flows thru the program until it hits a strictness point, that is, a primitive operation such as + or if-0 that truly needs to know the actual value.
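The notion of a strictness point can be tried out in plain Racket by delaying an argument by hand with a thunk. This is a minimal sketch that mimics the constant example. It does not use the interpreter above, and the names constant, add1-strict, and bad-argument are made up for illustration.

```racket
#lang racket

;; Hypothetical illustration: a thunk plays the role of a promise.

;; constant never looks at its (delayed) argument
(define (constant delayed-x) 42)

;; add1-strict contains a strictness point: + needs an actual number,
;; so the thunk must be run first
(define (add1-strict delayed-x) (+ (delayed-x) 1))

;; running this thunk would add a function and a number
(define bad-argument (λ () (+ (λ (x) x) 1)))

;; the promise is never forced, so no error shows up
(define ignored (constant bad-argument))

;; here the promise hits a strictness point and the error surfaces
(define forced
  (with-handlers ([exn:fail? (λ (e) 'arithmetic-error)])
    (add1-strict bad-argument)))

(list ignored forced) ; '(42 arithmetic-error)
```

The faulty argument is harmless as long as it merely flows through the program. The error appears only once a primitive operation demands the value.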
Call-By-Value
Call-By-Name
Algol 60 and programming languages in the Algol family used
call-by-name. It literally implements the idea of replacing names
with their definitions.
Lectures/10/name-promise.rkt
    #lang racket

    (provide (struct-out promise) promise->value)

    (struct promise [p] #:transparent)
    #; {Promise = (U (promise [-> Promise]) Value)}
    #; {Value = Number || (function-value parameter FExpr Env)}

    #; {(U Promise Value) -> Value}
    (define (promise->value Promise-or-value)
      (cond
        [(promise? Promise-or-value)
         ;; run the thunk, then keep going until a value shows up
         (promise->value [(promise-p Promise-or-value)])]
        [else Promise-or-value]))
Stop! Explain promise->value using its function signature.
Downside
Call-by-Need aka Laziness
Evaluating a function argument every time it is referenced in the function’s body is expensive and, as the print example shows, counter-intuitive. We usually don’t want to repeat this.
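A counter makes the repeated work visible. The following sketch emulates call-by-name in plain Racket with a thunk. It is an illustration, not the lecture's name-promise.rkt, and the names evaluations, argument, and double are hypothetical.

```racket
#lang racket

;; Hypothetical illustration: count how often the argument expression runs.
(define evaluations 0)

;; the delayed argument (+ 40 1), instrumented with a counter
(define argument
  (λ ()
    (set! evaluations (+ evaluations 1))
    (+ 40 1)))

;; double references its parameter twice,
;; so under call-by-name the argument runs twice
(define (double x-by-name)
  (+ (x-by-name) (x-by-name)))

(define result (double argument))

(list result evaluations) ; '(82 2)
```

Every reference to the parameter re-runs the argument expression, so any effect in the argument happens once per reference.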
Call-by-need was arguably invented three times within a year by (1) Friedman & Wise (a small interpreter for Lazy Lisp), (2) Henderson & Morris (an idea paper), and (3) Wadsworth (theory).
Lectures/10/lazy-promise.rkt
    #lang racket

    (provide (struct-out promise) promise->value)

    (struct promise [{p #:mutable}] #:transparent)
    #; {Promise = (promise {U Promise [-> Value]})}
    #; {Value = Number || (function-value parameter FExpr Env)}

    #; {(U Promise Value) -> Value}
    (define (promise->value promise-or-value)
      (cond
        [(promise? promise-or-value)
         ;; p is either the still-delayed thunk or the cached result
         (define p (promise-p promise-or-value))
         (cond
           [(procedure? p)
            ;; first use: run the thunk, cache its value, and return it
            (define result (promise->value (p)))
            (set-promise-p! promise-or-value result)
            result]
           [else p])]
        [else promise-or-value]))
Figure 46 explains how this is implemented with a cache in the promise structure.
Stop! Why does caching “work”?
Since the idea of a functional language is that every expression always evaluates to the same value, remembering what an expression evaluated to the first time is legitimate.
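Racket's own delay and force cache in the same spirit, so the effect of caching can be observed without the interpreter. This is a sketch that uses Racket's built-in promises, not the promise structure from lazy-promise.rkt; the names evaluations, argument, and double are made up for illustration.

```racket
#lang racket

;; Hypothetical illustration with Racket's built-in promises (delay / force).
(define evaluations 0)

;; the delayed argument (+ 40 1), instrumented with a counter
(define argument
  (delay
    (set! evaluations (+ evaluations 1))
    (+ 40 1)))

;; double forces its parameter twice; the second force reuses the cached value
(define (double x-by-need)
  (+ (force x-by-need) (force x-by-need)))

(define result (double argument))

(list result evaluations) ; '(82 1)
```

The counter is an effect, so it exposes that the expression ran only once. For a purely functional argument the caching would be unobservable, which is exactly why it is legitimate.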
Hughes’s famous article on the advantages of functional programming makes a big deal of lazy or by-need evaluation. I think that the advertised advantages do not pay off in the long run, but you now have a pointer to an alternative view on programming with call-by-need.
Downside
Call-by-need demands a lot more from the compiler than call-by-value. Creating a function and allocating a promise costs memory. It also costs another function call when the argument is needed. Worse, every time we need the value we need some form of test to figure out whether we have the final value. (We could avoid this test with self-modifying code ... but modern machines hate, absolutely hate self-modifying code.)
The standard technique for addressing this inefficiency problem is a so-called strictness analysis, which figures out whether the argument value is needed by the function and then eliminates the promise creation.
Call-by-need demands a lot from performance evaluators.
The standard solution is for developers to annotate functions as being
strict. This forces compilers to evaluate arguments earlier than with
laziness.
Lazy Constructors
See Chang’s dissertation on the relationship between laziness and strictness.
The Two Dimensions
Ada, like Cobol, is a DoD-commissioned language.
Along the what dimension, we have seen two major variants: pass-by-value vs pass-by-reference. The Ada programming language supported pass-by-copy-in/out.
Along the when dimension, we have seen three major variants: call-by-value, call-by-name, and call-by-need (lazy).
The two dimensions are reasonably orthogonal so you can get a language for almost any combination. The following table fills in some unusual spots.
|               | pass-by-value | pass-by-reference | pass-by-copy |
|---------------|---------------|-------------------|--------------|
| call-by-value | modern PLs    |                   |              |
| call-by-name  |               | Algol ’60         |              |
| call-by-need  | Haskell       |                   |              |