Referential Transparency
by Matthias Felleisen
Jan 21 2019
Changed in version 1.2: Thu Feb 14 12:30:56 EST 2019,
Tony Garnock-Jones
suggested some additional credits.
Changed in version 1.1: Tue Jan 22 12:30:08 CET 2019, thanks to
Tony Garnock-Jones
for digging up the link to Quine’s book and
pointing out typos
Changed in version 1.0: Sun Jan 20 20:00:11 GMT 2019, initial release
Abelson and Sussman’s highly influential Structure and Interpretation, particularly the section on Sameness and Change, has probably planted the seed for this argument in many people’s mind.
For decades now, programming-language researchers and others have used the phrases “referential transparency” or “referentially transparent” to praise certain programming languages (those that have “it”) and condemn others (those that don’t). Unfortunately, most don’t know how to define these terms properly; if they could, they might think twice about using these phrases in this way.
Substituting Equals for Equals
The most common response to “what is referential transparency” involves the sentence “it is always possible to substitute equals for equals.” The problem is that this reply is too simplistic. programming-language researchers, of all people, should know that the operation of substituting one term for another in a program requires a careful definition, and whether two terms are equal is an entirely separate question. Let’s look at both ideas carefully.
Referentially Transparent Positions
This example is inspired by Albert Meyer, Puzzles in Programming Logic, presented at the Workshop on Formal Software Development: Combining Specification Methods, Nyborg, Denmark, May, 1984.
Clearly, 2 + 2 is always equal to 4. Right?
See Willard Quine’s Word and Object. Studies in Communication (§30 page 144). Cambridge, Massachusetts: Massachusetts Institute of Technology, 1960. The bottom half of page 144 is particularly enlightening.
Quine, a philosopher, pointed out a long time ago that all languages have referentially transparent positions and opaque ones. A position is a place in a “sentence” of the language. A referentially transparent position is a place where we can substitute a reference with its meaning. Positions are best represented as a complete “essay” with a single sentence removed. Quine calls these “essays with a hole” contexts. Since the inside of a quote(d sentence) is not a complete sentence, ripping out a part of a quote does not produce a context. Filling the hole of a context with a sentence yields a complete essay; filling the hole with a different sentence may or may not yield an equivalent essay.
While Quine was thinking about natural languages, it turns out his work applies to artificial ones, too. In an programming language, contexts are programs with one “programming-language sentence” removed. Here a “programming-language sentence” means any of the following: a definition such as val x = 1, an expression such as 1 + 1, a statement such as x := x + 1 or echo(x), or similar pieces. As in natural languages, we cannot think of pieces of a (quoted) string as an expression, so the above example is ruled out. So, not all positions in programs are referentially transparent, and that should raise a little red flag in everyone’s mind.
What Are Equals
when are two programming-language sentences equal?
Obviously we know that 2 + 2 evaluates to 4. But, programming-language sentences may have free variables, that is, they contain references to other “meanings” in the context of the sentence. For example, 2 + x uses x to refer to some yet-to-be-known number and x := 4 refers to some yet-to-be-known assignable variable (location) in the context. Should substituting equals for equals rule out such programming-language sentences? Since that would be rather limiting, we need an equality relationship that accounts for programming-language sentences with references whose meanings we don’t know.
is an equivalence relation (reflexive, symmetric, transitive)
is a congruence relation (can be used at any referentially transparent position), and
can equate sentences without free variables with their meaning.
See James Morris’s Lambda-calculus models of programming languages [page 53], Cambridge, Massachusetts: Massachusetts Institute of Technology, PhhD dissertation, Sloan School of Management, 1969.
operational equivalence.
a sentence e in a programming language is operationally equal to sentence e′
(C[e],0) is in eval iff (C[e′],0) is in eval
Since operational equivalence is the only unique binary relation of the kind we want and since all other such binary relation are subset of this greatest one, it is absolutely natural that we use it to define “equals.”
And this means that all programming languages have referentially transparent positions, including those with side effects and other strange behaviors. Ouch.
Referential Transparency, Definiteness and Unfoldability
Sondergaard and SestoftActa Informatica 27(6) 1990, pages 505-–517 deserve credit for a particularly careful attempt. Starting from Quine’s work, they roughly apply “referentially transparent” to a language whose entire syntax respects Quine’s idea of referentially transparent position. This attempt falls short, however, because it fails to pursue the idea of equality to the very end. Indeed, the definition is so synthetic that it may not apply to any realistic language at all.