Church Rosser
Jun 2 2024
Changed in version 1.0: Sun Jun 2 14:04:32 EDT 2024, initial release
Over the past couple of months, several people have brought up the Church-Rosser property in conversation, and I was somewhat amazed at the uninformed opinions that these people reported.
As someone who has worked out his fair share of Church-Rosser theorems for several variants of λ-calculus-based languages, I figured I would write down what I have learned in this regard over the course of 40 years.
Church and Rosser Prove that their Calculus is Sound
Neither Church nor Rosser designed a programming language. The λ-calculus per se is a programming language only as much as Turing machines are. So when someone says “people in 1936 knew how to deal with property X in programming languages via the Church-Rosser theorem” then this statement is (1) anachronistic and (2) wrong as history showed 40 years later.
What Church and Rosser actually proved concerns the λ-calculus as a proof system for equations. A proof system that validates only true statements is sound; at a minimum, deriving an absurdity such as ‘λ ⊢ true = false’ must be impossible. The Church-Rosser theorem establishes precisely this consistency property for their calculus.
Church and his colleagues had come up with several such systems, which Barendregt (North Holland, 1980) labels λI, λK, λβ, λβη, and so on. Moreover, orienting the equational axioms of such a system into notions of reduction generates the same set of equations as the plain proof system that starts from the equational axioms.
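To make this concrete, here is the central β axiom, first as an equation and then oriented left to right into a notion of reduction (the example is mine; e[x := e′] denotes capture-avoiding substitution):

    (λx.e) e′ = e[x := e′]      β as an equational axiom
    (λx.e) e′ ⟶ e[x := e′]      β as a notion of reduction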
From λ-Calculus to Programming Languages
In the late 1950s and 1960s, our colleagues started to realize that the λ-calculus is an interesting basis for discussing aspects of programming languages. One aspect was scope: when is a variable occurrence x bound by a λx. in its context?
And finally, Böhm, McCarthy, Landin, Reynolds, Morris, and others decided to design programming languages directly based on the λ-calculus. They added literal constants and libraries of functions, plus various conditionals and other forms. They were clever in explaining the meaning of their programs: they used evaluation functions (interpreters), often based on abstract machines; Landin’s SECD machine is a well-known example.
Notably they did not use Church’s axioms to specify the semantics of their designs.
Call-by-name, Call-by-value and the λ-calculus
This last step raised an obvious question: how do these new languages relate to Church’s calculus?
The question had become interesting because some language designs used Algol’60’s call-by-name regime, evaluating the argument expression anew for every occurrence of the parameter. Others used call-by-value, meaning the argument expression was evaluated once and for all to a value before the actual function call took place; Algol’60 itself implements call-by-value as a compiler transformation, adding an assignment statement to the parameter at the entry of the procedure.
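A one-line example (mine, but folklore) separates the two regimes. Write Ω for a diverging expression, say (λx.x x) (λx.x x):

    (λy.5) Ω ⟶ 5          under call-by-name
    (λy.5) Ω diverges      under call-by-value

Call-by-value insists on reducing Ω to a value before the call takes place, and Ω never becomes one; call-by-name never touches the argument, because y does not occur in the body.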
Plotkin’s seminal 1975 TCS paper, from which this section borrows its title, settles this question. Its results can be summarized as follows:

it is possible to define the meaning of a language via a set of Church-style axioms;
but these notions of reduction are not the semantics;
the semantics is a function from programs to values, meaning it is always necessary to identify the subset of values in a language’s set of expressions;
different systems of axioms define different evaluation functions;
one system of axioms (λn), similar to Church’s, defines what language researchers label a language with a call-by-name convention;
a different set of axioms (λv) defines what language researchers label a language with a call-by-value convention;
the functions defined via sets of axioms (notions of reduction) are equal to the functions defined by abstract machines that implement the various call-by regimes.
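For the record, the two calculi differ in a single axiom schema. Plotkin’s λv restricts Church’s β so that it fires only when the argument is a value v, that is, a variable, a constant, or a λ-abstraction:

    (λx.e) e′ ⟶ e[x := e′]      β   (the λn calculus)
    (λx.e) v  ⟶ e[x := v]       βv  (the λv calculus)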
What is the Role of Church-Rosser in this Context?
A semantics generated from notions of reduction is a relation. To prove that it is a function, that is, that it assigns at most one meaning to any program, Plotkin relies on several lemmas, one of which is a variant of Church and Rosser’s lemma. To extract an efficient evaluation function, he additionally proves that both calculi satisfy the Curry-Feys Standardization property. That’s it.
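For completeness, that lemma has the following shape, stated for a reduction relation ⟶ and its reflexive-transitive closure ⟶*:

    if e ⟶* e₁ and e ⟶* e₂, then there exists an e₃ with e₁ ⟶* e₃ and e₂ ⟶* e₃.

Roughly put, this confluence property guarantees that reduction assigns at most one answer to any program (up to provable equality), which is exactly what the semantics-is-a-function claim demands.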
What is The Truth?
From Church and Rosser’s work, we know that λ ⊢ e = e′ is a proof that e = e′ is a true statement. But what is The Truth in the context of programming languages?
Following Morris’s dissertation (MIT 1968), Plotkin goes on to define observational equivalence for each semantics, a congruent equivalence relation on expressions. Two expressions e and e′ are observationally equal, written e ≈ e′, if plugging e and e′ into any arbitrary program context C yields two programs C[e] and C[e′] that produce the same observable outcome.
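Spelled out (the phrasing is mine, but this is the standard Morris-style definition, with eval denoting the semantics of the preceding sections):

    e ≈ e′  iff  for every program context C, eval(C[e]) and eval(C[e′]) either are both undefined or both produce the same observable result.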
Observational equivalence is the largest such relation, and as such it is recognized as The Truth. The Truth for any programming language is vastly larger than the set of equations of an axiomatic syntactic calculus. Indeed, The Truth for such λ-ish languages cannot be axiomatized. Period.
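For a taste of why, consider two diverging expressions, say Ω = (λx.x x) (λx.x x) and Ω′ = (λx.x x x) (λx.x x x). No program context can tell them apart, so Ω ≈ Ω′; yet λ ⊬ Ω = Ω′. Ironically, the standard way to prove this underivability is to apply the Church-Rosser theorem itself: the two expressions have no common reduct, so the calculus cannot equate them.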
Don’t We Want Equations for Optimizing Compilers?
The Truth (e ≈ e′) is so much larger than this one particular proof system (λ ⊢ e = e′), and we want powerful optimizing passes, so settling for the equations of the calculus would sell a compiler short. Instead, we want as many true equations e ≈ e′ as possible turned into transformations e ⟶ e′, whenever they can somehow improve the performance of a program (without changing its outcome). So what we really want are proof techniques for establishing The Truth between two expressions, one improving on the other.
Church-Rosser and notions of reductions are too simplistic for this job.
Use denotational semantics if you want lots of equations. Use logical relations to validate your favorite transformations. Use bisimulations. Use any proof technique that works.
Church-Rosser for axioms of notions of reduction is way too weak.
Don’t We then Get Nondeterministic Compilers?
Yes.
Depending on when a transformation kicks in, we get different object code. But because they are validated elements of The Truth, running any of these outcomes produces the same value.
And that’s true in reality. I conjecture that all large compilers are nondeterministic translators.
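Here is a miniature illustration in OCaml, with made-up names and a toy arithmetic language rather than anything from the papers above. Both passes below orient equations that belong to The Truth of this toy language; because they pick different orientations for e * 2, they emit different object code that nevertheless evaluates to the same number:

    (* a toy "object language": arithmetic expressions *)
    type exp =
      | Num of int
      | Add of exp * exp
      | Mul of exp * exp
      | Shl of exp * int            (* Shl (e, n) means e * 2^n *)

    let rec eval = function
      | Num n -> n
      | Add (a, b) -> eval a + eval b
      | Mul (a, b) -> eval a * eval b
      | Shl (a, n) -> eval a lsl n

    (* the true equation  e * 2 ≈ e + e ≈ e << 1,  oriented two ways *)
    let rec pass_a = function       (* chooses the shift *)
      | Mul (e, Num 2) -> Shl (pass_a e, 1)
      | Num n -> Num n
      | Add (a, b) -> Add (pass_a a, pass_a b)
      | Mul (a, b) -> Mul (pass_a a, pass_a b)
      | Shl (a, n) -> Shl (pass_a a, n)

    let rec pass_b = function       (* chooses the addition *)
      | Mul (e, Num 2) -> let e' = pass_b e in Add (e', e')
      | Num n -> Num n
      | Add (a, b) -> Add (pass_b a, pass_b b)
      | Mul (a, b) -> Mul (pass_b a, pass_b b)
      | Shl (a, n) -> Shl (pass_b a, n)

    (* different object code, same value *)
    let prog = Mul (Add (Num 3, Num 4), Num 2)
    let () = assert (eval (pass_a prog) = eval (pass_b prog))

Here pass_a produces Shl (Add (Num 3, Num 4), 1) while pass_b produces the doubled addition; both evaluate to 14.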
Is it a Good Idea to Define a Semantics via Church-style Axioms?
No.
Plotkin’s work shows that we want an evaluation function first. Define it via an abstract machine. Use relations on the surface syntax. Write an interpreter. But define the function first.
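To show how little is needed, here is a minimal sketch in OCaml of a CEK-style machine for a call-by-value core language; the code and names are mine, a toy rather than any machine from the literature. The point is that eval is defined as a function from programs to values before any calculus enters the picture:

    (* terms of the core language *)
    type term =
      | Var of string
      | Lam of string * term
      | App of term * term

    (* machine values are closures; environments map variables to values *)
    type value = Closure of string * term * env
    and env = (string * value) list

    (* continuations record the work remaining around the current term *)
    type kont =
      | Done
      | Arg of term * env * kont    (* next: evaluate the argument *)
      | Fun of value * kont         (* next: apply this function value *)

    (* the machine: a pair of tail-recursive transition functions *)
    let rec step t e k =
      match t with
      | Var x -> continue (List.assoc x e) k  (* unbound variables raise Not_found *)
      | Lam (x, b) -> continue (Closure (x, b, e)) k
      | App (f, a) -> step f e (Arg (a, e, k))

    and continue v k =
      match k with
      | Done -> v
      | Arg (a, e, k') -> step a e (Fun (v, k'))
      | Fun (Closure (x, b, e), k') -> step b ((x, v) :: e) k'

    (* the semantics, defined as a *function* from programs to values *)
    let eval (t : term) : value = step t [] Done

    (* example: ((λx.λy.x) id) id yields the identity closure *)
    let id = Lam ("z", Var "z")
    let _ = eval (App (App (Lam ("x", Lam ("y", Var "x")), id), id))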
If, for some reason, you want a simple calculus such as λ, develop it after you have worked out the semantic function. Then use Church-Rosser to prove that the function generated from the reduction axioms is consistent with the original one. I have followed this recipe myself and worked out calculi:
1. for call/cc-style continuations (LICS ’86, with Kohlbecker, Duba, and Friedman);
2. a simpler calculus than the one in item 1 (TCS ’92, with Hieb);
3. for delimited continuations (POPL ’88);
4. for assignable variables in a λ-calculus-based language (POPL ’87, with Friedman);
5. a simpler calculus than the one in item 4 (TCS ’92, with Hieb);
6. for a language with exceptions (Info & Comp ’94, with Wright); and
7. for call-by-need (JFP ’97, with Ariola; ESOP ’12, with Chang).
I also had one for futures (POPL ’95, with Flanagan; but it’s not in the paper).
And What About Nondeterminism?
A nondeterministic language may evaluate the same program (and input) to distinct values. The semantics is not a function but a relation.
Designing a Church-Rosser calculus for such a language should be possible, using reduction on sets of expressions instead of on expressions directly. But this is just a boring exercise in proving some people wrong about the relationship between Church-Rosser and programming-language research. I don’t have time for this, and there aren’t enough bits in this laptop to write the proof down in the margin.