7.7.0.3

11 — CESK

Due Tuesday, 07 April, 6am

Exemption This homework is not subject to the “drop a homework grade” rule. It will count regardless of how it affects your grade.

The purpose of this homework is (1) to practice what you have learned over the course of this semester on a lightweight scale (steps 0 through 3 and 5) and (2) to develop a CESK machine for the kernel of a traditional language (step 4).

The first objectives gives you a chance to demonstrate that you understood the concepts and learned from your mistake(s). I will allocate a correspondingly large number of points to The code inspection part so that you can make up for losses from the past.

The second objective is to understand that CESK machines exist for many different kinds of languages. Also, the design and implementation of the CESK machine is the first part of a two-part problem. For the second part, you will write a new store implementation and treat your CESK machine as a client.

Delivery Deliver your solutions in a folder called 11 in your github repo:

        

        11 /

           xrun

           Other/

           ITests/

             [0-9]-in.json

             [0-9]-out.json

A name ending in / is a folder, and indentation means “contained in the closest folder listed above the current name.”

While

The While programming language represents the traditionalist view of programming, which many of you get to know in high school, the last few weeks of Fundamentals II, or your first co-op.

Syntax The While programming language is a statement-oriented language: Array declarations are now like those in Java.

syntax

  

informal meaning

A Stmt is one of:

  

 - [LHS, "=", Expression]

  

 % assignment

 - ["if0", Expression, Stmt, Stmt]

  

 % if

 - ["do0", Expression, Stmt]

  

 % loop while not 0

 - [VarDecl, ..., VarDecl,

  

 % declaration block

    "in"

  

 % set up local variables

     Stmt, ..., Stmt,

  

 % execute statements in order

     Expression]

  

 % its value is the result

  

An Expression is one of:

  

 - Int

  

 % literal constant

 - Var

  

 % the value of a variable

 - [Expression, "+", Expression]

  

 % addition

 - [Expression, "*", Expression]

  

 % multiplication

 - [Expression, Expression]

  

 % the value of an array index

  

A VarDecl has the shape:

  

 - ["let", Var, "=", Expression]

  

 % declare and initialize variable

 - ["vec", Var, "=", [Expression,

  

 % declare array and

                     .., Expression]]

  

 % initial field values

  

A LHS is one of:

  

 % LHS stands for lefthand-side

 - Var

  

 % the location of a variable

 - [Expression, Expression]

  

 % the location of an array index

A While loop program is a block statement.

As always, keywords mentioned in the grammar are not legitimate identifiers: "=", "if0", "do0", "in", "+", "*", "let", "vec" .

Scope A block declares variables one at a time, meaning for example the first declared variable is in scope for the right-hand sides of the second, third, and so forth declaration. All variables are in scope in all Stmts and the Expression part of the block Stmt.

Semantics: Statements The statements are interpreted in the context of an environment and a store as follows:
  • a = evaluates the left-hand side to obtain a location L, the right-hand side to get a value V, and then places V into L.

  • an if0 evaluates the test expression. If the result is a 0, the execution continues with the first sub-Stmt; otherwise the second sub-Stmt is the next control code.

  • an do0 evaluates the test expression. If the result is 0, the loop terminates; otherwise, the program evaluation continues with the sub-Stmt followed by another “copy” of the do0 loop.

  • a block evaluates one variable declaration at a time. A let declaration introduces a plain variable and associates it with the value of the right-hand side. An vec declaration creates an array from the sequence of values on the right-hand side.

    Arrays are values like integers. In particular, an array can be stored in an array giving rise to nested arrays. While they are first-class values just like integers, they cannot be created during the evaluation of an expression. (This restriction isn’t stringent for the While programmer but simplifies your programming tasks.)

    Once all declarations are evaluated, the execution of a block runs through the sequence of sub-Stmts.

    Finally, the result Expression is evaluated. Its value is the result of a tblock. Other than the result of an entire program, these values are ignored.

In the While language, both Expressions and LHSs evaluate without effects and without allocating memory. We thus speak of “atomic” evaluations.

Semantics: Expressions Literal constants, additions, and multiplications have the usual interpretation. As always, the values of variables. The array access form evaluates the left-hand sub-Expression to an array and the right one to an index. Remember that arrays may nest, which is why the array access form consists of two expressions.

Semantics: LHS A variable on the left-hand side of an assignment statement is an ordinary assignment. An array access on the left hand side delivers the store location where the value of the right-hand side of an assignment statement is placed.

Semantics: Store A store consists of locations, each of which can deal with an integer or a tagged integer. In contrast to 24 — Memory, Safety, tagged integers server two purposes here: the representation of locations and the representation of arrays. The former is like cell in the referenced lecture, the latter records how many fields an array has (and that cannot change).

Programming Task

Step 0 Design parser, which turns the internal JSON representation into an AST.

Step 1 Design static_checker, which ensures that all referenced variables are declared.

Step 2 and 3 Design expression_interpreter and lhs_interpreter. In addition to their primary AST argument, both consume two additional arguments: an environment and a store. As their names suggest, they produce values and locations, respectively.

To yield error messages in a deterministic manner, the interpreters evaluate sub-expressions from left to right.

Step 4 Design and implement a CESK machine for the Stmt part of the While language.

Transitions that involve more than one expression evaluate those expression from left to right.

Testing Task Write the test harness xrun for your program. This is step 5. This test harness composes the parser, the static checker, and the CESK machine in the expected manner.

Create ten tests for xrun. Place them in ITests/. A test consists of three two files: N-in.json, N-cps.json, and N-out.json for N in [0 .. 9]:
  • An input file in ITests contains a single While block.

  • An output file in ITests contains a single JSON value, the result of the final expression in the "block" program. If the result is a number, the value is a JSON number. If the result is an array, the value is the content of the array, with nested array values spliced in. During unloading, your machine driver may revisit a location, which means the array is part of a cycle; it should splice in "cycle" in this case.

    The driver test harness may error at three stages: parsing, static checking, and interpretation of Expressions and LHSs. It is impossible for the machine itself to get stuck. The precise messages are: "parser error", "var undeclared", "number expected", "indexing error".

Added After Lecture 24 During development, you may wish to deliver more specific error messages than the above listed:
  • If your parser errors, you may wish to say which piece caused the error.—If by any chance my parser accepts ill-formed While programs, the parser error message will help you get “Matthias credit.”

  • If your interpreter signals an error concerning arrays, there are three different problems that may trigger an “indexing” error. You will want to know during development, which one your interpreter found.

Different languages provide different mechanisms for dealing with such “during development” and “after deployment” changes in behavior; if your language lacks such mechanisms, consider a global flag but don’t forget to set it to the proper value before delivering your final solution.

Recall JSON: Simplicity and Complexity.