4 Syntax Pattern
4.1 The Basics of Syntax Pattern Matching
The macros and compile-time functions in the preceding chapter use syntax-e and primitive Racket functions to de-structure the given syntax nodes. Once they have the pieces, they compose the new syntax node using #' and friends. We know from defining run-time functions in programs that this sequence of de-structuring a given compound form of data and constructing new data naturally leads to repeated programming patterns. In the case of run-time functions, we use match and similar facilities to eliminate these patterns, greatly enhancing the readability of code.
Unsurprisingly, Racket provides a similarly powerful sub-language for defining macros and compile-time functions, namely, the language of syntax-parse. It is an embedded sub-language, defined via primitive macro facilities, a fact that we ignore here. The construct is tuned to help with syntax-processing functions.
As a matter of fact, Racket supplies several libraries for defining macros and compile-time functions, both more primitive ones than syntax-parse and derived forms.
; (define-hello-v2 name) is like define-hello
(define-syntax (define-hello-v2 stx)
  (syntax-parse stx
    ((_ x) #'(define x "world"))))
If this doesn’t work for you, you need to add (require (for-syntax syntax/parse)) to your Definitions Window.
The pattern is (_ x), matching a syntax node that contains a two-element list. The first element is the new keyword, define-hello-v2, and the pattern emphasizes this with _. The second element of the pattern is x, a pattern variable that matches any sub-tree of the given syntax node in this position.
For example, if the given syntax node contains (define-hello-v2 a), then the syntax-pattern variable x is instantiated as the syntax object containing a.
The template is #'(define x "world") (short for (syntax (define x "world"))), which uses Racket’s define and the string "world" to create a definition from the syntax-pattern variable x. Once x is instantiated due to a successful match, the expander substitutes the x in the template with its value. We say the template gets instantiated.
For example, if x stands for a, then the result of instantiating the template is (define a "world").
> (define-hello-v2 a)
> a
"world"
> (define-hello-v2 a-variable)
> a-variable
"world"
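To watch matching and template instantiation without defining a macro, we can run the same syntax-parse clause inside an ordinary run-time function and inspect the result with syntax->datum. This is only a sketch: the helper name hello-expansion is made up for illustration, and it requires syntax/parse at run time rather than for-syntax.

```racket
#lang racket
(require syntax/parse) ; run-time use of syntax-parse, so no for-syntax here

;; hello-expansion is a hypothetical helper, not part of the tutorial's code.
;; It applies the pattern and template of define-hello-v2 to a syntax object.
(define (hello-expansion stx)
  (syntax-parse stx
    [(_ x) #'(define x "world")]))

;; x is bound to the syntax object for a, then substituted into the template
(syntax->datum (hello-expansion #'(define-hello-v2 a)))
; => '(define a "world")

(syntax->datum (hello-expansion #'(define-hello-v2 a-variable)))
; => '(define a-variable "world")
```

The function returns exactly the syntax that the macro expander would put in place of the macro use.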
; like (define-hello-v2 name) but expresses errors in terms of itself
(define-syntax (define-hello-v3 stx)
  (syntax-parse stx
    ((_ (~var x id)) #'(define x "world"))))
> (define-hello-v3 1)
define-hello-v3: expected identifier
at: 1
in: (define-hello-v3 1)
> (require rackunit)
> (require syntax/macro-testing)
> (check-exn #rx"define-hello-v3: expected identifier"
             (lambda () (convert-syntax-error (define-hello-v3 1))))
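The id annotation can also be explored at run time. The following sketch assumes that a failed syntax-parse match raises an exn:fail:syntax exception; the helpers parse-name and rejection-message are hypothetical names introduced only for this illustration.

```racket
#lang racket
(require syntax/parse)

;; parse-name is a hypothetical helper: it accepts (anything identifier)
;; and returns the identifier's symbol.
(define (parse-name stx)
  (syntax-parse stx
    [(_ (~var x id)) (syntax-e #'x)]))

;; rejection-message is also hypothetical: it returns the error message
;; of a failed match, or #f if the match succeeds.
(define (rejection-message stx)
  (with-handlers ([exn:fail:syntax? exn-message])
    (parse-name stx)
    #f))

(parse-name #'(define-hello-v3 a))        ; => 'a
(rejection-message #'(define-hello-v3 a)) ; => #f
(rejection-message #'(define-hello-v3 1)) ; a string mentioning "expected identifier"
```

Because the annotation is part of the pattern, the rejection happens during matching, before any template is instantiated.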
Sample Problem Suppose we want define-hello to define more than one name to stand for "world", this most amazing string of all.
; (define-hello* name ...) defines every name ... to stand for a "world"
(define-syntax (define-hello* stx)
  (syntax-parse stx
    [(_ (~var x id) ...) #'???]))
In the context of a pattern, ... means “matches the term to my immediate left repeated 0 or more times.”
Here (~var x id) is to the immediate left, that is, a syntax-pattern variable with an annotation.
Thus the pattern matches any syntax list that starts with define-hello* followed by 0 or more identifiers. As far as the pattern is concerned, the syntax-pattern variable stands for the entire sequence of identifiers.
For example, it matches (define-hello*), (define-hello* a), and (define-hello* aa a). In these three cases, the syntax-pattern variable x stands for the empty sequence of identifiers, the sequence that contains only a, and a two-identifier sequence. Stop! Which are the two identifiers?
(define-syntax (define-hello* stx)
  (syntax-parse stx
    [(_ (~var x id) ...)
     #'(begin (define x "world") ...)]))
In the context of a template, ... means “instantiate the sub-template to my immediate left with the sequence found in the syntax-pattern variable contained in this term.”
Here (define x "world") is the template sub-term to the immediate left of .... This sub-template contains x, which indeed stands for a sequence of identifiers.
The macro expander instantiates the sub-template for every element of the sequence that the syntax-pattern variable stands for.
For example, if x stands for the two-element sequence of identifiers aa and a, then the resulting sequence of terms is (define aa "world") and (define a "world").
> (define-hello* aa a)
> aa
"world"
> a
"world"
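To see what the ellipsis template produces, we can again evaluate the clause in a run-time function and look at the resulting datum. This is a sketch with a hypothetical helper name, expand-hello*; the pattern and template are copied from define-hello*.

```racket
#lang racket
(require syntax/parse)

;; expand-hello* is a hypothetical helper that mimics one expansion of
;; define-hello* and returns the generated code as a syntax object.
(define (expand-hello* stx)
  (syntax-parse stx
    [(_ (~var x id) ...)
     #'(begin (define x "world") ...)]))

;; one (define _ "world") per identifier in the sequence bound to x
(syntax->datum (expand-hello* #'(define-hello* aa a)))
; => '(begin (define aa "world") (define a "world"))

;; the empty sequence yields no definitions at all
(syntax->datum (expand-hello* #'(define-hello*)))
; => '(begin)
```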
Exercise 3. One alternative way to create definitions for several identifiers is to use define-values:
(define-syntax (define-hello*-v2 stx)
  (syntax-parse stx
    [(_ (~var x id) ...)
     #'(define-values (x ...)
         (values (begin 'x "world") ...))]))
Explain whether and how the term (define-hello* world good bye) matches the pattern. Then step through the instantiation process for the template. Confirm your explanation in DrRacket. Rationalize why this implementation works. End
Sample Problem In principle we can use define-hello* without supplying any identifiers, like this: (define-hello*). Using the macro in this way doesn’t make much sense, though.
So the question is how we can demand that every use supplies at least one identifier. Let’s call the revised macro define-hello+.
> (define-syntax (define-hello+-v1 stx)
    (syntax-parse stx
      [(_ (~var x id) ...+)
       #'(begin (define x "world") ...)]))
> (define-hello+-v1 b)
> b
"world"
> (define-syntax (define-hello+-v2 stx)
    (syntax-parse stx
      [(_ (~var x-1 id) (~var x-2 id) ...)
       #'(begin (define x-1 "world") (define x-2 "world") ...)]))
> (define-hello+-v2 c)
> (define-hello+-v2 d e)
> (list c d e)
'("world" "world" "world")
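The two variants accept the same uses but bind their pattern variables differently. The following sketch makes the bindings visible at run time; ids-v1 and ids-v2 are hypothetical helpers that reuse the two patterns and return what the variables stand for.

```racket
#lang racket
(require syntax/parse)

;; ids-v1 (hypothetical): x stands for the whole non-empty sequence
(define (ids-v1 stx)
  (syntax-parse stx
    [(_ (~var x id) ...+)
     (syntax->datum #'(x ...))]))

;; ids-v2 (hypothetical): x-1 stands for the first identifier,
;; x-2 for the possibly empty sequence of remaining ones
(define (ids-v2 stx)
  (syntax-parse stx
    [(_ (~var x-1 id) (~var x-2 id) ...)
     (syntax->datum #'(x-1 (x-2 ...)))]))

(ids-v1 #'(define-hello+ d e)) ; => '(d e)
(ids-v2 #'(define-hello+ d e)) ; => '(d (e))
(ids-v2 #'(define-hello+ c))   ; => '(c ())
```

This difference is why the template of the second variant needs two define sub-templates while the first needs only one.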
Stop! Try to use both variants of define-hello+ without supplying an identifier. If you are using DrRacket, turn on the on-line syntax checker and watch the error messages in the status line near the bottom of the window.
4.2 More Pattern Matching, More Templating
Racket’s match supports far more than simple patterns and one-clause pattern-matching, and so does syntax-parse. The point of the next few little exercises is to start expanding your knowledge of syntax-parse’s pattern-matching and templating facilities.
Sample Problem Let us revise define-hello+ so that it allows the optional specification of string prefixes. Specifically, the revised macro should allow the optional prelude of a clause that looks like (pre s) for the literal identifier pre and any string s.
> (define-syntax (define-hello+-v3 stx)
    (syntax-parse stx
      [(_ (~var x id) ...+)
       #'(begin (define x "world") ...)]
      [(_ ((~literal pre) (~var p str)) (~var x id) ...+)
       #'(begin (define x (string-append p "world")) ...)]))
> (define-hello+-v3 (pre "hello ") c d)
> c
"hello world"
> d
"hello world"
> (define-hello+-v3 e)
> e
"world"
> (define-syntax (define-hello+-v4 stx)
    (syntax-parse stx
      [(_ (~optional ((~literal pre) (~var p str))) (~var x id) ...+)
       #'(begin (define x (string-append (~? p "") "world")) ...)]))
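The template form (~? p default) instantiates to p if the pattern variable p was bound by the match; otherwise it instantiates to default.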
> (define-hello+-v4 (pre "good ") d)
> d
"good world"
> (define-hello+-v4 e)
> e
"world"
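It helps to look at the code that the ~optional pattern and the ~? template generate in the two cases. The sketch below runs the clause of define-hello+-v4 in a run-time function; expand-v4 is a hypothetical helper, and the example assumes that pre has no binding in the enclosing module, just as in the interactions above.

```racket
#lang racket
(require syntax/parse)

;; expand-v4 is a hypothetical helper that returns the code that
;; define-hello+-v4 would generate for the given use.
(define (expand-v4 stx)
  (syntax-parse stx
    [(_ (~optional ((~literal pre) (~var p str))) (~var x id) ...+)
     #'(begin (define x (string-append (~? p "") "world")) ...)]))

;; the optional clause is present: p is bound, so (~? p "") becomes p
(syntax->datum (expand-v4 #'(define-hello+-v4 (pre "good ") d)))
; => '(begin (define d (string-append "good " "world")))

;; the optional clause is absent: p is unbound, so (~? p "") becomes ""
(syntax->datum (expand-v4 #'(define-hello+-v4 e)))
; => '(begin (define e (string-append "" "world")))
```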
4.3 Yet More Pattern Matching, Yet More Templating
There is still much more to learn about macros and syntax-parse. Macros can be recursive, and syntax-parse may compute the resulting code in a procedural manner, not just by instantiating a template. We’ll introduce these ideas by revising our sample macro once again.
Sample Problem Revise the define-hello macro so that it permits the optional postfixing of each string. That is, every individual identifier will still be defined to stand for "world", but an identifier paired with a string in parentheses stands for "world" postfixed with this string. The revised macro, called define-hello-post, allows empty sequences of sub-terms and does not accommodate optional prefixes.
This time we show three solutions. The goal is to bring across different techniques. The first and older one solves the problem using a recursive macro. The second one uses plain compile-time syntax processing to generate some of the pieces of the desired result, mixing the procedural techniques of the preceding chapter and the templating style of this one. The last one simplifies the second one by using additional features of syntax-parse, and it is the most direct approach.
(define-hello-post) ; is equivalent to (begin)
(define-hello-post a b) ; defines a and b to stand for "world"
(define-hello-post (c " bye")) ; defines c as "world bye"
(define-hello-post g [f ", hello"] h [i "---done"])
; defines g and h to stand for "world",
; f as "world, hello", and i as "world---done"
If define-hello-post were a function and its sub-terms were a list, we would write a recursive function that iterates through the terms until the list is exhausted. Depending on the shape of the first term in the list, the function would compute a different result.
> (define-syntax (define-hello-post stx)
    (syntax-parse stx
      [(_ ((~var x id) (~var p str)) others ...)
       #'(begin (define x (string-append "world" p))
                (define-hello-post others ...))]
      [(_ (~var x id) others ...)
       #'(begin (define x "world")
                (define-hello-post others ...))]
      [(_) #'(begin)]))
(define-hello-post (x "done"))
(define-hello-post (x "done") y)
(define-hello-post (x "done") y (z "wow") w u v)
Now that we understand the patterns, we can turn to the templates. When the leading term consists of a pair of an identifier and a string p, the macro must generate a definition for this variable that initializes it to (string-append "world" p). Similarly, when the leading term is just an identifier, the macro defines it to stand for "world". In both cases, the pattern also mentions others, the sequence of remaining sub-terms. To deal with these sub-terms, the macro generates another instance of itself: (define-hello-post others ...).
Stop! Why can a macro produce code that uses itself?
Now that you know this much, you also understand that a Racket programmer has the power to make the compiler diverge. Write a short macro that makes the Racket compiler run forever.
> (define-hello-post g [f ", hello"] h)
> (list g f h)
'("world" "world, hello" "world")
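A single step of the recursive macro can be simulated with a run-time function that uses the same three clauses. This is a sketch: hello-post-step is a hypothetical helper, and it performs only one rewriting step, whereas the real expander keeps going until no uses of define-hello-post remain.

```racket
#lang racket
(require syntax/parse)

;; hello-post-step is a hypothetical helper that performs one expansion
;; step of define-hello-post and returns the resulting syntax.
(define (hello-post-step stx)
  (syntax-parse stx
    [(_ ((~var x id) (~var p str)) others ...)
     #'(begin (define x (string-append "world" p))
              (define-hello-post others ...))]
    [(_ (~var x id) others ...)
     #'(begin (define x "world")
              (define-hello-post others ...))]
    [(_) #'(begin)]))

;; the leading identifier is consumed; the rest is handed to a new macro use
(syntax->datum (hello-post-step #'(define-hello-post g [f ", hello"] h)))
; => '(begin (define g "world") (define-hello-post (f ", hello") h))

;; the leading identifier-string pair is consumed
(syntax->datum (hello-post-step #'(define-hello-post [f ", hello"] h)))
; => '(begin (define f (string-append "world" ", hello")) (define-hello-post h))

;; the base case ends the recursion
(syntax->datum (hello-post-step #'(define-hello-post)))
; => '(begin)
```

Each step shortens the sequence of sub-terms by one, which is why the expansion of this particular macro terminates.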
For the second solution, we start with a brief detour to drive home the nature of syntax-parse. Thus far, we have acted as if syntax-parse had to have a syntax template on the right-hand side of its clauses. But this isn’t the case; syntax-parse is like any other conditional, meaning we can place any expression there. For a macro, the result of this expression must be a syntax object. When syntax-parse is used somewhere else, say in for/list, a syntax-parse conditional may return anything.
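Here is a small sketch of syntax-parse as a plain conditional whose clauses return symbols instead of syntax objects. The function classify and the symbols it returns are made up for illustration.

```racket
#lang racket
(require syntax/parse)

;; classify is a hypothetical helper: its right-hand sides are ordinary
;; expressions that return symbols, not syntax templates.
(define (classify stx)
  (syntax-parse stx
    [(~var x id) 'identifier]
    [((~var x id) (~var p str)) 'identifier+string]
    [_ 'other]))

;; syntax->list turns the syntax list into a list of syntax objects
(for/list ([term (syntax->list #'(a (b "c") 5))])
  (classify term))
; => '(identifier identifier+string other)
```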
; first and second are used at compile time; in #lang racket this
; needs (require (for-syntax racket/list)) in addition to syntax/parse
(define-syntax (define-hello-post-v1 stx)
  (syntax-parse stx
    [(_ x-or-x+post ...)
     (define xs+ps
       (for/list ((one (syntax-e #'(x-or-x+post ...))))
         (syntax-parse one
           [(~var x id) (list #'x #'"")]
           [((~var x id) (~var p str)) (list #'x #'p)])))
     #`(begin
         #,@(for/list ((y+q xs+ps))
              (define y (first y+q))
              (define q (second y+q))
              #`(define #,y (string-append "world" #,q))))]))
The local definition names a list of pairs that combine an identifier from the macro’s sub-terms plus a string to be appended to "world". It computes this list by iterating over the list of the macro’s sub-terms. Note how this for/list loop uses syntax-parse to analyze the term. If it is just an identifier, it forms a list of the identifier and (the syntax of) the empty string; otherwise it forms a pair of the given identifier and string.
The syntax template generates the desired list of definitions from the computed list of pairs and splices it into a begin expression. This inner loop effectively simulates an ellipsis in a template. It is needed because the list of pairs isn’t one formed from an ellipsis pattern. The body of the loop takes apart the pair, naming the identifier y and its post string q. Each iteration of the loop produces a single definition with another quasisyntax template.
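The splicing step can be tried in isolation. The sketch below uses quasisyntax (#`), unsyntax (#,), and unsyntax-splicing (#,@) at run time; the lists defs and names are made-up data for illustration.

```racket
#lang racket

;; a hypothetical list of already-built definitions
(define defs (list #'(define a 1) #'(define b 2)))

;; #,@ splices the elements of the list into the surrounding template
(syntax->datum #`(begin #,@defs))
; => '(begin (define a 1) (define b 2))

;; a hypothetical list of identifiers; the loop builds one definition
;; per identifier, and #,@ splices the results into begin
(define names (list #'a #'b))
(syntax->datum
 #`(begin #,@(for/list ([n names])
               #`(define #,n "world"))))
; => '(begin (define a "world") (define b "world"))
```

The second expression has the same shape as the template in define-hello-post-v1, only with simpler data.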
Stop! This macro definition is by far the most complicated one you have seen so far. Make sure to experiment with it. Use it in all the ways that define-hello-post is intended to be used.
> (define-for-syntax (fill-in-option x-or-x+post)
    (syntax-parse x-or-x+post
      [(~var x id) (list #'x #'"")]
      [((~var x id) (~var p str)) (list #'x #'p)]))
The second step is about turning this list of pairs into pattern variables as if they had been a part of an ellipsis pattern. To this end, we need to introduce another one of syntax-parse’s facilities, #:with. Roughly speaking, #:with matches a syntax pattern with any value. If the match succeeds, it introduces new syntax-pattern variables that may be used in a syntax template; otherwise the syntax-parse clause acts as if its pattern had failed.
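A small run-time sketch shows #:with matching a pattern against a computed value that is not a syntax object to begin with. The function number-names is hypothetical; it pairs each identifier with its position, relying on #:with to convert the list of numbers into syntax before matching.

```racket
#lang racket
(require syntax/parse)

;; number-names is a hypothetical helper. The #:with clause matches the
;; pattern (n ...) against a plain list of numbers, which introduces n as
;; a pattern variable of ellipsis depth 1, just like name.
(define (number-names stx)
  (syntax-parse stx
    [(_ (~var name id) ...)
     #:with (n ...) (range (length (syntax->list #'(name ...))))
     #'((name n) ...)]))

(syntax->datum (number-names #'(number-names a b c)))
; => '((a 0) (b 1) (c 2))
```

Because name and n stand for sequences of the same length, a single ellipsis in the template can iterate over both at once.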
> (define-syntax (define-hello-post-v2 stx)
    (syntax-parse stx
      [(_ x-or-x+post ...)
       #:with ((x p) ...) (map fill-in-option (syntax-e #'(x-or-x+post ...)))
       #'(begin (define x (string-append "world" p)) ...)]))
The rest is straightforward. The template is an ellipsis sequence of defines within begin. An ellipsis in a template expects a syntax-pattern variable in the template to its immediate left so that the expander can instantiate it into a sequence of syntax objects. In this example, there are two: x and p, and they represent sequences of the same length. The result then is the expected sequence of definitions.
> (define-hello-post-v2 i [j ", hello"] k)
> (list i j k)
'("world" "world, hello" "world")
Exercise 4. The three solutions to our current sample problem generate different sequences of definitions. Show the sequences of definitions that
(define-hello-post one [two ", 2"]),
(define-hello-post-v1 one [two ", 2"]), and
(define-hello-post-v2 one [two ", 2"])
generate. If string-append were an expensive operation, the first solution might look more efficient to you now. Can you modify one of the other two solutions so that it generates the same sequence as the first one? End
One Last Thought
(define-hello-post-v2 m m)
identifier already defined
at: m
in: (define-values (m) (string-append "world" ""))
You might think that you have seen enough variations on the define-hello theme by now, and you are ready for something, anything interesting. Let’s go there but keep this problem in mind because we will have to return to this idea one more time to get things right. In the meantime, let’s get started on “good stuff.”