[racket] Basic Questions Regarding Macros
Todd Bittner wrote:
> In regards to syntax-rules and syntax-id-rules there is
> 'literal-id' parameter that I don't understand.
There was a thread on this subject recently:
<http://www.mail-archive.com/users@racket-lang.org/msg07145.html>
in which I tried to give an explanation with examples:
<http://www.mail-archive.com/users@racket-lang.org/msg07147.html>
and then I turned it into an annotation to the R6RS standard
(scroll down to the section entitled "How to use literal
arguments"):
<http://marcomaggi.github.com/docs/nausicaa.html/baselib-transformers.html>
> Finally, reading through the reference, #' serves, I believe,
> as shorthand for (syntax),
From now on I will use a Scheme language compliant with the R6RS
standard. Yes, the following two forms are equivalent:
(syntax (a b c))
#'(a b c)
in the same way as the following two forms are equivalent:
(quote (a b c))
'(a b c)
Notice that it is the source code *reader* (the lexer and
parser) which builds the symbolic expression:
(quote (a b c))
from the sequence of characters:
'(a b c)
and this happens before any library is loaded, so the source code
expander only sees the symbolic expression with the QUOTE symbol
in it. While the following program works because the library
(rnrs) exports the QUOTE identifier:
#!r6rs
(import (rnrs))
(write '(a b c))
the following program will fail:
#!r6rs
(import (except (rnrs) quote))
(write '(a b c))
in exactly the same way as the following program will fail:
#!r6rs
(import (except (rnrs) quote))
(write (quote (a b c)))
because we have explicitly excluded QUOTE from the import set,
showing:
$ racket proof.sps
proof.sps:8:0: compile: unbound identifier in module in: quote
The same happens with the SYNTAX identifier:
#!r6rs
(import (except (rnrs) syntax))
#'(a b c)
showing:
$ racket proof.sps
proof.sps:6:0: compile: unbound identifier in module in: syntax
Summary: once our eyes have adapted (and it may take a while) to
reading '--- and #'--- as abbreviations for (quote ---) and
(syntax ---), they are convenient to use; but we have to remember
to import libraries exporting the QUOTE and SYNTAX identifiers.
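A quick way to see that it is the reader which expands the abbreviations is to quote them, so that the expander never interprets them. This is only a small sketch; the names QUOTE-FORM and SYNTAX-FORM are mine, and it relies only on the R6RS lexical syntax:

```scheme
#!r6rs
(import (rnrs))

;; The reader turns 'a into (quote a) and #'a into (syntax a)
;; before the expander runs; quoting the result lets us look
;; at the symbolic expression it built.
(define quote-form  ''a)    ; read as (quote (quote a))
(define syntax-form '#'a)   ; read as (quote (syntax a))

(write (equal? quote-form  '(quote a)))   ; prints #t
(newline)
(write (equal? syntax-form '(syntax a)))  ; prints #t
(newline)
```

Both comparisons hold because the abbreviated and the spelled-out forms are the same list once read.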
> but in what practical situations would I then call it?
The SYNTAX identifier is bound to a special, very low level,
macro integrated in the source code expander (which is the
"preprocessor").
It is impossible to fully understand its implementation using
the mental model of a Scheme program being executed at run-time.
In particular: such a syntax *cannot* be implemented using
DEFINE-SYNTAX and it does *not* expand to code using, at
run-time, some technique involving the dynamic environment and
the DYNAMIC-WIND function. Rather, when the expander walking the
code finds a form like:
(syntax . stuff)
it invokes an internal function having access to its internal
data structures (or something to that effect).
The gist of it is: we must use the SYNTAX macro every time we
hand-write the transformer function for a macro whose use
expands into a form containing identifiers. Example: we do not
need SYNTAX if the macro use expands into a datum, like the
string "ciao":
#!r6rs
(import (rnrs))
(define-syntax ciao
  (lambda (stx)
    "ciao"))
(write (ciao))
(newline)
but we need SYNTAX if we want to use the binding of the
identifier NEWLINE in the output form:
#!r6rs
(import (rnrs))
(define-syntax return
  (lambda (stx)
    (syntax (newline))))
(display "first line")
(return)
(display "second line\n")
Notice that we can use the SYNTAX macro to return a datum, too:
#!r6rs
(import (rnrs))
(define-syntax ciao
  (lambda (stx)
    (syntax "ciao")))
(write (ciao))
(newline)
we can think of:
(syntax "ciao")
as equivalent to "ciao" by itself.
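To check this claim at run-time we can, I think, strip the result of SYNTAX with SYNTAX->DATUM (also exported by (rnrs)) and compare it with the plain string. A minimal sketch, in which the name DATUM is mine:

```scheme
#!r6rs
(import (rnrs))

;; SYNTAX->DATUM removes whatever wrapping the expander added
;; and gives back the plain datum.
(define datum (syntax->datum (syntax "ciao")))

(write (equal? datum "ciao"))   ; prints #t
(newline)
```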
In what follows, I will try to describe the basic mechanics of
SYNTAX omitting a lot about what the expander needs to do its
thing (which is: to conduct the "expansion process"). There is
so much I have to omit that I hope the result still makes some
sense. :)
The expander walks a symbolic expression representing the
source code recursively (somewhat) as shown by the following
program you can run using "racket":
#!r6rs
(import (rnrs))
(define (%log which sexp)
  (display which)
  (display ": ")
  (display sexp)
  (newline))
(define (%log-enter sexp)
  (%log "Enter" sexp))
(define (%log-exit sexp)
  (%log "Exit" sexp))
(define (%log-processing sexp)
  (%log "Processing" sexp))
(define (simulate-expand-recursion sexp)
  ;; PAIR? rather than LIST?, so that an empty list such as
  ;; the one in (lambda () x) is never handed to CAR.
  (cond ((pair? sexp)
         (%log-enter sexp)
         (simulate-expand-recursion (car sexp))
         (unless (null? (cdr sexp))
           (simulate-expand-recursion (cdr sexp)))
         (%log-exit sexp))
        ((symbol? sexp)
         (%log-processing sexp))
        (else
         (%log-processing sexp))))
(simulate-expand-recursion
 '(let ((a 1))
    (let ((b 2))
      (write a)
      (write b))))
as you can see from the output it "enters" and "exits" every
subexpression.
From now on I will use pseudo-code, unless otherwise specified.
Internally, the expander constructs a collection data structure
handled somewhat like a stack, let's call it "lexical context";
record values are pushed on this stack. The lexical context is
initialised as follows:
(define-record-type mark
  (fields (immutable name)))
(define top-mark
  (make-mark "top"))
(define lexical-context
  (list top-mark))
(define (push! obj)
  (set! lexical-context (cons obj lexical-context)))
(define (pop!)
  (set! lexical-context (cdr lexical-context)))
With reference to the symbolic expression:
(let ((a 1))
  (let ((b 2))
    (write a)
    (write b)))
when the expander enters the outer LET it pushes a new mark on
the stack:
(push! (make-mark "outer-let"))
and then it pushes a record representing the binding for the
variable A:
(define-record-type binding
  (fields (immutable name)
          #| other fields here |#))
(push! (make-binding 'a))
so that the lexical context looks like:
lexical-context => (#<binding name=a>
                    #<mark name="outer-let">
                    #<mark name="top">)
When the expander enters the inner LET it pushes a new mark on
the stack, followed by a record representing the binding for B,
so that the lexical context looks like:
lexical-context => (#<binding name=b>
                    #<mark name="inner-let">
                    #<binding name=a>
                    #<mark name="outer-let">
                    #<mark name="top">)
When the expander finds the reference to A in the form:
(write a)
it searches the lexical context left-to-right for a BINDING
record whose name is A and it finds it: we say that the binding
"captures" the reference.
When the expander exits the inner LET it removes its bindings
and its mark:
lexical-context => (#<binding name=a>
                    #<mark name="outer-let">
                    #<mark name="top">)
and when it exits the outer LET it removes its bindings and its
mark:
lexical-context => (#<mark name="top">)
You get the picture: every binding form like LAMBDA, LET, LET*,
... causes the expander to push on the lexical context a new MARK
record followed by BINDING records; these records stay there only
while the expander is processing the corresponding symbolic
expression.
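The push and pop sequence described so far can be replayed as a runnable R6RS program. This is only a sketch of the simplified model, not of what a real expander does; the helpers POP-SCOPE! and LOOKUP are names I made up for the illustration:

```scheme
#!r6rs
(import (rnrs))

(define-record-type mark
  (fields (immutable name)))
(define-record-type binding
  (fields (immutable name)))

(define lexical-context
  (list (make-mark "top")))

(define (push! obj)
  (set! lexical-context (cons obj lexical-context)))

;; Hypothetical helper: remove records up to and including the
;; nearest MARK, as happens when a binding form is exited.
(define (pop-scope!)
  (let ((top (car lexical-context)))
    (set! lexical-context (cdr lexical-context))
    (unless (mark? top)
      (pop-scope!))))

;; Hypothetical helper: search left-to-right for a BINDING
;; record with the given name; return #f when there is none.
(define (lookup name)
  (find (lambda (rec)
          (and (binding? rec)
               (eq? name (binding-name rec))))
        lexical-context))

;; Enter the outer LET, then the inner LET.
(push! (make-mark "outer-let"))
(push! (make-binding 'a))
(push! (make-mark "inner-let"))
(push! (make-binding 'b))
(write (binding? (lookup 'a)))   ; prints #t: A is visible
(newline)

;; Exit the inner LET, then the outer LET.
(pop-scope!)
(pop-scope!)
(write (lookup 'a))              ; prints #f: A is gone
(newline)
```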
Now enter the SYNTAX macro. It can inspect the lexical context
in search of bindings and other records. Let's look at the
following program:
#!r6rs
(import (rnrs))
(let ((a 1))
  (define-syntax reference-to-a
    (lambda (stx)
      (syntax a)))
  (write (reference-to-a))
  (newline))
when the expander processes the LET form it pushes a MARK and a
BINDING on the lexical context:
lexical-context => (#<binding name=a>
                    #<mark name="let-form">
                    #<mark name="top">)
let's omit what it does to process the DEFINE-SYNTAX form, what
matters here is that when the macro use:
(reference-to-a)
is expanded, the transformer function:
(lambda (stx) ;STX is ignored here
  (syntax a))
is called and the SYNTAX macro searches the lexical context
left-to-right for a BINDING record whose name is A; it finds it,
and so it returns what is needed to cause the macro to expand to
a reference to A.
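One detail this simplified picture glosses over: as far as I understand, the binding picked by SYNTAX is the one visible where the SYNTAX form is written, not the one visible where the macro is used. A sketch, in which the inner LET and the name RESULT are my additions to the program above:

```scheme
#!r6rs
(import (rnrs))

(define result
  (let ((a 1))
    (define-syntax reference-to-a
      (lambda (stx)
        (syntax a)))
    ;; This inner binding of A is visible at the macro use, but
    ;; not at the place where (syntax a) was written.
    (let ((a 2))
      (reference-to-a))))

(write result)   ; prints 1
(newline)
```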
Let's step back because we have omitted an important fact. To
write a program we have to start the source code with an IMPORT
form listing imported libraries, else we can do nothing.
Whenever the expander processes an import set (a set of bindings
exported from imported libraries), it pushes all the bindings on
the lexical context; for example:
#!r6rs
(import (only (rnrs)
              write
              newline))
causes the bindings for WRITE and NEWLINE to be pushed on the
stack:
lexical-context => (#<binding name=write>
                    #<binding name=newline>
                    #<mark name="top">)
and:
#!r6rs
(import (rnrs))
causes all the bindings exported by (rnrs) to be pushed on the
stack:
lexical-context => (#<binding name=display>
                    #<binding name=write>
                    #<binding name=newline>
                    ...
                    #<binding name=sin>
                    #<binding name=cos>
                    #<binding name=tan>
                    ...
                    #<mark name="top">)
too many to be listed. This is why the following program can
work:
#!r6rs
(import (rnrs))
(define-syntax return
  (lambda (stx)
    (syntax (newline))))
(return)
whenever the expander processes the macro use:
(return)
it calls the transformer function:
(lambda (stx)
  (syntax (newline)))
and the SYNTAX macro visits the lexical context in search of a
BINDING record whose name is NEWLINE, it finds it and so it
returns what is needed to cause the macro to expand to a call to
NEWLINE.
Now we can understand why the following program fails:
#!r6rs
(import (rnrs))
(define-syntax hurt-me
  (lambda (stx)
    (syntax sword)))
(write (hurt-me))
while expanding the macro use:
(hurt-me)
the SYNTAX macro searches the lexical context for a BINDING
record whose name is SWORD, it does not find it and so it causes
the program to abort with:
$ racket proof.sps
proof.sps:9:12: compile: unbound identifier in module in: sword
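For contrast, here is a sketch of the same program with a binding for SWORD added before the macro definition; the value "excalibur" is made up, only for illustration. With the binding in place the search succeeds and the macro use expands to a reference to SWORD:

```scheme
#!r6rs
(import (rnrs))

;; Hypothetical definition: it only serves to put a binding
;; named SWORD where the SYNTAX form can find it.
(define sword "excalibur")

(define-syntax hurt-me
  (lambda (stx)
    (syntax sword)))

(write (hurt-me))   ; prints "excalibur"
(newline)
```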
Now enter the SYNTAX-CASE macro; we have to step back again
because we have omitted another important fact. SYNTAX-CASE
provides two main features:
* It deconstructs the input form of a macro use using pattern
matching. I am not going to describe it here in detail.
* It pushes records of type PATTERN-VARIABLE on the lexical
context, which later can be searched by the SYNTAX macro. We
want to understand this.
Let's look at this program:
#!r6rs
(import (rnrs))
(define-syntax the-second-among
  (lambda (stx)
    (syntax-case stx ()
      ((_ ?a ?b ?c)
       (syntax ?b)))))
(write (the-second-among 1 2 3))
(newline)
let's fast forward to when the expander processes the macro use:
(the-second-among 1 2 3)
the transformer function:
(define transformer
  (lambda (stx)
    (syntax-case stx ()
      ((_ ?a ?b ?c)
       (syntax ?b)))))
is applied to a record instance of type SYNTAX-OBJECT:
(define-record-type syntax-object
  (fields (immutable sexp)
          (immutable current-lexical-context)
          #| other fields here |#))
(define stx
  (make-syntax-object '(the-second-among 1 2 3)
                      lexical-context))
(transformer stx)
the use of SYNTAX-CASE:
(syntax-case stx ()
  ((_ ?a ?b ?c)
   (syntax ?b)))
decomposes the symbolic expression:
(the-second-among 1 2 3)
and pushes on the lexical context PATTERN-VARIABLE records as
follows:
(define-record-type pattern-variable
  (fields (immutable name)
          (immutable sexp)))
(push! (make-pattern-variable '?a 1))
(push! (make-pattern-variable '?b 2))
(push! (make-pattern-variable '?c 3))
so that:
lexical-context => (#<pattern-variable name=?c sexp=3>
                    #<pattern-variable name=?b sexp=2>
                    #<pattern-variable name=?a sexp=1>
                    ...
                    #<mark name="top">)
later the SYNTAX macro visits the lexical context searching
left-to-right for a BINDING record *or* a PATTERN-VARIABLE record
whose name is ?B, it finds it and so it returns what is needed to
cause the macro to expand to the symbolic expression 2.
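To see BINDING records and PATTERN-VARIABLE records being looked up from the same SYNTAX form, here is a small sketch; the macro name THE-OTHERS-AMONG is made up for the occasion:

```scheme
#!r6rs
(import (rnrs))

(define-syntax the-others-among   ; hypothetical macro name
  (lambda (stx)
    (syntax-case stx ()
      ((_ ?a ?b ?c)
       ;; LIST is found as a binding imported from (rnrs);
       ;; ?A and ?C are found as pattern variables.
       (syntax (list ?a ?c))))))

(write (the-others-among 1 2 3))   ; prints (1 3)
(newline)
```

The macro use expands to (list 1 3): the identifier LIST is kept as a reference to the imported binding, while ?A and ?C are replaced by the matched expressions.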
Got it?
--
Marco Maggi