[racket] phases
Very nice, Jon! I too think it would be great to get this into the docs.
(There are a few typos of filename comments vs 'require's. Also, two small
comments inline below.)
Robby
On Thursday, March 1, 2012, Jon Rafkind wrote:
> Recent problems with phases have led me to investigate how they work in
> more detail. Here is a brief tutorial on what they are and how they work
> with macros. The guide and reference have something to say about phases but
> I don't think they go into enough detail.
>
> Bindings exist in a phase. The link between a binding and its phase is
> represented by an integer. Phase 0 is the phase used for "plain"
> definitions, so
>
> (define x 5)
>
> will put a binding for 'x' into phase 0. 'x' can just as easily be defined
> at a higher phase:
>
> (begin-for-syntax
>   (define x 5))
>
> Now 'x' is defined at phase 1. We can easily mix these two definitions in
> the same module; there is no clash between the two x's because they are
> defined at different phases.
>
> (define x 3)
> (begin-for-syntax
>   (define x 9))
>
> 'x' at phase 0 has a value of 3 and 'x' at phase 1 has a value of 9.
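
A minimal single-file sketch of the two definitions above, assuming `#lang racket/base` (the macro name `phase-1-x` is made up for illustration and is not part of the tutorial). The body of a transformer runs at phase 1, so the `x` it mentions is the phase 1 one:

```racket
#lang racket/base
(require (for-syntax racket/base))

(define x 3)              ; phase 0 binding
(begin-for-syntax
  (define x 9))           ; phase 1 binding, no clash with the one above

;; The body of a transformer is phase 1 code, so `x` here is the phase 1 x.
;; datum->syntax turns its value into a literal in the expansion.
(define-syntax (phase-1-x stx)
  (datum->syntax stx x))

(printf "phase 0 x: ~a\n" x)            ; prints "phase 0 x: 3"
(printf "phase 1 x: ~a\n" (phase-1-x))  ; prints "phase 1 x: 9"
```

The macro is only a convenient way to observe the phase 1 value from phase 0 code; it is not needed for the two definitions to coexist.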
>
> Syntax objects can refer to these bindings. Essentially, they capture the
> binding as a value that can be passed around.
>
> #'x
>
> is a syntax object that represents the 'x' binding. But which 'x' binding?
> In the last example there are two x's, one at phase 0 and one at phase 1.
> Racket will imbue #'x with lexical information for all phases, so the
> answer is both!
>
> Racket decides which 'x' to use at the point where the syntax object is
> used. I'll use eval just for a second to prove a point.
>
> First we bind #'x to a pattern variable so we can use it in a template and
> then just print it.
>
> (eval (with-syntax ([x #'x])
>         #'(printf "~a\n" x)))
>
> This will print 3 because x at phase 0 is bound to 3.
>
> (eval (with-syntax ([x #'x])
>         #'(begin-for-syntax
>             (printf "~a\n" x))))
>
>
Does this depend on namespace-base-phase? If so, you might want to have a
margin-note saying something to that effect with a pointer.
> This will print 9 because we are using x at phase 1 instead of 0. How does
> Racket know we wanted to use x at phase 1 instead of 0? Because of the
> 'begin-for-syntax'. So you can see that we started with the same syntax
> object, #'x, and were able to use it in two different ways: at phase 0 and
> at phase 1.
>
> When a syntax object is created its lexical context is immediately set up.
> When a syntax object is provided from a module its lexical context will
> still reference the things that were around in the module it came from.
>
> This module will define 'foo' at phase 0, bound to the value 0, and 'sfoo'
> at phase 1, bound to the syntax object for 'foo'.
>
> ;; a.rkt
> (define foo 0)
> (provide (for-syntax sfoo))
> (define-for-syntax sfoo #'foo)
> ;; why not (define sfoo #'foo) ? I will explain later
>
> ;; b.rkt
> (require "a.rkt")
> (define foo 8)
> (define-syntax (m stx)
>   sfoo)
> (m)
>
> The result of the (m) macro will be whatever value 'sfoo' is bound to,
> which is #'foo. The #'foo that 'sfoo' holds knows that 'foo' is bound in the
> a.rkt module at phase 0. Even though there is another 'foo' in b.rkt, this
> will not confuse Racket.
>
> Note that 'sfoo' is bound at phase 1. This is because 'm' is a macro, so
> its body executes one phase higher than the phase at which it was defined.
> Since it was defined at phase 0 it will execute at phase 1, so any bindings
> it refers to also need to be bound at phase 1.
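
For anyone who wants to try the a.rkt/b.rkt example in a single file, here is a sketch that uses a submodule in place of a.rkt. The submodule layout and the choice of `#lang racket/base` are assumptions made for illustration; the tutorial itself uses two separate files.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Stands in for a.rkt.
(module a racket/base
  (require (for-syntax racket/base))
  (define foo 0)
  (provide (for-syntax sfoo))
  (define-for-syntax sfoo #'foo))

;; The enclosing module stands in for b.rkt.
(require 'a)
(define foo 8)
(define-syntax (m stx)
  sfoo)   ; sfoo is a phase 1 binding, which is what a transformer body needs

(m)       ; refers to the foo of module a (value 0), not the foo defined here
```

Module a does not export 'foo' itself; the syntax object carried by 'sfoo' is what lets the expansion of (m) reach it.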
>
> Now really what I want to show is how bindings can be confused when
> modules are imported at different phases. Racket allows us to import a
> module at an arbitrary phase using require.
>
> (require "a.rkt") ;; import at phase 0
> (require (for-syntax "a.rkt")) ;; import at phase 1
> (require (for-template "a.rkt")) ;; import at phase -1
> (require (for-meta 5 "a.rkt" )) ;; import at phase 5
>
> What does it mean to 'import at phase 1'? Effectively it means that all
> the bindings from that module will have their phase increased by one.
>
> ;; c.rkt
> (define x 0) ;; x is defined at phase 0
> (provide x)
>
> ;; d.rkt
> (require (for-syntax "c.rkt"))
>
> Now in d.rkt there will be a binding for 'x' at phase 1 instead of phase 0.
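
A small single-file sketch of this phase shift, assuming `#lang racket/base` and a submodule standing in for c.rkt (the macro name `show-x` is hypothetical). Note that 'x' has to be provided by the module for the import to see it:

```racket
#lang racket/base

;; Stands in for c.rkt; x must be provided to be importable.
(module c racket/base
  (provide x)
  (define x 0))

;; Import c shifted up by one phase, as d.rkt does.
(require (for-syntax racket/base 'c))

;; x is usable in phase 1 code such as a transformer body...
(define-syntax (show-x stx)
  (datum->syntax stx x))

(show-x)   ; expands to the literal 0
;; ...but a plain reference to x here, at phase 0, would be unbound.
```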
>
> So let's look at a.rkt from above and see what happens if we try to create
> a binding for the #'foo syntax object at phase 0.
>
> ;; a.rkt
> (define foo 0)
> (define sfoo #'foo)
> (provide sfoo)
>
> Now both 'foo' and 'sfoo' are defined at phase 0. The lexical context of
> #'foo will know that there is a binding for 'foo' at phase 0. In fact it
> seems like things are working just fine: if we try to eval sfoo in a.rkt we
> will get 0.
>
> (eval sfoo)
> --> 0
>
> But now let's use sfoo in a macro.
>
> (define-syntax (m stx)
>   sfoo)
> (m)
>
> We get an error 'reference to an identifier before its definition: sfoo'.
> Clearly 'sfoo' is not defined at phase 1, so we cannot refer to it inside
> the macro. Let's try to use 'sfoo' in another module by importing a.rkt at
> phase 1. Then we will get 'sfoo' at phase 1.
>
> ;; b.rkt
> (require (for-syntax "a.rkt")) ;; now we have sfoo at phase 1
> (define-syntax (m stx)
>   sfoo)
> (m)
>
> $ racket b.rkt
> compile: unbound identifier (and no #%top syntax transformer is bound) in:
> foo
>
> Racket says that 'foo' is unbound now. When 'a.rkt' is imported at phase 1
> we have the following bindings:
>
> foo at phase 1
> sfoo at phase 1
>
> So the macro 'm' can see sfoo and will return the #'foo syntax object
> which knows that 'foo' was bound at phase 0.
I think that "knows" is not the right word here. You should say that the
m transformer returns the 'foo' identifier, whose phase 0 binding is then
used. But in this context, thanks to the +1 of the import, we don't have a
binding for 'foo' at phase 0, only phase 1.
> But there is no 'foo' at phase 0 in b.rkt; there is only a 'foo' at phase
> 1, so we get an error. That is why 'sfoo' needed to be bound at phase 1 in
> a.rkt. In that case we would have had the following bindings after doing
> (require "a.rkt"):
>
> foo at phase 0
> sfoo at phase 1
>
> So we can still use 'sfoo' in the macro since it's bound at phase 1, and
> when the macro finishes it will refer to a 'foo' binding at phase 0.
>
> If we import a.rkt at phase 1 we can still manage to use 'sfoo'. The trick
> is to create a syntax object that will be evaluated at phase 1 instead of
> 0. We can do that with 'begin-for-syntax'.
>
> ;; a.rkt
> (define foo 0)
> (define sfoo #'foo)
> (provide sfoo)
>
> ;; b.rkt
> (require (for-syntax "a.rkt"))
> (define-syntax (m stx)
>   (with-syntax ([x sfoo])
>     #'(begin-for-syntax
>         (printf "~a\n" x))))
> (m)
>
> b.rkt has 'foo' and 'sfoo' bound at phase 1. The output of the macro will
> be
>
> (begin-for-syntax
>   (printf "~a\n" foo))
>
> because 'sfoo' will turn into 'foo' when the template is expanded. Now
> this expression will work because 'foo' is bound at phase 1.
>
> Now you might try to cheat the phase system by importing a.rkt at both
> phase 0 and phase 1. Then you would have the following bindings:
>
> foo at phase 0
> sfoo at phase 0
> foo at phase 1
> sfoo at phase 1
>
> So just using sfoo in a macro should work:
>
> ;; b.rkt
> (require "a.rkt"
>          (for-syntax "a.rkt"))
> (define-syntax (m stx)
>   sfoo)
> (m)
>
> The 'sfoo' inside the 'm' macro comes from the (for-syntax "a.rkt"). For
> this macro to work there must be a 'foo' bound at phase 0, and there is one
> from the plain "a.rkt" imported at phase 0. But in fact this macro doesn't
> work: it says 'foo' is unbound. The key is that "a.rkt" and (for-syntax
> "a.rkt") are different instantiations of the same module. The 'sfoo' at
> phase 1 only knows about the 'foo' at phase 1; it does not know about the
> 'foo' bound at phase 0 from a different instantiation, even from the same
> file.
>
> So this means that if you have two functions in a module, one that
> produces a syntax object and one that matches on it (say using
> syntax/parse), the module needs to be imported once at the proper phase. The
> module can't be imported once at phase 0 and again at phase 1 and be
> expected to work.
>
> ;; x.rkt
> #lang racket
>
> (require (for-syntax syntax/parse)
>          (for-template racket/base))
>
> (provide (all-defined-out))
>
> (define foo 0)
> (define (make) #'foo)
> (define-syntax (process stx)
>   (define-literal-set locals (foo))
>   (syntax-parse stx
>     [(_ (n (~literal foo))) #'#''ok]))
>
> ;; y.rkt
> #lang racket
>
> (require (for-meta 1 "x.rkt")
>          (for-meta 2 "x.rkt" racket/base)
>          ;; (for-meta 2 racket/base)
>          )
>
> (begin-for-syntax
>   (define-syntax (m stx)
>     (with-syntax ([out (make)])
>       #'(process (0 out)))))
>
> (define-syntax (p stx)
>   (m))
>
> (p)
>
> $ racket y.rkt
> process: expected the identifier `foo' at: foo in: (process (0 foo))
>
> 'make' is being used in y.rkt at phase 2 and returns the #'foo syntax
> object which knows that foo is bound at phase 0 inside y.rkt, and at phase
> 2 from (for-meta 2 "x.rkt"). The 'process' macro is imported at phase 1
> from (for-meta 1 "x.rkt") and knows that foo should be bound at phase 1. So
> when the syntax-parse is executed inside 'process' it is looking for 'foo'
> bound at phase 1, but it sees a phase 2 binding and so doesn't match.
>
> To fix this we can provide 'make' at phase 1 relative to x.rkt and just
> import it at phase 1 in y.rkt:
>
> ;; x.rkt
> #lang racket
>
> (require (for-syntax syntax/parse)
>          (for-template racket/base))
>
> (provide (all-defined-out))
>
> (define foo 0)
> (provide (for-syntax make))
> (define-for-syntax (make) #'foo)
> (define-syntax (process stx)
>   (define-literal-set locals (foo))
>   (syntax-parse stx
>     [(_ (n (~literal foo))) #'#''ok]))
>
> ;; y.rkt
> #lang racket
>
> (require (for-meta 1 "x.rkt")
>          ;; (for-meta 2 "x.rkt" racket/base)
>          (for-meta 2 racket/base)
>          )
>
> (begin-for-syntax
>   (define-syntax (m stx)
>     (with-syntax ([out (make)])
>       #'(process (0 out)))))
>
> (define-syntax (p stx)
>   (m))
>
> (p)
>
> $ racket y.rkt
> 'ok