[racket] phases
Mr. Rafkind, Scribble this document!
Jay
On 3/1/12, Jon Rafkind <rafkind at cs.utah.edu> wrote:
> Recent problems with phases have led me to investigate how they work in more
> detail. Here is a brief tutorial on what they are and how they work with
> macros. The guide and reference have something to say about phases but I
> don't think they go into enough detail.
>
> Bindings exist in a phase. The link between a binding and its phase is
> represented by an integer. Phase 0 is the phase used for "plain"
> definitions, so
>
> (define x 5)
>
> will put a binding for 'x' into phase 0. 'x' can just as easily be defined at
> a higher phase:
>
> (begin-for-syntax
>   (define x 5))
>
> Now 'x' is defined at phase 1. We can mix these two definitions in the same
> module; there is no clash between the two x's because they are defined at
> different phases.
>
> (define x 3)
> (begin-for-syntax
>   (define x 9))
>
> 'x' at phase 0 has a value of 3 and 'x' at phase 1 has a value of 9.
>
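A minimal single-file sketch of the point above, assuming `#lang racket/base` with racket/base also imported for-syntax. The macro name `phase-1-x` is made up for illustration. A macro body runs at phase 1, so the `x` it mentions is the phase 1 binding, while the plain `x` in the module body is the phase 0 one.

```racket
#lang racket/base
(require (for-syntax racket/base))

(define x 3)            ; x at phase 0
(begin-for-syntax
  (define x 9))         ; a different x at phase 1

;; Hypothetical helper: the body of a macro runs at phase 1, so `x` here
;; is the phase 1 binding. Its value becomes a literal in the expansion.
(define-syntax (phase-1-x stx)
  (datum->syntax #'here x))

;; Collect the phase 0 value and the phase 1 value side by side.
(define both (list x (phase-1-x)))
both
```

Going through a macro rather than printing inside begin-for-syntax keeps the result independent of when the phase 1 code happens to run.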
> Syntax objects can refer to these bindings; essentially, they capture the
> binding as a value that can be passed around.
>
> #'x
>
> Is a syntax object that represents the 'x' binding. But which 'x' binding?
> In the last example there are two x's, one at phase 0 and one at phase 1.
> Racket will imbue #'x with lexical information for all phases, so the answer
> is both!
>
> Racket knows which 'x' to use when the syntax object is used. I'll use eval
> just for a second to prove a point.
>
> First we bind #'x to a pattern variable so we can use it in a template, and
> then just print it:
>
> (eval (with-syntax ([x #'x])
>         #'(printf "~a\n" x)))
>
> This will print 3 because x at phase 0 is bound to 3.
>
> (eval (with-syntax ([x #'x])
>         #'(begin-for-syntax
>             (printf "~a\n" x))))
>
> This will print 9 because we are using x at phase 1 instead of 0. How does
> Racket know we wanted to use x at phase 1 instead of 0? Because of the
> 'begin-for-syntax'. So you can see that we started with the same syntax
> object, #'x, and were able to use it in two different ways -- at phase 0 and
> at phase 1.
>
> When a syntax object is created its lexical context is immediately set up.
> When a syntax object is provided from a module its lexical context will
> still reference the things that were around in the module it came from.
>
> The following module defines 'foo' at phase 0, bound to the value 0, and
> 'sfoo' at phase 1, bound to the syntax object for 'foo'.
>
> ;; a.rkt
> (define foo 0)
> (provide (for-syntax sfoo))
> (define-for-syntax sfoo #'foo)
> ;; why not (define sfoo #'foo) ? I will explain later
>
> ;; b.rkt
> (require "a.rkt")
> (define foo 8)
> (define-syntax (m stx)
>   sfoo)
> (m)
>
> The result of the (m) macro will be whatever value 'sfoo' is bound to, which
> is #'foo. The #'foo that 'sfoo' holds knows that 'foo' is bound in the a.rkt
> module at phase 0. Even though there is another 'foo' in b.rkt, this will not
> confuse Racket.
>
> Note that 'sfoo' is bound at phase 1. This is because 'm' is a macro, so its
> body executes one phase higher than the phase it was defined at. Since it was
> defined at phase 0, its body executes at phase 1, so any bindings the body
> refers to also need to be bound at phase 1.
>
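A minimal sketch of the a.rkt / b.rkt pair as one file, assuming a submodule named `a` stands in for a.rkt and the enclosing module stands in for b.rkt. The `#lang` line and the for-syntax imports of racket/base are additions the snippets above leave implicit.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Stand-in for a.rkt, written as a submodule so the sketch is one file.
(module a racket/base
  (require (for-syntax racket/base))
  (provide (for-syntax sfoo))
  (define foo 0)                   ; foo at phase 0
  (define-for-syntax sfoo #'foo))  ; sfoo at phase 1, holding the identifier

;; Stand-in for b.rkt.
(require 'a)
(define foo 8)   ; a second, unrelated foo

;; The macro body runs at phase 1, where sfoo is bound.
(define-syntax (m stx)
  sfoo)

;; The expansion of (m) is the foo from the submodule, not the local one.
(define result (m))
result
```

Under these assumptions `result` takes the value of the submodule's 'foo' rather than the local one, which is the behaviour described above.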
> Now really what I want to show is how bindings can be confused when modules
> are imported at different phases. Racket allows us to import a module at an
> arbitrary phase using require.
>
> (require "a.rkt") ;; import at phase 0
> (require (for-syntax "a.rkt")) ;; import at phase 1
> (require (for-template "a.rkt")) ;; import at phase -1
> (require (for-meta 5 "a.rkt")) ;; import at phase 5
>
> What does it mean to 'import at phase 1'? Effectively it means that all the
> bindings from that module will have their phase increased by one.
>
> ;; c.rkt
> (define x 0) ;; x is defined at phase 0
>
> ;; d.rkt
> (require (for-syntax "c.rkt"))
>
> Now in d.rkt there will be a binding for 'x' at phase 1 instead of phase 0.
>
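A minimal sketch of the phase shift, assuming a submodule named `c` stands in for c.rkt and the enclosing module stands in for d.rkt. The macro name `x-from-c` and the extra phase 0 definition of `x` are made up for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Stand-in for c.rkt: x is defined at phase 0 of this module.
(module c racket/base
  (provide x)
  (define x 0))

;; Stand-in for d.rkt: importing for-syntax shifts c's x up to phase 1.
(require (for-syntax 'c))

;; No clash with the import, because this x lives at phase 0.
(define x "phase 0")

;; Hypothetical helper: a macro body runs at phase 1, so `x` here is the
;; one imported from c.
(define-syntax (x-from-c stx)
  (datum->syntax #'here x))

(define both (list x (x-from-c)))
both
```

The local definition is there only to show that the phase 0 slot for 'x' is still free after the for-syntax import.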
> So let's look at a.rkt from above and see what happens if we try to create a
> binding for the #'foo syntax object at phase 0.
>
> ;; a.rkt
> (define foo 0)
> (define sfoo #'foo)
> (provide sfoo)
>
> Now both 'foo' and 'sfoo' are defined at phase 0. The lexical context of
> #'foo will know that there is a binding for 'foo' at phase 0. In fact, it
> seems like things are working just fine: if we try to eval sfoo in a.rkt, we
> will get 0.
>
> (eval sfoo)
> --> 0
>
> But now let's use sfoo in a macro.
>
> (define-syntax (m stx)
>   sfoo)
> (m)
>
> We get an error 'reference to an identifier before its definition: sfoo'.
> Clearly 'sfoo' is not defined at phase 1, so we cannot refer to it inside the
> macro. Let's try to use 'sfoo' in another module by importing a.rkt at phase
> 1. Then we will get 'sfoo' at phase 1.
>
> ;; b.rkt
> (require (for-syntax "a.rkt")) ;; now we have sfoo at phase 1
> (define-syntax (m stx)
>   sfoo)
> (m)
>
> $ racket b.rkt
> compile: unbound identifier (and no #%top syntax transformer is bound) in:
> foo
>
> Racket says that 'foo' is unbound now. When 'a.rkt' is imported at phase 1
> we have the following bindings
>
> foo at phase 1
> sfoo at phase 1
>
> So the macro 'm' can see sfoo and will return the #'foo syntax object, which
> knows that 'foo' was bound at phase 0. But there is no 'foo' at phase 0 in
> b.rkt; there is only a 'foo' at phase 1, so we get an error. That is why
> 'sfoo' needed to be bound at phase 1 in a.rkt. In that case we would have
> had the following bindings after doing (require "a.rkt"):
>
> foo at phase 0
> sfoo at phase 1
>
> So we can still use 'sfoo' in the macro since it's bound at phase 1, and when
> the macro finishes it will refer to a 'foo' binding at phase 0.
>
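One way to check where the identifier held by 'sfoo' resolves, without triggering the compile error, is to ask identifier-binding at each phase. This is a minimal sketch of the failing setup (sfoo defined at phase 0, module imported for-syntax), assuming a submodule named `a` stands in for a.rkt. The macro name `foo-bound-at` is made up for illustration. If the explanation above holds, the list should report no binding at phase 0 and a binding at phase 1.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Stand-in for the a.rkt that defines sfoo at phase 0.
(module a racket/base
  (provide sfoo)
  (define foo 0)
  (define sfoo #'foo))

;; Import at phase 1, as in the failing b.rkt.
(require (for-syntax 'a))

;; Hypothetical helper: rather than returning sfoo (which would not compile),
;; ask whether the identifier it holds has a binding at phase 0 and at phase 1.
(define-syntax (foo-bound-at stx)
  (with-syntax ([at-0 (and (identifier-binding sfoo 0) #t)]
                [at-1 (and (identifier-binding sfoo 1) #t)])
    #'(list at-0 at-1)))

(define bound (foo-bound-at))
bound
```

identifier-binding takes the phase level as its second argument; the `(and ... #t)` wrapper only collapses its result to a boolean so it can be dropped into the template.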
> If we import a.rkt at phase 1 we can still manage to use 'sfoo'. The trick
> is to create a syntax object that will be evaluated at phase 1 instead of 0.
> We can do that with 'begin-for-syntax'.
>
> ;; a.rkt
> (define foo 0)
> (define sfoo #'foo)
> (provide sfoo)
>
> ;; b.rkt
> (require (for-syntax "a.rkt"))
> (define-syntax (m stx)
>   (with-syntax ([x sfoo])
>     #'(begin-for-syntax
>         (printf "~a\n" x))))
> (m)
>
> b.rkt has 'foo' and 'sfoo' bound at phase 1. The output of the macro will be
>
> (begin-for-syntax
>   (printf "~a\n" foo))
>
> because 'sfoo' will turn into 'foo' when the template is expanded. Now this
> expression will work because 'foo' is bound at phase 1.
>
> Now you might try to cheat the phase system by importing a.rkt at both phase
> 0 and phase 1. Then you would have the following bindings
>
> foo at phase 0
> sfoo at phase 0
> foo at phase 1
> sfoo at phase 1
>
> So just using sfoo in a macro should work
>
> ;; b.rkt
> (require "a.rkt"
>          (for-syntax "a.rkt"))
> (define-syntax (m stx)
>   sfoo)
> (m)
>
> The 'sfoo' inside the 'm' macro comes from the (for-syntax "a.rkt"). For
> this macro to work there must be a 'foo' bound at phase 0, and there is one
> from the plain "a.rkt" imported at phase 0. But in fact this macro doesn't
> work; it says 'foo' is unbound. The key is that "a.rkt" and (for-syntax
> "a.rkt") are different instantiations of the same module. The 'sfoo' at
> phase 1 only knows about the 'foo' at phase 1; it does not know about the
> 'foo' bound at phase 0 from a different instantiation, even from the same
> file.
>
> So this means that if you have two functions in a module, one that
> produces a syntax object and one that matches on it (say using syntax/parse),
> the module needs to be imported once at the proper phase. The module can't
> be imported once at phase 0 and again at phase 1 and be expected to work.
>
> ;; x.rkt
> #lang racket
>
> (require (for-syntax syntax/parse)
>          (for-template racket/base))
>
> (provide (all-defined-out))
>
> (define foo 0)
> (define (make) #'foo)
> (define-syntax (process stx)
>   (define-literal-set locals (foo))
>   (syntax-parse stx
>     [(_ (n (~literal foo))) #'#''ok]))
>
> ;; y.rkt
> #lang racket
>
> (require (for-meta 1 "x.rkt")
>          (for-meta 2 "x.rkt" racket/base)
>          ;; (for-meta 2 racket/base)
>          )
>
> (begin-for-syntax
>   (define-syntax (m stx)
>     (with-syntax ([out (make)])
>       #'(process (0 out)))))
>
> (define-syntax (p stx)
>   (m))
>
> (p)
>
> $ racket y.rkt
> process: expected the identifier `foo' at: foo in: (process (0 foo))
>
> 'make' is being used in y.rkt at phase 2 and returns the #'foo syntax object,
> which knows that foo is bound at phase 0 inside x.rkt, and at phase 2 from
> (for-meta 2 "x.rkt"). The 'process' macro is imported at phase 1 from
> (for-meta 1 "x.rkt") and knows that foo should be bound at phase 1, so when
> the syntax-parse is executed inside 'process' it is looking for 'foo' bound
> at phase 1, but it sees a phase 2 binding and so doesn't match.
>
> To fix this we can provide 'make' at phase 1 relative to x.rkt and just
> import it at phase 1 in y.rkt
>
> ;; x.rkt
> #lang racket
>
> (require (for-syntax syntax/parse)
>          (for-template racket/base))
>
> (provide (all-defined-out))
>
> (define foo 0)
> (provide (for-syntax make))
> (define-for-syntax (make) #'foo)
> (define-syntax (process stx)
>   (define-literal-set locals (foo))
>   (syntax-parse stx
>     [(_ (n (~literal foo))) #'#''ok]))
>
> ;; y.rkt
> #lang racket
>
> (require (for-meta 1 "x.rkt")
>          ;; (for-meta 2 "x.rkt" racket/base)
>          (for-meta 2 racket/base)
>          )
>
> (begin-for-syntax
>   (define-syntax (m stx)
>     (with-syntax ([out (make)])
>       #'(process (0 out)))))
>
> (define-syntax (p stx)
>   (m))
>
> (p)
>
> $ racket y.rkt
> 'ok
> ____________________
> Racket Users list:
> http://lists.racket-lang.org/users
>
--
Jay McCarthy <jay at cs.byu.edu>
Assistant Professor / Brigham Young University
http://faculty.cs.byu.edu/~jay
"The glory of God is Intelligence" - D&C 93