[racket] phases

From: Rüdiger Asche (rac at ruediger-asche.de)
Date: Sun Mar 4 04:44:30 EST 2012

Jon's elaboration, along with the paper mentioned here, settles many of my 
open questions. Thank you all very much!

Without (yet) having understood the concept in detail, I was wondering if 
it's fair to say that phases add another dimension to computing, denoting 
decision making over different phases of the software life cycle (other 
dimensions being linear code execution over time and bindings along source 
code)?

Elaboration: In my years as a professional C/C++ software designer, I've 
come to see the computer as only one piece of a decision making puzzle. For 
example, the decision over how many iterations a given loop runs may be made 
at

- design time:
  "this software must support a maximum of 8 readers",
- prototyping generation time
- preprocessor time 1:
  #define MAXREADERS 8
  #define READERLOOP(body) for (int iLoop=0; iLoop<MAXREADERS; iLoop++) body
  ...
  READERLOOP({<do something with Reader[iLoop]>});
- preprocessor time 2:
  #define MAXREADERS 8
  ...
  for (iLoop=0; iLoop<MAXREADERS; iLoop++) {<do something with Reader[iLoop]>}
- compile time:
  #define MAXREADERS 8
  int aMaxReaders = MAXREADERS;
  for (iLoop=0; iLoop< aMaxReaders; iLoop++)
- program startup time:
  int aMaxReaders = ObtainReaderCountFromRegistry();  // one-time initialization during init phase
  ...
  for (iLoop=0; iLoop< aMaxReaders; iLoop++) {<do something with Reader[iLoop]>};
- statically determined just in time:
  int aMaxReaders;
  aMaxReaders = ObtainReaderCountFromRegistry();  // computed right before the loop is entered
  for (iLoop=0; iLoop< aMaxReaders; iLoop++) {<do something with Reader[iLoop]>};
- dynamically determined just in time (for lack of a better term right now):
  int aMaxReaders;
  aMaxReaders = ObtainReaderCountFromTheInternetRightNow();  // computed right before the loop is entered
  for (iLoop=0; iLoop< aMaxReaders; iLoop++) {<do something with Reader[iLoop]>};

and so on, where in every phase, the decision is static with respect to the 
current phase but dynamic with respect to the previous phases. I can't 
remember how many times in my career we had to ship a new software release 
because some static value had to be made dynamic to a degree (e.g. from 
compile time to startup time) and then another release for yet another 
"dynamization" (for example from startup to statically just in time). It 
turns out that many decisions one makes every day can be described as 
"static with respect to one phase but dynamic with respect to the previous 
phase." Lexical vs. dynamic scoping is another example. Yet another obvious 
example is memory usage:

unsigned char glbReaderBuffer[MAXREADERMEMORY];   // defined at global scope: Allocated by the runtime support library, reserved by loader

vs.

unsigned long DoSomethingWithReader()
{
    unsigned char aReaderBuffer[MAXREADERMEMORY];  // defined at procedure scope: Allocated "by the compiler," taken from stack
    ...
}

vs.

unsigned long DoSomethingWithReader()
{
    unsigned char *aReaderBuffer = malloc(MAXREADERMEMORY);  // Allocated at run time, taken from memory reserved for heap by linker and startup code
    ...
}

and so on. In retrospect, a lot of our daily work consists of 
"dynamizing" (does that word exist? If not, it should! ;-)) static decisions 
made earlier and then cleaning up side effects stemming from that process.
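
To make the cleanup side effect concrete, here is a minimal C sketch of the 
stack and heap variants side by side. The function names, the fill-and-sum 
body, and the value 64 for MAXREADERMEMORY are assumptions made up for 
illustration; only the stack-versus-heap contrast comes from the examples 
above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAXREADERMEMORY 64  /* assumed size, for illustration only */

/* Compile-time decision: the buffer lives on the stack and is released
   automatically when the function returns. No cleanup code is needed. */
unsigned long SumReaderStatic(unsigned char aFill)
{
    unsigned char aReaderBuffer[MAXREADERMEMORY];
    unsigned long aSum = 0;
    size_t i;

    memset(aReaderBuffer, aFill, sizeof aReaderBuffer);
    for (i = 0; i < sizeof aReaderBuffer; i++)
        aSum += aReaderBuffer[i];
    return aSum;
}

/* Run-time decision: the size is chosen by the caller. Two side effects
   appear that the static version never had: the allocation can fail,
   and the buffer must be freed before returning. */
unsigned long SumReaderDynamic(size_t aSize, unsigned char aFill, int *aOk)
{
    unsigned char *aReaderBuffer;
    unsigned long aSum = 0;
    size_t i;

    assert(aSize > 0 && aOk != NULL);
    aReaderBuffer = malloc(aSize);
    if (aReaderBuffer == NULL) {   /* new failure mode */
        *aOk = 0;
        return 0;
    }
    memset(aReaderBuffer, aFill, aSize);
    for (i = 0; i < aSize; i++)
        aSum += aReaderBuffer[i];
    free(aReaderBuffer);           /* new cleanup obligation */
    *aOk = 1;
    return aSum;
}
```

Both functions compute the same kind of result, but moving the size decision 
to run time changed the signature (an error flag) and added a free() that is 
easy to forget.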

I've fantasized for a long time about making decisions configurable with 
respect to decision time (such as providing a primitive (assign aMaxReaders 
8 @RUNTIME)). A scheme like that could collate design tools, prototyping 
tools, build tools and runtime tools into one. If it could even be smart 
enough to take care of most of the side effects (such as determining when to 
deallocate a chunk of memory allocated in example 3 above when relegating a 
memory decision from compile to run time), I may be out of work soon after 
its conception...
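
The (assign aMaxReaders 8 @RUNTIME) idea can be approximated in plain C, at 
least as a rough sketch. Everything named here (DecisionTime, 
ReaderCountDecision, MakeDecision, ResolveReaderCount, and the demo source 
function standing in for something like ObtainReaderCountFromRegistry) is 
hypothetical and invented for illustration. It covers only three of the 
decision times listed earlier and none of the side-effect handling.

```c
#include <assert.h>
#include <stddef.h>

#define MAXREADERS 8  /* compile-time value, as in the examples above */

/* A source of the reader count that is only known while running. */
typedef int (*ReaderCountSource)(void);

/* When the decision is made (hypothetical names). */
typedef enum {
    DECIDE_AT_COMPILE_TIME,
    DECIDE_AT_STARTUP,
    DECIDE_JUST_IN_TIME
} DecisionTime;

typedef struct {
    DecisionTime when;
    ReaderCountSource source;  /* ignored for DECIDE_AT_COMPILE_TIME */
    int cached;                /* value fixed at startup, if any */
} ReaderCountDecision;

/* Demo stand-in for a registry or network lookup (an assumption). */
int gDemoReaderCount = 3;
int DemoReaderCountSource(void) { return gDemoReaderCount; }

/* Call once during initialization; a startup-time decision is taken here. */
ReaderCountDecision MakeDecision(DecisionTime when, ReaderCountSource source)
{
    ReaderCountDecision d = { when, source, MAXREADERS };
    assert(when == DECIDE_AT_COMPILE_TIME || source != NULL);
    if (when == DECIDE_AT_STARTUP)
        d.cached = source();
    return d;
}

/* Call right before the loop; a just-in-time decision is taken here. */
int ResolveReaderCount(const ReaderCountDecision *d)
{
    switch (d->when) {
    case DECIDE_AT_STARTUP:   return d->cached;
    case DECIDE_JUST_IN_TIME: return d->source();
    default:                  return MAXREADERS;
    }
}
```

The loop itself would then read for (iLoop=0; iLoop<ResolveReaderCount(&d); 
iLoop++) regardless of when the count was decided, so moving the decision to 
a later time means changing one argument to MakeDecision rather than shipping 
restructured code.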

Anyway, reading over Jon's explanation and the paper you mentioned makes it 
appear to me as if the phase architecture is a very definite step in that 
direction, providing a scheme (pun intended) to incorporate those 
phases in the lifetime of a software project as a parameter to the software 
itself. The thing that intrigues me even more is that it appears to be 
abstract enough to define phases oneself that don't have a 1:1 
correspondence to one "physical" phase. Pretty mind boggling...

Is that a valid way to look at it?

----- Original Message ----- 
From: "Sam Tobin-Hochstadt" <samth at ccs.neu.edu>
To: <rafkind at cs.utah.edu>
Cc: "racket" <users at racket-lang.org>; "Matthias Felleisen" 
<matthias at ccs.neu.edu>
Sent: Thursday, March 01, 2012 10:27 PM
Subject: Re: [racket] phases


http://www.ccs.neu.edu/scheme/pubs/scheme2007-ctf.pdf

On Thu, Mar 1, 2012 at 4:19 PM, Jon Rafkind <rafkind at cs.utah.edu> wrote:
> link
>
> On 03/01/2012 01:53 PM, Sam Tobin-Hochstadt wrote:
>> There's also something of a tutorial on phases in our Scheme Workshop
>> 2007 paper, some of which might be worth adding here. In particular,
>> it has some pictures/diagrams.
>>
>> On Thu, Mar 1, 2012 at 3:52 PM, Matthias Felleisen <matthias at ccs.neu.edu> 
>> wrote:
>>> Nice job. Now polish and add this write-up to the guide. Thanks --  
>>> Matthias
>>>
>>>
>>> On Mar 1, 2012, at 3:31 PM, Jon Rafkind wrote:
>>>
>>>> Recent problems with phases have led me to investigate how they work in 
>>>> more detail. Here is a brief tutorial on what they are and how they 
>>>> work with macros. The guide and reference have something to say about 
>>>> phases but I don't think they go into enough detail.
>>>>
>>>> Bindings exist in a phase. The link between a binding and its phase is 
>>>> represented by an integer. Phase 0 is the phase used for "plain" 
>>>> definitions, so
>>>>
>>>> (define x 5)
>>>>
>>>> Will put a binding for 'x' into phase 0. 'x' can be defined at higher 
>>>> phases easily
>>>>
>>>> (begin-for-syntax
>>>> (define x 5))
>>>>
>>>> Now 'x' is defined at phase 1. We can easily mix these two definitions 
>>>> in the same module, there is no clash between the two x's because they 
>>>> are defined at different phases.
>>>>
>>>> (define x 3)
>>>> (begin-for-syntax
>>>> (define x 9))
>>>>
>>>> 'x' at phase 0 has a value of 3 and 'x' at phase 1 has a value of 9.
>>>>
>>>> Syntax objects can refer to these bindings, essentially they capture 
>>>> the binding as a value that can be passed around.
>>>>
>>>> #'x
>>>>
>>>> Is a syntax object that represents the 'x' binding. But which 'x' 
>>>> binding? In the last example there are two x's, one at phase 0 and one 
>>>> at phase 1. Racket will imbue #'x with lexical information for all 
>>>> phases, so the answer is both!
>>>>
>>>> Racket knows which 'x' to use when the syntax object is used. I'll use 
>>>> eval just for a second to prove a point.
>>>>
>>>> First we bind #'x to a pattern variable so we can use it in a template 
>>>> and then just print it.
>>>> (eval (with-syntax ([x #'x])
>>>> #'(printf "~a\n" x)))
>>>>
>>>> This will print 3 because x at phase 0 is bound to 3.
>>>>
>>>> (eval (with-syntax ([x #'x])
>>>> #'(begin-for-syntax
>>>> (printf "~a\n" x))))
>>>>
>>>> This will print 9 because we are using x at phase 1 instead of 0. How 
>>>> does Racket know we wanted to use x at phase 1 instead of 0? Because of 
>>>> the 'begin-for-syntax'. So you can see that we started with the same 
>>>> syntax object, #'x, and were able to use it in two different ways -- at 
>>>> phase 0 and at phase 1.
>>>>
>>>> When a syntax object is created its lexical context is immediately set 
>>>> up. When a syntax object is provided from a module its lexical context 
>>>> will still reference the things that were around in the module it came 
>>>> from.
>>>>
>>>> This module will define 'foo' at phase 0 bound to the value 0 and 
>>>> 'sfoo' which binds the syntax object for 'foo'.
>>>>
>>>> ;; a.rkt
>>>> (define foo 0)
>>>> (provide (for-syntax sfoo))
>>>> (define-for-syntax sfoo #'foo)
>>>> ;; why not (define sfoo #'foo) ? I will explain later
>>>>
>>>> ;; b.rkt
>>>> (require "a.rkt")
>>>> (define foo 8)
>>>> (define-syntax (m stx)
>>>> sfoo)
>>>> (m)
>>>>
>>>> The result of the (m) macro will be whatever value 'sfoo' is bound to, 
>>>> which is #'foo. The #'foo that 'sfoo' holds knows that 'foo' is bound 
>>>> from the a.rkt module at phase 0. Even though there is another 'foo' in 
>>>> b.rkt this will not confuse Racket.
>>>>
>>>> Note that 'sfoo' is bound at phase 1. This is because (m) is a macro so 
>>>> its body executes at one phase higher than it was defined at. Since it 
>>>> was defined at phase 0 it will execute at phase 1, so any bindings it 
>>>> refers to also need to be bound at phase 1.
>>>>
>>>> Now really what I want to show is how bindings can be confused when 
>>>> modules are imported at different phases. Racket allows us to import a 
>>>> module at an arbitrary phase using require.
>>>>
>>>> (require "a.rkt") ;; import at phase 0
>>>> (require (for-syntax "a.rkt")) ;; import at phase 1
>>>> (require (for-template "a.rkt")) ;; import at phase -1
>>>> (require (for-meta 5 "a.rkt" )) ;; import at phase 5
>>>>
>>>> What does it mean to 'import at phase 1'? Effectively it means that all 
>>>> the bindings from that module will have their phase increased by one.
>>>>
>>>> ;; c.rkt
>>>> (define x 0) ;; x is defined at phase 0
>>>>
>>>> ;; d.rkt
>>>> (require (for-syntax "c.rkt"))
>>>>
>>>> Now in d.rkt there will be a binding for 'x' at phase 1 instead of 
>>>> phase 0.
>>>>
>>>> So let's look at a.rkt from above and see what happens if we try to 
>>>> create a binding for the #'foo syntax object at phase 0.
>>>>
>>>> ;; a.rkt
>>>> (define foo 0)
>>>> (define sfoo #'foo)
>>>> (provide sfoo)
>>>>
>>>> Now both 'foo' and 'sfoo' are defined at phase 0. The lexical context 
>>>> of #'foo will know that there is a binding for 'foo' at phase 0. In 
>>>> fact it seems like things are working just fine, if we try to eval sfoo 
>>>> in a.rkt we will get 0.
>>>>
>>>> (eval sfoo)
>>>> --> 0
>>>>
>>>> But now let's use sfoo in a macro.
>>>>
>>>> (define-syntax (m stx)
>>>> sfoo)
>>>> (m)
>>>>
>>>> We get an error 'reference to an identifier before its definition: 
>>>> sfoo'. Clearly 'sfoo' is not defined at phase 1 so we cannot refer to 
>>>> it inside the macro. Let's try to use 'sfoo' in another module by 
>>>> importing a.rkt at phase 1. Then we will get 'sfoo' at phase 1.
>>>>
>>>> ;; b.rkt
>>>> (require (for-syntax "a.rkt")) ;; now we have sfoo at phase 1
>>>> (define-syntax (m stx)
>>>> sfoo)
>>>> (m)
>>>>
>>>> $ racket b.rkt
>>>> compile: unbound identifier (and no #%top syntax transformer is bound) 
>>>> in: foo
>>>>
>>>> Racket says that 'foo' is unbound now. When 'a.rkt' is imported at 
>>>> phase 1 we have the following bindings
>>>>
>>>> foo at phase 1
>>>> sfoo at phase 1
>>>>
>>>> So the macro 'm' can see sfoo and will return the #'foo syntax object 
>>>> which knows that 'foo' was bound at phase 0. But there is no 'foo' at 
>>>> phase 0 in b.rkt, there is only a 'foo' at phase 1, so we get an error. 
>>>> That is why 'sfoo' needed to be bound at phase 1 in a.rkt. In that case 
>>>> we would have had the following bindings after doing (require "a.rkt")
>>>>
>>>> foo at phase 0
>>>> sfoo at phase 1
>>>>
>>>> So we can still use 'sfoo' in the macro since it's bound at phase 1 and 
>>>> when the macro finishes it will refer to a 'foo' binding at phase 0.
>>>>
>>>> If we import a.rkt at phase 1 we can still manage to use 'sfoo'. The 
>>>> trick is to create a syntax object that will be evaluated at phase 1 
>>>> instead of 0. We can do that with 'begin-for-syntax'.
>>>>
>>>> ;; a.rkt
>>>> (define foo 0)
>>>> (define sfoo #'foo)
>>>> (provide sfoo)
>>>>
>>>> ;; b.rkt
>>>> (require (for-syntax "a.rkt"))
>>>> (define-syntax (m stx)
>>>> (with-syntax ([x sfoo])
>>>> #'(begin-for-syntax
>>>> (printf "~a\n" x))))
>>>> (m)
>>>>
>>>> b.rkt has 'foo' and 'sfoo' bound at phase 1. The output of the macro 
>>>> will be
>>>>
>>>> (begin-for-syntax
>>>> (printf "~a\n" foo))
>>>>
>>>> Because 'sfoo' will turn into 'foo' when the template is expanded. Now 
>>>> this expression will work because 'foo' is bound at phase 1.
>>>>
>>>> Now you might try to cheat the phase system by importing a.rkt at both 
>>>> phase 0 and phase 1. Then you would have the following bindings
>>>>
>>>> foo at phase 0
>>>> sfoo at phase 0
>>>> foo at phase 1
>>>> sfoo at phase 1
>>>>
>>>> So just using sfoo in a macro should work
>>>>
>>>> ;; b.rkt
>>>> (require "a.rkt"
>>>> (for-syntax "a.rkt"))
>>>> (define-syntax (m stx)
>>>> sfoo)
>>>> (m)
>>>>
>>>> The 'sfoo' inside the 'm' macro comes from the (for-syntax "a.rkt"). 
>>>> For this macro to work there must be a 'foo' at phase 0 bound, and 
>>>> there is one from the plain "a.rkt" imported at phase 0. But in fact 
>>>> this macro doesn't work, it says 'foo' is unbound. The key is that 
>>>> "a.rkt" and (for-syntax "a.rkt") are different instantiations of the 
>>>> same module. The 'sfoo' at phase 1 only knows about 'foo' at phase 
>>>> 1, it does not know about the 'foo' bound at phase 0 from a different 
>>>> instantiation, even from the same file.
>>>>
>>>> So this means that if you have two functions in a module, one that 
>>>> produces a syntax object and one that matches on it (say using 
>>>> syntax/parse), the module needs to be imported once at the proper phase. 
>>>> The module can't be imported once at phase 0 and again at phase 1 and 
>>>> be expected to work.
>>>>
>>>> ;; x.rkt
>>>> #lang racket
>>>>
>>>> (require (for-syntax syntax/parse)
>>>> (for-template racket/base))
>>>>
>>>> (provide (all-defined-out))
>>>>
>>>> (define foo 0)
>>>> (define (make) #'foo)
>>>> (define-syntax (process stx)
>>>> (define-literal-set locals (foo))
>>>> (syntax-parse stx
>>>> [(_ (n (~literal foo))) #'#''ok]))
>>>>
>>>> ;; y.rkt
>>>> #lang racket
>>>>
>>>> (require (for-meta 1 "x.rkt")
>>>> (for-meta 2 "x.rkt" racket/base)
>>>> ;; (for-meta 2 racket/base)
>>>> )
>>>>
>>>> (begin-for-syntax
>>>> (define-syntax (m stx)
>>>> (with-syntax ([out (make)])
>>>> #'(process (0 out)))))
>>>>
>>>> (define-syntax (p stx)
>>>> (m))
>>>>
>>>> (p)
>>>>
>>>> $ racket y.rkt
>>>> process: expected the identifier `foo' at: foo in: (process (0 foo))
>>>>
>>>> 'make' is being used in y.rkt at phase 2 and returns the #'foo syntax 
>>>> object which knows that foo is bound at phase 0 inside y.rkt, and at 
>>>> phase 2 from (for-meta 2 "x.rkt"). The 'process' macro is imported at 
>>>> phase 1 from (for-meta 1 "x.rkt") and knows that foo should be bound 
>>>> at phase 1 so when the syntax-parse is executed inside 'process' it is 
>>>> looking for 'foo' bound at phase 1 but it sees a phase 2 binding and so 
>>>> doesn't match.
>>>>
>>>> To fix this we can provide 'make' at phase 1 relative to x.rkt and just 
>>>> import it at phase 1 in y.rkt
>>>>
>>>> ;; x.rkt
>>>> #lang racket
>>>>
>>>> (require (for-syntax syntax/parse)
>>>> (for-template racket/base))
>>>>
>>>> (provide (all-defined-out))
>>>>
>>>> (define foo 0)
>>>> (provide (for-syntax make))
>>>> (define-for-syntax (make) #'foo)
>>>> (define-syntax (process stx)
>>>> (define-literal-set locals (foo))
>>>> (syntax-parse stx
>>>> [(_ (n (~literal foo))) #'#''ok]))
>>>>
>>>> ;; y.rkt
>>>> #lang racket
>>>>
>>>> (require (for-meta 1 "x.rkt")
>>>> ;; (for-meta 2 "x.rkt" racket/base)
>>>> (for-meta 2 racket/base)
>>>> )
>>>>
>>>> (begin-for-syntax
>>>> (define-syntax (m stx)
>>>> (with-syntax ([out (make)])
>>>> #'(process (0 out)))))
>>>>
>>>> (define-syntax (p stx)
>>>> (m))
>>>>
>>>> (p)
>>>>
>>>> $ racket y.rkt
>>>> 'ok
>>>> ____________________
>>>> Racket Users list:
>>>> http://lists.racket-lang.org/users
>>
>>
>



-- 
sam th
samth at ccs.neu.edu


Posted on the users mailing list.