[plt-scheme] Breaking hygiene

From: Matt Jadud (mcj4 at kent.ac.uk)
Date: Wed Apr 13 09:41:28 EDT 2005

> Matt, can you tell us more concretely what your macro is supposed to
> do?

I was playing with grammars and units, and thought "Hey--maybe that's a 
macro." So, here we are. This may get a touch long; my apologies.

If I have a grammar like

<expr> -> (introduce (<id> ...) <expr>)
        | (bind <id> <num>)
        | (<op> <expr> <expr>)
        | <id>
<id>   -> symbol?
<num>  -> number?
<op>   -> +, -

(That's ugly, but it will do.)

If this was expressed as S-expressions, I could write a compiler pass 
that handles this grammar using match:

(require (lib "plt-match.ss"))
(define s-exp-pass
   (let ()

     (define <expr>
       (match-lambda
         [`(introduce (,id* ...) ,expr* ...)
           `(let ,@(map (lambda (i)
                          `(,(<id> i) (void))) id*)
              ,@(map <expr> expr*))]
         [`(bind ,id ,(? number? num))
           `(set! ,(<id> id) ,num)]
         [`(,(? (lambda (s)
                 (member s '(+ -))) op)
             ,e1 ,e2)
           `(,op ,(<expr> e1) ,(<expr> e2))]
         [(? <id> id) id]
         ))

     (define <id>
       (lambda (i)
         (if (symbol? i) i (error '<id> "Not an ID: ~s" i))))

     (lambda (e)
       (<expr> e)))

(This is an example; I'm more interested in passes operating over 
structures that perform a sequence of transformations on an AST, but 
this will do for purposes of discussion)

While the code does follow from the grammar (I took a liberty or two in 
my guards to speed things up), I can't reuse productions of this grammar 
in other passes. I could break the production rules out, but then I have 
naming to keep track of and a tricky bit of functional abstraction to 
pass the sub-productionns around.

On Sunday, I sat down and decided that units were not magic, and found 
that they could be used to represent production rules. The above 
transforms into

(require (lib "unitsig.ss")
          (lib "plt-match.ss"))
(define-signature expr^ (<expr>))
(define-signature id^ (<id>))

(define s-exp-pass:expr@
   (unit/sig expr^
     (import id^)
     (define <expr>
       (match-lambda
         [`(introduce (,id* ...) ,expr* ...)
           `(let ,@(map (lambda (i)
                          `(,(<id> i) (void))) id*)
              ,@(map <expr> expr*))]
         [`(bind ,id ,(? number? num))
           `(set! ,(<id> id) ,num)]
         [`(,(? (lambda (s)
                 (member s '(+ -))) op)
             ,e1 ,e2)
           `(,op ,(<expr> e1) ,(<expr> e2))]
         [(? <id> id) id]
         ))))


(define s-exp-pass:id@
   (unit/sig id^
     (import)
     (define <id>
       (lambda (i)
         (if (symbol? i) i (error '<id> "Not an ID: ~s" i))))
     ))

and the pass becomes

(define s-exp-pass@
   (compound-unit/sig
     (import)
     (link
      (EXPR : expr^ (s-exp-pass:expr@ ID))
      (ID   : id^   (s-exp-pass:id@)))
     (export (var (EXPR <expr>)))))

(define s-exp-pass
     (let () (define-values/invoke-unit/sig
               (<expr>) s-exp-pass@) <expr>))

which is a bit more verbose in some ways, but I achieve real flexibility 
between passes. Now, I can write another pass that only introduces a 
unit for the <expr> production rule, or the <id> production. New passes 
can, generally speaking, reuse production rules from any other pass.

Whereas the "micro-pass" approach we used at IUB took a copy-and-paste 
approach to defining passes (they were generally written like the first 
example), unitizing things eliminates some of that redundancy. Granted, 
I'd like to play with their new macro system foor this, but it isn't 
available at the moment.

So, I thought it would be nice if I could say something like:

(define-pass s-exp-passM
   (export expr)
   (new
    [expr            ;;<-- and I'd like this to expand
     (id)            ;;    to a definition for <expr>...
     (match-lambda
       [`(introduce (,id* ...) ,expr* ...)
         `(let ,@(map (lambda (i)
                        `(,(<id> i) (void))) id*)
            ,@(map <expr> expr*))]
       [`(bind ,id ,(? number? num))
         `(set! ,(<id> id) ,num)]
       [`(,(? (lambda (s)
                (member s '(+ -))) op)
           ,e1 ,e2)
         `(,op ,(<expr> e1) ,(<expr> e2))]
       [(? <id> id) id])]
    [id
     ()
     (lambda (i)
         (if (symbol? i) i (error '<id> "Not an ID: ~s" i)))]
    ))

which would expand with all kinds of nifty renaming and whatnot.

(define-pass some-other-pass
   (export expr)
   (reuse
    [expr s-exp-pass:expr (id)]
    [id   s-exp-pass:id   ()]))

would expand to the appropriate compound-unit, allowing me to reuse 
production rules defined in other passes. Most importantly, I was 
looking forward to allowing passes that have both a (new ...) and (reuse 
...) part, so I can introduce a new version of just one production rule 
in a compiler pass.

I managed to get everything working... except for the recursive calls 
within a production. That is, productions that don't refer to themselves 
work just fine, as they end up getting captured by the enclosing 
units/signatures. I was quite pleased with myself until I tried a more 
complex example, and discovered my hygiene problems...

 > (s-exp-pass '(introduce (x y) (bind x 1) (bind y 2) (+ x y)))
(let (x (void)) (y (void)) (set! x 1) (set! y 2) (+ x y))

 > (s-exp-passM '(introduce (x y) (bind x 1) (bind y 2) (+ x y)))
. reference to undefined identifier: <expr>

What I can't decide, however, is whether the 'unitized' breakout of 
production rules is any different than an object-oriented breakdown in 
the absence of types. At least, something at the back of my mind is 
telling me that this whole thing isn't any different than representing 
the productions as classes in a dynamically typed world, but I need to 
think about that more.

So, that's the long of it. The specific bit of macro that inspired the 
question was:

#,@(srfi:map-in-order (lambda (unit-name signature-name 
import-signatures production-name body)
                                   #`(define #,(+pass-name (syntax 
pass-name) unit-name)
                                       (unit/sig #,signature-name
                                         (import #, at import-signatures)
                                         (define #,production-name ;;<--
                                           #,body)                 ;;<--

                                         )))
                                 unit-names
                                 signature-names
                                 import-signatures
                                 (srfi:map-in-order syntax-object->datum 
production-names)
                                 (syntax->list (syntax (body* ...))))


where the attempted hygiene jail-break is marked. :)

Of course, if I wrap the output of the whole macro in

#`(pretty (quote (begin ....)))

and I copy-and-paste the output into a new window and eval it, it works 
great. But, I don't think that's the point, is it. :)

Thanks,
Matt



Posted on the users mailing list.