[plt-scheme] Identity of literals in macros

From: Ryan Culpepper (ryan_sml at yahoo.com)
Date: Fri Mar 3 13:31:14 EST 2006

--- Lauri Alanko <la at iki.fi> wrote:

> The mzscheme manual says (12.1):
> 
> > To mesh gracefully with modules, literal identifiers are compared with
> > module-identifier=?, which is equivalent to the comparison behavior of
> > R5RS in the absence of modules;
> 
> However, I'm beginning to have my doubts about the gracefulness of this.
> 
> Granted, it is sortakinda neat that even literal identifiers are handled
> by the module system, so they can be e.g. renamed. However, this also
> causes problems.
> 
> For instance, a well-known trick is to bind unquote to some often-used
> function or syntax. This is especially convenient when it's something
> that may need to be added or removed often, since one doesn't need to
> edit _both_ ends of an expression. For instance, we can use it to add
> tracing to a function easily:
> 
> (define current-indentation (make-parameter 0))
> 
> (define ((trace name f) . args)
>   (define i (current-indentation))
>   (define ind (make-string i #\space))
>   (printf "~a~s ~s -> ...\n" ind name args)
>   (let ((x (parameterize ((current-indentation (+ i 1)))
>              (apply f args))))
>     (printf "~a~s ~s -> ~s\n" ind name args x)
>     x))
> 
> (define-syntax unquote
>   (syntax-rules () 
>     ((_ f) (trace 'f f))))
> 
> So if we want to see e.g. how a fib function works, we just add some
> commas:
> 
> (define (fib n)
>   (if (< n 2) 
>       n 
>       (+ (,fib (- n 1))
>          (,fib (- n 2)))))
> 
> However, now that we bound unquote, we can no longer use it in
> quasiquote:
> 
> > `(one plus two equals ,(+ 1 2))
> (one plus two equals (unquote (+ 1 2)))
> 
> This is because quasiquote expects to get the particular identifier that
> is defined in #%kernel (and bound to an error-raising syntax), and we
> just shadowed it.
> 
> (Yes, I know I can get `,'-like prefix functionality with custom
> readtables. This is just an example.)
> 
> Another problem is this:
> 
> > (require (lib "contract.ss") (lib "list.ss" "srfi" "1"))
> stdin::0: require: duplicate import identifier at: any 
> in: (require (lib "contract.ss") (lib "list.ss" "srfi" "1"))
> 
> Now, contract.ss doesn't really provide a function or macro with the
> name "any", but "any" is a special literal identifier used in contract
> expressions, and the identifier is bound (again to an error-raising
> syntax) in the module so that we need to use the "any" in contract.ss,
> not just any old "any".
> 
> Even if contract.ss didn't define "any" by itself, then the (usually
> unbound) top-level identifier named "any" would be used as the literal,
> and importing srfi-1 would shadow that. In any case, we have to rename
> one "any" or the other if we want to use contracts and srfi-1 in the
> same scope.
> 
> Now, this is pretty much the same sort of a problem as the one I
> mentioned recently when wishing for multiple namespaces: even if we have
> two disjoint syntactic contexts, we cannot use the same identifier in
> both places for different purposes. But this is an even simpler case: we
> don't need a context-specific namespace here, we just want to be able to
> use special keywords in macros without having to care about the bindings
> in the current scope.
> 
> To my mind, the solution seems to be obvious: forget about
> identifier scopes completely and just look at the raw syntax of
> the expression.  That is, don't compare with module-identifier=?,
> but simply with
> 
> (lambda (i1 i2)
>   (eq? (syntax-object->datum i1) (syntax-object->datum i2)))
> 
> Clearly this is considered "ungraceful" in the manual, but I can't
> immediately see why. What would this break?

There are two main ways that people use "recognizable identifiers" in
macros.

The first is as one of many different labels for subforms. For
example, in the unit system, a compound unit expression has import,
link, and export subforms, and they're labeled with the identifiers
"import", "link", and "export". The labels are just labels, though.
You can't put another expression in that position that expands into
an "export" clause.

The second use is for things like "public", "init", "field", etc in
the class system. The class macro expands its subterms until it
recognizes basic expressions, definitions, or these special forms. In
that case, "public" clauses can appear in the same contexts as any
other expression.

My rule of thumb: for "just labels" (the first use), use
literal-identifier=? and don't define the identifier as syntax. When
a subterm of a macro's input can be *either* a normal
expression/definition *or* a new tagged thing (the second use above),
define the keyword as an error-raising macro and use
module-identifier=? for comparison. In that case you will frequently
also call local-expand with that identifier in the stop list.
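
The second case can be sketched roughly as follows. This is only an
illustration: the module keyword-demo, the form my-begin, and the clause
keyword note are hypothetical names, not part of any library. It relies
on syntax-rules comparing its literals by binding, as the manual passage
quoted above describes.

```scheme
(module keyword-demo mzscheme
  (provide my-begin note demo shadowed)

  ;; `note` (hypothetical) has a real binding, so it can be renamed on
  ;; import and a clashing import is reported. Used by itself it is an error.
  (define-syntax (note stx)
    (raise-syntax-error #f "only valid as a clause of my-begin" stx))

  ;; `my-begin` (hypothetical) tags each clause. The literal `note` in the
  ;; literals list is matched by binding, not by name.
  (define-syntax my-begin
    (syntax-rules (note)
      ((_) '())
      ((_ (note text) rest ...)
       (cons (list 'note text) (my-begin rest ...)))
      ((_ e rest ...)
       (cons (list 'expr e) (my-begin rest ...)))))

  ;; Recognized as a clause: ((note "hi") (expr 3))
  (define demo (my-begin (note "hi") (+ 1 2)))

  ;; A local binding of `note` is a different identifier, so the clause is
  ;; treated as an ordinary expression: ((expr ("hi")))
  (define shadowed
    (let ((note list))
      (my-begin (note "hi")))))
```

The shadowed case is the point of binding-based comparison: a user's own
note is never silently reinterpreted as the macro's keyword.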

By that rule, Scheme's "unquote" needs to be defined as syntax, and
you should not rebind it. Also, "else" should have been defined as
syntax, too, because it occupies the same context as other
expressions. The contract system's "any", though, is perfect the way
it is; it prevents you from defining your own "any" contract that
never gets used because the contract system magically interprets
"any" as something else. Defining "any" makes semantic collisions
more detectable (collision errors, Check Syntax errors).

By the way, you can now use proper keywords (#:something, the new
symbol-like datatype) for the first purpose, since they were added to
mzscheme a couple of versions ago. I haven't seen any examples of
macros written that way yet, but using #: might help serve as a
reminder of which case you're in.
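
A minimal sketch of what such a macro might look like. It assumes the
pattern matcher treats a keyword in a pattern as a constant datum (as the
syntax-case pattern grammar documented for current Racket does); whether
every mzscheme release with keywords accepts this has not been checked.
The module label-demo, the form connect, and the labels #:from and #:to
are hypothetical.

```scheme
(module label-demo mzscheme
  (provide connect demo)

  ;; #:from and #:to are keyword data, not identifiers, so they have no
  ;; binding that could be shadowed or could collide with an import.
  (define-syntax connect
    (syntax-rules ()
      ((_ #:from a #:to b) (list a '-> b))))

  ;; (x -> y)
  (define demo (connect #:from 'x #:to 'y)))
```

Because the labels are not identifiers, the literals list stays empty and
the question of which identifier comparison to use never comes up.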

My $.02.

Ryan



Posted on the users mailing list.