[racket] syntax-parse, macros, and literal-sets

From: Carl Eastlund (cce at ccs.neu.edu)
Date: Thu May 30 12:43:36 EDT 2013

On Thu, May 30, 2013 at 12:25 PM, Eric Dobson <eric.n.dobson at gmail.com> wrote:

> Why do literal-sets have to be unhygienic? I expected them to work
> like literal lists which you say are hygienic. Also literal-sets are
> not documented as being unhygienic, which was the confusing part.
>

Literal sets are hygienic in the informal sense of "follow a disciplined
notion of binding structure".  They are not hygienic in the formal sense of
Scheme and Racket expansion wherein names derive their scope from where
they appear in the source.  Specifically, when you use a literal set, you
do not write down the names in the set, yet they are scoped as if you wrote
them right there in your program.  Which is what you want -- to use those
names in the current scope -- but is not hygienic, by the strict
definition.  Anything where you "store" names and pull them out later --
module exports, unit signatures, literal sets, and other similar tools --
is necessarily unhygienic.  Using them in surface syntax is relatively
straightforward; using them in macros is tricky, because these tools have
to introduce the stored names into some lexical context for you to use, and
with a macro that could be either the macro's context or the macro user's
context.
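
To make the "stored names" point concrete, here is a minimal sketch of a
literal set used directly in surface syntax. The names arith-lits and plus?
are made up for illustration; only define-literal-set, syntax-parse, and
#:literal-sets come from syntax/parse.

```racket
#lang racket
(require syntax/parse)

;; The literal set "stores" the names + and *.
(define-literal-set arith-lits (+ *))

;; Hypothetical helper. The pattern below has no #:literals clause that
;; writes + down; the name is pulled out of arith-lits and scoped as if
;; it had been written right here.
(define (plus? stx)
  (syntax-parse stx
    #:literal-sets (arith-lits)
    [+ #t]    ; + is a literal because arith-lits contains it
    [_ #f]))

(plus? #'+) ; => #t
(plus? #'-) ; => #f
```

Nothing here involves a macro, so there is only one lexical context the
stored names could land in, and the result is unsurprising.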


> In my example 2 both the literals in the literal-list and the ids used
> to match the syntax have the same marks, so it is very unintuitive
> that the lexical context of the literal-set name should matter. Also,
> can literal sets be used to abstract over identifiers with different
> lexical contexts? Intuitively, since other forms can do that, I would
> expect that they should be able to, but given that it's unhygienic I'm
> not so sure now.
>

Because literal-sets can be used in any context, the names used in the
literal set are used in the context of the reference to the literal set.
So if a macro injects the name of the literal set into a syntax-parse
expression, the macro needs to determine whether that literal set is used
in the macro's context or the macro user's context.  Otherwise it could be
difficult to use the names in the macro itself.  Hence the #:at option,
which lets macros specify the context "at" which the literal set is used,
if it's different from the context naming the literal set.

The short, short version: use the #:at option if your macro uses literal
sets "on behalf of" the user.

The syntax-parse documentation could probably be more explicit about how
literals in literal sets derive their context; I do not envy Ryan the task
of writing that up, because I'm not sure how to make it 100% clear myself,
as my ramblings above may show.


> On Thu, May 30, 2013 at 3:49 AM, Ryan Culpepper <ryanc at ccs.neu.edu> wrote:
> > On 05/29/2013 03:30 AM, Eric Dobson wrote:
> >>
> >> I was writing a macro that generated a literal-set and ran into some
> >> confusing behavior which I have distilled to the following program.
> >>
> >> #lang racket
> >> (require syntax/parse
> >>           (for-syntax syntax/parse))
> >>
> >>
> >> (define-syntax (define-ls1 stx)
> >>    (syntax-parse stx
> >>      ((_ name:id (lit:id ...))
> >>       #'(define-literal-set name (lit ...)))))
> >>
> >> (define-ls1 ls1 (+))
> >> (define-syntax-class sc1
> >>    #:literal-sets (ls1)
> >>    (pattern +))
> >>
> >> (for/list ((x (list #'+ #'*)))
> >>    (syntax-parse x
> >>      (x:sc1 #t)
> >>      (_ #f)))
> >>
> >>
> >> (define-syntax (define-sc2 stx)
> >>    (syntax-parse stx
> >>      ((_ name:id (lit:id ...))
> >>       #'(begin
> >>           (define-literal-set inner (lit ...))
> >>           (define-syntax-class name
> >>             #:literal-sets (inner)
> >>             (pattern lit) ...)))))
> >>
> >> (define-sc2 sc2 (+))
> >> (for/list ((x (list #'+ #'*)))
> >>    (syntax-parse x
> >>      (x:sc2 #t)
> >>      (_ #f)))
> >>
> >> (define-syntax (define-sc3 stx)
> >>    (syntax-parse stx
> >>      ((_ name:id inner:id (lit:id ...))
> >>       #'(begin
> >>           (define-literal-set inner (lit ...))
> >>           (define-syntax-class name
> >>             #:literal-sets (inner)
> >>             (pattern lit) ...)))))
> >>
> >> (define-sc3 sc3 inner3 (+))
> >> (for/list ((x (list #'+ #'*)))
> >>    (syntax-parse x
> >>      (x:sc3 #t)
> >>      (_ #f)))
> >>
> >>
> >> (define-syntax (define-sc4 stx)
> >>    (syntax-parse stx
> >>      ((_ name:id (lit:id ...))
> >>       #'(begin
> >>           (define-literal-set inner (lit ...))
> >>           (define-syntax-class name
> >>             #:literal-sets ((inner #:at name))
> >>             (pattern lit) ...)))))
> >>
> >> (define-sc4 sc4 (+))
> >> (for/list ((x (list #'+ #'*)))
> >>    (syntax-parse x
> >>      (x:sc4 #t)
> >>      (_ #f)))
> >>
> >> This produces the output:
> >> '(#t #f)
> >> '(#t #t)
> >> '(#t #f)
> >> '(#t #f)
> >>
> >> I would have expected the second one to return '(#t #f) like the first
> >> but it doesn't.
> >
> >
> > The issue is how syntax-parse decides whether an identifier in a
> > pattern is a pattern variable or a literal. Let's take the simple
> > case, where we have just a literals list. From the standpoint of
> > hygiene, the literals list is "binding-like" because it sets the
> > interpretation for "references" in the patterns. That means the
> > relevant comparison is bound-identifier=? (like all binding forms),
> > and that means that at a minimum if you want an identifier in a
> > pattern to be considered a literal, it must have the same marks as
> > the corresponding identifier in the literals list. Consider the
> > following program:
> >
> > (define-syntax-rule
> >   (define-syntax-rule/arith (macro . pattern) template)
> >   (define-syntax macro
> >     (syntax-rules (+ - * /)
> >       [(macro . pattern) template])))
> >
> > (define-syntax-rule/arith (sum-left-arg (+ x y)) x)
> >
> > One might expect sum-left-arg to raise a syntax error if given a term
> > with something other than + in operator position. But in fact:
> >
> > (sum-left-arg (* 1 2))
> > ;; => 1
> >
> > The reason is that the expansion of define-syntax-rule/arith puts
> > marks on the + in the literals list. So only +'s with the same mark in
> > the pattern are considered literals; all others are pattern variables.
> > In particular, the + in the pattern in the definition of sum-left-arg
> > is unmarked, so it's a pattern variable.
> >
> > Now back to literal sets. A #:literal-sets clause is essentially like
> > an unhygienic binding form. The identifiers it "binds" must be given
> > some lexical context; it defaults to the lexical context of the
> > literal set name. In define-sc2 from your example, that is inner,
> > which is introduced by the macro and thus has a mark on it. So it's
> > just like you had a literals list consisting of a marked +. But the +
> > in the pattern is unmarked, since it comes from the original program
> > (via the lit pattern variable). They don't match, so the identifier in
> > the pattern is interpreted as a pattern variable.
> >
> >
> >> The third and fourth are ways to get it to, but I'm not sure why they
> >> work. The third seems to show that the meaning of how a literal set
> >> binds variables depends on the context of the literal-set name and
> >> not the actual literals.
> >
> >
> > In define-sc3, inner also comes from the original program, so it's
> > unmarked, so the literals consist of an unmarked +, which matches the
> > + in the pattern.
> >
> >
> >> The fourth way uses #:at, which is documented as being useful to
> >> macros, but not much else. Can someone explain how literal-sets are
> >> supposed to determine whether or not a variable in the pattern is
> >> treated as a literal or as a pattern variable?
> >
> >
> > An #:at clause changes the lexical context given to literals. In
> > define-sc4, inner is back to being marked, but that doesn't matter,
> > because the literals are given the lexical context of the syntax bound
> > to name. In the example, that's sc4, which comes from the macro use
> > and is thus unmarked.
> >
> > All unhygienic binding forms have similar difficulties. Macros that
> > expand into require forms are another common source of confusion.
> >
> > Ryan
> >
> ____________________
>   Racket Users list:
>   http://lists.racket-lang.org/users
>
>

Posted on the users mailing list.