[racket] syntax-parse, macros, and literal-sets

From: Eric Dobson (eric.n.dobson at gmail.com)
Date: Fri May 31 02:30:47 EDT 2013

Ok, given the complexities of literal-sets and hygiene, I think I will
avoid them if I can. The issue is that I cannot seem to get the same
behaviour from literal lists as I do from literal-sets. Literal sets
give an error at expansion time if there is no binding for a
specified literal, but literal lists do not do this even when I
specify a phase for the binding.

#lang racket

(require syntax/parse)
(require (for-template (only-in racket/base list)))

(syntax-parse #'list
   #:literals ((list list #:phase -1))
   (list #f))

;; Raises the unbound-literal error only at runtime
(syntax-parse #'list
   #:literals ((map map #:phase -1))
   (map #f))

(define-literal-set literals1 #:for-template (list))

;; Gives the error at compile time if uncommented
;(define-literal-set literals2 #:for-template (map))

I'm not sure I understand the reason for this limitation.


On Thu, May 30, 2013 at 9:43 AM, Carl Eastlund <cce at ccs.neu.edu> wrote:
> On Thu, May 30, 2013 at 12:25 PM, Eric Dobson <eric.n.dobson at gmail.com>
> wrote:
>>
>> Why do literal-sets have to be unhygienic? I expected them to work
>> like literal lists, which you say are hygienic. Also, literal-sets
>> are not documented as being unhygienic, which was the confusing
>> part.
>
>
> Literal sets are hygienic in the informal sense of "follow a disciplined
> notion of binding structure".  They are not hygienic in the formal sense of
> Scheme and Racket expansion wherein names derive their scope from where they
> appear in the source.  Specifically, when you use a literal set, you do not
> write down the names in the set, yet they are scoped as if you wrote them
> right there in your program.  Which is what you want -- to use those names
> in the current scope -- but is not hygienic, by the strict definition.
> Anything where you "store" names and pull them out later -- module exports,
> unit signatures, literal sets, and other similar tools -- is necessarily
> unhygienic.  Using them in surface syntax is relatively straightforward;
> using them in macros is tricky, because these tools have to introduce the
> stored names into some lexical context for you to use, and with a macro that
> could be either the macro's context or the macro user's context.
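>
> To make that concrete, here is a minimal sketch (the names arith-lits
> and classify are made up for illustration): the set "stores" + and *,
> and a later use matches them as if they appeared in a literals list
> written right at the use site:
>
> ```racket
> #lang racket
> (require syntax/parse)
>
> ;; "Store" the names + and * in a literal set.
> (define-literal-set arith-lits (+ *))
>
> ;; At the use site, the stored names are scoped as if written here.
> (define (classify stx)
>   (syntax-parse stx
>     #:literal-sets (arith-lits)
>     [(+ a b) 'sum]
>     [(* a b) 'product]
>     [_ 'other]))
>
> (classify #'(+ 1 2)) ; => 'sum
> (classify #'(- 1 2)) ; => 'other
> ```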
>
>>
>> In my example 2, both the literals in the literal-list and the ids
>> used to match the syntax have the same marks, so it is very
>> unintuitive that the lexical context of the literal-set name should
>> matter. Also, can literal sets be used to abstract over identifiers
>> with different lexical contexts? Intuitively, since other forms can
>> do that, I would expect that they should be able to, but given that
>> they're unhygienic I'm not so sure now.
>
>
> Because literal-sets can be used in any context, the names in the
> literal set take on the context of the reference to the literal set.
> So if a macro injects the name of the literal set into a
> syntax-parse expression, the macro needs to determine whether that
> literal set is used in the macro's context or the macro user's
> context. Otherwise it could be difficult to use the names in the
> macro itself. Hence the #:at option, which lets macros specify the
> context "at" which the literal set is used, if it's different from
> the context naming the literal set.
>
> The short, short version: use the #:at option if your macro uses literal
> sets "on behalf of" the user.
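>
> A minimal sketch of that advice (arith-lits and define-arith-matcher
> are invented names): the match clauses come from the macro's user, so
> the literal set is anchored at an identifier from the user's context:
>
> ```racket
> #lang racket
> (require syntax/parse (for-syntax syntax/parse))
>
> (define-literal-set arith-lits (+ *))
>
> (define-syntax (define-arith-matcher stx)
>   (syntax-parse stx
>     [(_ name clause ...)
>      ;; clause ... is written by the user, so anchor the literal
>      ;; set at name, which also comes from the user.
>      #'(define (name x)
>          (syntax-parse x
>            #:literal-sets ((arith-lits #:at name))
>            clause ...))]))
>
> (define-arith-matcher classify [(+ a b) 'sum] [_ 'other])
> (classify #'(+ 1 2)) ; => 'sum
> ```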
>
> The syntax-parse documentation could probably be more explicit about how
> literals in literal sets derive their context; I do not envy Ryan the task
> of writing that up, because I'm not sure how to make it 100% clear myself,
> as my ramblings above may show.
>
>>
>> On Thu, May 30, 2013 at 3:49 AM, Ryan Culpepper <ryanc at ccs.neu.edu> wrote:
>> > On 05/29/2013 03:30 AM, Eric Dobson wrote:
>> >>
>> >> I was writing a macro that generated a literal-set and ran into some
>> >> confusing behavior which I have distilled to the following program.
>> >>
>> >> #lang racket
>> >> (require syntax/parse
>> >>           (for-syntax syntax/parse))
>> >>
>> >>
>> >> (define-syntax (define-ls1 stx)
>> >>    (syntax-parse stx
>> >>      ((_ name:id (lit:id ...))
>> >>       #'(define-literal-set name (lit ...)))))
>> >>
>> >> (define-ls1 ls1 (+))
>> >> (define-syntax-class sc1
>> >>    #:literal-sets (ls1)
>> >>    (pattern +))
>> >>
>> >> (for/list ((x (list #'+ #'*)))
>> >>    (syntax-parse x
>> >>      (x:sc1 #t)
>> >>      (_ #f)))
>> >>
>> >>
>> >> (define-syntax (define-sc2 stx)
>> >>    (syntax-parse stx
>> >>      ((_ name:id (lit:id ...))
>> >>       #'(begin
>> >>           (define-literal-set inner (lit ...))
>> >>           (define-syntax-class name
>> >>             #:literal-sets (inner)
>> >>             (pattern lit) ...)))))
>> >>
>> >> (define-sc2 sc2 (+))
>> >> (for/list ((x (list #'+ #'*)))
>> >>    (syntax-parse x
>> >>      (x:sc2 #t)
>> >>      (_ #f)))
>> >>
>> >> (define-syntax (define-sc3 stx)
>> >>    (syntax-parse stx
>> >>      ((_ name:id inner:id (lit:id ...))
>> >>       #'(begin
>> >>           (define-literal-set inner (lit ...))
>> >>           (define-syntax-class name
>> >>             #:literal-sets (inner)
>> >>             (pattern lit) ...)))))
>> >>
>> >> (define-sc3 sc3 inner3 (+))
>> >> (for/list ((x (list #'+ #'*)))
>> >>    (syntax-parse x
>> >>      (x:sc3 #t)
>> >>      (_ #f)))
>> >>
>> >>
>> >> (define-syntax (define-sc4 stx)
>> >>    (syntax-parse stx
>> >>      ((_ name:id (lit:id ...))
>> >>       #'(begin
>> >>           (define-literal-set inner (lit ...))
>> >>           (define-syntax-class name
>> >>             #:literal-sets ((inner #:at name))
>> >>             (pattern lit) ...)))))
>> >>
>> >> (define-sc4 sc4 (+))
>> >> (for/list ((x (list #'+ #'*)))
>> >>    (syntax-parse x
>> >>      (x:sc4 #t)
>> >>      (_ #f)))
>> >>
>> >> This produces the output:
>> >> '(#t #f)
>> >> '(#t #t)
>> >> '(#t #f)
>> >> '(#t #f)
>> >>
>> >> I would have expected the second one to return '(#t #f) like the first
>> >> but it doesn't.
>> >
>> >
>> > The issue is how syntax-parse decides whether an identifier in a
>> > pattern is a pattern variable or a literal. Let's take the simple
>> > case, where we have just a literals list. From the standpoint of
>> > hygiene, the literals list is "binding-like" because it sets the
>> > interpretation for "references" in the patterns. That means the
>> > relevant comparison is bound-identifier=? (like all binding
>> > forms), and that means that, at a minimum, if you want an
>> > identifier in a pattern to be considered a literal, it must have
>> > the same marks as the corresponding identifier in the literals
>> > list. Consider the following program:
>> >
>> > (define-syntax-rule
>> >   (define-syntax-rule/arith (macro . pattern) template)
>> >   (define-syntax macro
>> >     (syntax-rules (+ - * /)
>> >       [(macro . pattern) template])))
>> >
>> > (define-syntax-rule/arith (sum-left-arg (+ x y)) x)
>> >
>> > One might expect sum-left-arg to raise a syntax error if given a
>> > term with something other than + in operator position. But in
>> > fact:
>> >
>> > (sum-left-arg (* 1 2))
>> > ;; => 1
>> >
>> > The reason is that the expansion of define-syntax-rule/arith puts
>> > marks on the + in the literals list. So only +'s with the same
>> > mark in the pattern are considered literals; all others are
>> > pattern variables. In particular, the + in the pattern in the
>> > definition of sum-left-arg is unmarked, so it's a pattern
>> > variable.
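>> >
>> > For contrast, a sketch of a version that does keep the +'s in
>> > sync (define-syntax-rule/lit is an invented name): the literal is
>> > passed in from the use site, so it carries the same marks as the
>> > + in the user's pattern:
>> >
>> > ```racket
>> > #lang racket
>> >
>> > ;; The literals come from the use site rather than from this
>> > ;; macro's template, so they carry the use site's marks.
>> > (define-syntax-rule
>> >   (define-syntax-rule/lit (lit ...) (macro . pattern) template)
>> >   (define-syntax macro
>> >     (syntax-rules (lit ...)
>> >       [(macro . pattern) template])))
>> >
>> > (define-syntax-rule/lit (+) (sum-left-arg (+ x y)) x)
>> >
>> > (sum-left-arg (+ 1 2)) ; => 1
>> > ;; (sum-left-arg (* 1 2)) is now a syntax error, since the
>> > ;; unmarked + in the pattern really is treated as a literal.
>> > ```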
>> >
>> > Now back to literal sets. A #:literal-sets clause is essentially
>> > like an unhygienic binding form. The identifiers it "binds" must
>> > be given some lexical context; this defaults to the lexical
>> > context of the literal-set name. In define-sc2 from your example,
>> > that is inner, which is introduced by the macro and thus has a
>> > mark on it. So it's just as if you had a literals list consisting
>> > of a marked +. But the + in the pattern is unmarked, since it
>> > comes from the original program (via the lit pattern variable).
>> > They don't match, so the identifier in the pattern is interpreted
>> > as a pattern variable.
>> >
>> >
>> >> The third and fourth are ways to get it to, but I'm not sure why
>> >> they work. The third seems to show that how a literal set binds
>> >> variables depends on the context of the literal-set name and not
>> >> on the actual literals.
>> >
>> >
>> > In define-sc3, inner also comes from the original program, so
>> > it's unmarked, so the literals consist of an unmarked +, which
>> > matches the + in the pattern.
>> >
>> >
>> >> The fourth way uses #:at, which is documented as being useful to
>> >> macros, but not much else. Can someone explain how literal-sets
>> >> are supposed to determine whether a variable in the pattern is
>> >> treated as a literal or as a pattern variable?
>> >
>> >
>> > An #:at clause changes the lexical context given to the literals.
>> > In define-sc4, inner is back to being marked, but that doesn't
>> > matter, because the literals are given the lexical context of the
>> > syntax bound to name. In the example, that's sc4, which comes
>> > from the macro use and is thus unmarked.
>> >
>> > All unhygienic binding forms have similar difficulties. Macros
>> > that expand into require forms are another common source of
>> > confusion.
>> >
>> > Ryan
>> >
>> ____________________
>>   Racket Users list:
>>   http://lists.racket-lang.org/users
>>
>

Posted on the users mailing list.