[racket] syntax-parse, macros, and literal-sets

From: Ryan Culpepper (ryanc at ccs.neu.edu)
Date: Fri May 31 14:47:03 EDT 2013

This might answer some of your questions: 
http://macrologist.blogspot.com/2011/09/syntax-parse-and-literals.html

Ryan


On 05/31/2013 02:30 AM, Eric Dobson wrote:
> Ok given the complexities of literal-sets and hygiene, I think I will
> avoid them if I can. The issue is that I cannot seem to get the same
> behaviour from literal lists as I do from literal-sets. Literal sets
> will give an error at expansion time if there is not a binding for a
> specified literal, but literal lists do not do this even when I
> specify a phase for the binding.
>
> #lang racket
>
> (require syntax/parse)
> (require (for-template (only-in racket/base list)))
>
> (syntax-parse #'list
>     #:literals ((list list #:phase -1))
>     (list #f))
>
> ;; Raises the error at runtime
> (syntax-parse #'list
>     #:literals ((map map #:phase -1))
>     (map #f))
>
> (define-literal-set literals1 #:for-template (list))
>
> ;; Gives the error at compile time if uncommented
> ;(define-literal-set literals2 #:for-template (map))
>
> I'm not sure I understand the reason for this limitation.
>
>
> On Thu, May 30, 2013 at 9:43 AM, Carl Eastlund <cce at ccs.neu.edu> wrote:
>> On Thu, May 30, 2013 at 12:25 PM, Eric Dobson <eric.n.dobson at gmail.com>
>> wrote:
>>>
>>> Why do literal-sets have to be unhygienic? I expected them to work
>>> like literal lists which you say are hygienic. Also literal-sets are
>>> not documented as being unhygienic, which was the confusing part.
>>
>>
>> Literal sets are hygienic in the informal sense of "follow a disciplined
>> notion of binding structure".  They are not hygienic in the formal sense of
>> Scheme and Racket expansion wherein names derive their scope from where they
>> appear in the source.  Specifically, when you use a literal set, you do not
>> write down the names in the set, yet they are scoped as if you wrote them
>> right there in your program.  Which is what you want -- to use those names
>> in the current scope -- but is not hygienic, by the strict definition.
>> Anything where you "store" names and pull them out later -- module exports,
>> unit signatures, literal sets, and other similar tools -- is necessarily
>> unhygienic.  Using them in surface syntax is relatively straightforward;
>> using them in macros is tricky, because these tools have to introduce the
>> stored names into some lexical context for you to use, and with a macro that
>> could be either the macro's context or the macro user's context.
>>
>>>
>>> In my example 2 both the literals in the literal-list and the ids used
>>> to match the syntax have the same marks, so it is very unintuitive
>>> that the lexical context of the literal-set name should matter. Also,
>>> can literal sets be used to abstract over identifiers with different
>>> lexical contexts? Intuitively, since other forms can do that, I would
>>> expect that they should be able to, but given that it's unhygienic I'm
>>> not so sure now.
>>
>>
>> Because literal-sets can be used in any context, the names used in the
>> literal set are used in the context of the reference to the literal set.  So
>> if a macro injects the name of the literal set to a syntax-parse expression,
>> the macro needs to determine whether that literal set is used in the macro's
>> context or the macro user's context.  Otherwise it could be difficult to use
>> the names in the macro itself.  Hence the #:at option, which lets macros
>> specify the context "at" which the literal set is used, if it's different
>> from the context naming the literal set.
>>
>> The short, short version: use the #:at option if your macro uses literal
>> sets "on behalf of" the user.
>>
>> The syntax-parse documentation could probably be more explicit about how
>> literals in literal sets derive their context; I do not envy Ryan the task
>> of writing that up, because I'm not sure how to make it 100% clear myself,
>> as my ramblings above may show.
>>
>>>
>>> On Thu, May 30, 2013 at 3:49 AM, Ryan Culpepper <ryanc at ccs.neu.edu> wrote:
>>>> On 05/29/2013 03:30 AM, Eric Dobson wrote:
>>>>>
>>>>> I was writing a macro that generated a literal-set and ran into some
>>>>> confusing behavior which I have distilled to the following program.
>>>>>
>>>>> #lang racket
>>>>> (require syntax/parse
>>>>>            (for-syntax syntax/parse))
>>>>>
>>>>>
>>>>> (define-syntax (define-ls1 stx)
>>>>>     (syntax-parse stx
>>>>>       ((_ name:id (lit:id ...))
>>>>>        #'(define-literal-set name (lit ...)))))
>>>>>
>>>>> (define-ls1 ls1 (+))
>>>>> (define-syntax-class sc1
>>>>>     #:literal-sets (ls1)
>>>>>     (pattern +))
>>>>>
>>>>> (for/list ((x (list #'+ #'*)))
>>>>>     (syntax-parse x
>>>>>       (x:sc1 #t)
>>>>>       (_ #f)))
>>>>>
>>>>>
>>>>> (define-syntax (define-sc2 stx)
>>>>>     (syntax-parse stx
>>>>>       ((_ name:id (lit:id ...))
>>>>>        #'(begin
>>>>>            (define-literal-set inner (lit ...))
>>>>>            (define-syntax-class name
>>>>>              #:literal-sets (inner)
>>>>>              (pattern lit) ...)))))
>>>>>
>>>>> (define-sc2 sc2 (+))
>>>>> (for/list ((x (list #'+ #'*)))
>>>>>     (syntax-parse x
>>>>>       (x:sc2 #t)
>>>>>       (_ #f)))
>>>>>
>>>>> (define-syntax (define-sc3 stx)
>>>>>     (syntax-parse stx
>>>>>       ((_ name:id inner:id (lit:id ...))
>>>>>        #'(begin
>>>>>            (define-literal-set inner (lit ...))
>>>>>            (define-syntax-class name
>>>>>              #:literal-sets (inner)
>>>>>              (pattern lit) ...)))))
>>>>>
>>>>> (define-sc3 sc3 inner3 (+))
>>>>> (for/list ((x (list #'+ #'*)))
>>>>>     (syntax-parse x
>>>>>       (x:sc3 #t)
>>>>>       (_ #f)))
>>>>>
>>>>>
>>>>> (define-syntax (define-sc4 stx)
>>>>>     (syntax-parse stx
>>>>>       ((_ name:id (lit:id ...))
>>>>>        #'(begin
>>>>>            (define-literal-set inner (lit ...))
>>>>>            (define-syntax-class name
>>>>>              #:literal-sets ((inner #:at name))
>>>>>              (pattern lit) ...)))))
>>>>>
>>>>> (define-sc4 sc4 (+))
>>>>> (for/list ((x (list #'+ #'*)))
>>>>>     (syntax-parse x
>>>>>       (x:sc4 #t)
>>>>>       (_ #f)))
>>>>>
>>>>> This produces the output:
>>>>> '(#t #f)
>>>>> '(#t #t)
>>>>> '(#t #f)
>>>>> '(#t #f)
>>>>>
>>>>> I would have expected the second one to return '(#t #f) like the first
>>>>> but it doesn't.
>>>>
>>>>
>>>> The issue is how syntax-parse decides whether an identifier in a pattern
>>>> is a pattern variable or a literal. Let's take the simple case, where we
>>>> have just a literals list. From the standpoint of hygiene, the literals
>>>> list is "binding-like" because it sets the interpretation for
>>>> "references" in the patterns. That means the relevant comparison is
>>>> bound-identifier=? (like all binding forms), and that means that at a
>>>> minimum if you want an identifier in a pattern to be considered a
>>>> literal, it must have the same marks as the corresponding identifier in
>>>> the literals list. Consider the following program:
>>>>
>>>> (define-syntax-rule
>>>>    (define-syntax-rule/arith (macro . pattern) template)
>>>>    (define-syntax macro
>>>>      (syntax-rules (+ - * /)
>>>>        [(macro . pattern) template])))
>>>>
>>>> (define-syntax-rule/arith (sum-left-arg (+ x y)) x)
>>>>
>>>> One might expect sum-left-arg to raise a syntax error if given a term
>>>> with something other than + in operator position. But in fact:
>>>>
>>>> (sum-left-arg (* 1 2))
>>>> ;; => 1
>>>>
>>>> The reason is that the expansion of define-syntax-rule/arith puts marks
>>>> on the + in the literals list. So only +'s with the same mark in the
>>>> pattern are considered literals; all others are pattern variables. In
>>>> particular, the + in the pattern in the definition of sum-left-arg is
>>>> unmarked, so it's a pattern variable.
>>>>
>>>> Now back to literal sets. A #:literal-sets clause is essentially like an
>>>> unhygienic binding form. The identifiers it "binds" must be given some
>>>> lexical context; it defaults to the lexical context of the literal set
>>>> name. In define-sc2 from your example, that is inner, which is introduced
>>>> by the macro and thus has a mark on it. So it's just as if you had a
>>>> literals list consisting of a marked +. But the + in the pattern is
>>>> unmarked, since it comes from the original program (via the lit pattern
>>>> variable). They don't match, so the identifier in the pattern is
>>>> interpreted as a pattern variable.
>>>>
>>>>
>>>>> The third and fourth are ways to get it to, but I'm not sure why they
>>>>> work. The third seems to show that the meaning of how a literal set
>>>>> binds variables depends on the context of the literal-set name and not
>>>>> the actual literals.
>>>>
>>>>
>>>> In define-sc3, inner also comes from the original program, so it's
>>>> unmarked, so the literals consist of an unmarked +, which matches the +
>>>> in the pattern.
>>>>
>>>>
>>>>> The fourth way uses #:at, which is documented as being useful to macros,
>>>>> but not much else. Can someone explain how literal-sets are supposed to
>>>>> determine whether a variable in the pattern is treated as a literal or
>>>>> as a pattern variable?
>>>>
>>>>
>>>> An #:at clause changes the lexical context given to literals. In
>>>> define-sc4, inner is back to being marked, but that doesn't matter,
>>>> because the literals are given the lexical context of the syntax bound to
>>>> name. In the example, that's sc4, which comes from the macro use and is
>>>> thus unmarked.
>>>>
>>>> All unhygienic binding forms have similar difficulties. Macros that
>>>> expand into require forms are another common source of confusion.
>>>>
>>>> Ryan
>>>>
>>> ____________________
>>>    Racket Users list:
>>>    http://lists.racket-lang.org/users
>>>
>>

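The distinction the thread keeps returning to (same binding, different
marks) can be seen directly with the two identifier comparisons. The
following is only a minimal sketch and not code from the thread: the macro
name compare-with-plus and the variable result are hypothetical, chosen for
illustration. It compares a + written at the macro use site with a + written
inside the macro's own transformer. Both refer to the same binding, so
free-identifier=? should accept them, but they differ in marks, so
bound-identifier=? (the comparison described above for literals lists)
should reject them.

```racket
#lang racket
(require (for-syntax racket/base))

;; compare-with-plus is a hypothetical helper for illustration only.
;; It compares the identifier supplied at the use site with a + that is
;; written inside the transformer itself.
(define-syntax (compare-with-plus stx)
  (syntax-case stx ()
    [(_ id)
     (with-syntax ([free?  (free-identifier=?  #'id #'+)]
                   [bound? (bound-identifier=? #'id #'+)])
       ;; Quote the two booleans so they are embedded as plain data.
       #'(list 'free? 'bound?))]))

;; First element: do the two identifiers refer to the same binding?
;; Second element: would one bind the other (same name and same marks)?
(define result (compare-with-plus +))
result
```

Under these assumptions the first element reflects the shared binding of +
and the second reflects the mark introduced by macro expansion, which is the
same mismatch that turns the + in define-sc2's pattern into a pattern
variable.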

Posted on the users mailing list.