[racket-dev] wrong syntax position and span when using unicode lambda character?

From: Stephen Chang (stchang at ccs.neu.edu)
Date: Mon Jun 11 19:57:28 EDT 2012

I have a drracket plugin that reads the contents of the definitions
window as syntax and then traverses that syntax object so that for
each subexpression e it prints
1) (syntax->datum e), and
2) the contents of the definitions window starting at position
(syntax-position e) and ending at (syntax-position e) + (syntax-span
e)
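
To make the procedure concrete, here is a minimal sketch of that traversal. It is not the plugin itself: it assumes the program text is held in a plain string rather than in the definitions window, it omits the #lang line, and the names read-all-syntax, source-text, and report are hypothetical.

```racket
#lang racket/base

;; Read every top-level form from a string as a syntax object.
(define (read-all-syntax str)
  (define in (open-input-string str))
  ;; Enable character-based position counting on this port
  ;; (ports count positions in bytes unless this is enabled).
  (port-count-lines! in)
  (let loop ([acc '()])
    (define stx (read-syntax 'example in))
    (if (eof-object? stx)
        (reverse acc)
        (loop (cons stx acc)))))

;; Slice the source text covered by a syntax object. syntax-position
;; counts from 1 while substring counts from 0, hence the sub1.
(define (source-text str stx)
  (define start (sub1 (syntax-position stx)))
  (substring str start (+ start (syntax-span stx))))

;; Print the datum and the matching source text, then recur into
;; subexpressions when the syntax object is a list.
(define (report str stx)
  (printf "~s | ~a\n" (syntax->datum stx) (source-text str stx))
  (define parts (syntax->list stx))
  (when parts
    (for-each (lambda (s) (report str s)) parts)))

(define src "(λ (x) x)")
(for-each (lambda (s) (report src s)) (read-all-syntax src))
```

If the positions and spans are consistent with the text, each printed line should show the same expression on both sides of the bar.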

I would expect each subexpression in the program to be printed twice,
and the plugin generally works as expected, except when I use the
Unicode lambda character (λ).


For example, if I have the program:

#lang racket (add1 10)

the (add1 10) syntax object correctly has position 14 and span 9 and
the add1 syntax object correctly has position 15 and span 4.


But for the program:

#lang racket (λ (x) x)

the (λ (x) x) syntax object has position 14 but span 11 (2 more than
expected) and the λ syntax object has position 17 (again, 2 more than
expected).
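
The expected values can be checked by counting characters. This is only a sketch of the arithmetic, assuming the program is exactly the text shown above on a single line; the variable names are hypothetical.

```racket
#lang racket/base

(define prefix "#lang racket ")
(define expr "(λ (x) x)")

;; Positions here count from 1, which matches the position 14 reported
;; for (add1 10) in the first program: 13 prefix characters, plus 1.
(define expected-expr-position (add1 (string-length prefix)))
;; The span is the number of characters in the expression.
(define expected-expr-span (string-length expr))
;; λ directly follows the opening parenthesis.
(define expected-lambda-position (add1 expected-expr-position))

(printf "expression: position ~a, span ~a\n"
        expected-expr-position expected-expr-span)
(printf "λ: position ~a\n" expected-lambda-position)
```

This gives position 14 and span 9 for the whole expression and position 15 for λ, the values I expected instead of 11 and 17.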


Has anyone seen this behavior before?


I'll try to distill my plugin to get something reproducible if it's
not obvious what's going on.

(I think that syntax positions start at 1 and editor positions start
at 0, but I account for that, so that's not the cause of the problem.)

(I'm also aware that evaluating (syntax-span #'(λ (x) x)) in the REPL
correctly produces 9, so I don't know what's going on.)


Posted on the dev mailing list.