[plt-scheme] PLT lexer generator

From: Scott Owens (sowens at cs.utah.edu)
Date: Thu Feb 20 17:03:48 EST 2003

I have changed the regular expression syntax of the lexer generator in 
a way that will break most existing lexers.  I'm sorry about this, but 
I will sleep better now that a major design flaw in the syntax has been 
repaired.

Previously the regular expression grammar had productions that said a 
symbol was a regular expression and (symbol) was also a regular 
expression.  A symbol would match the sequence of characters in the 
symbol, while (symbol) expanded to the named regular expression 
abbreviation.  This design was intended to reduce the number of quote 
marks by allowing r.e.s like (: + = - *) instead of (: "+" "=" "-" "*") 
or (: #\+ #\= #\- #\*).  In practice it gave rise to two insidious 
classes of bugs.  The worse one arose when someone meant to use an 
abbreviation but forgot the parentheses.  Intending to write (comment) 
to match a comment, they would instead write comment, which would match 
the string "comment".  The less severe complication was that read might 
alter the case of the symbol.  Unless read was explicitly made 
case-sensitive in the right places, writing Help would actually match 
the string "help".

To fix these problems, I removed (symbol) from regular expressions and 
made a bare symbol expand to the lex abbreviation of that name.  Thus 
any existing lexer that uses abbreviations must have the parentheses 
removed from the uses of those abbreviations, and any lexer that uses 
symbols to match sequences of characters must replace those symbols 
with strings.
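
As a rough illustration of the migration, here are some r.e. fragments 
before and after the change.  These are fragments only, not a complete 
lexer, and the abbreviation name comment is a hypothetical one assumed 
to be defined elsewhere.

```scheme
;; Old syntax (before this change)
(: + = - *)          ; bare symbols matched their own characters
(comment)            ; parenthesized symbol expanded the abbreviation comment
comment              ; bare symbol matched the string "comment"

;; New syntax (after this change)
(: "+" "=" "-" "*")  ; literal characters written as strings
(: #\+ #\= #\- #\*)  ; or as character literals
comment              ; bare symbol now expands the abbreviation comment
"comment"            ; a string matches the literal text "comment"
"Help"               ; strings keep their case, so this matches "Help"
```

Since strings are never case-folded by read, writing literals as 
strings also avoids the Help versus "help" problem.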

If you have any problems or questions regarding this change, do not 
hesitate to write me.

-Scott



Posted on the users mailing list.