[plt-scheme] Delimiting tokens
From: Lauri Alanko (la at iki.fi)
Date: Mon Jan 30 14:58:00 EST 2006

I'm working on a parser whose input includes s-expressions. I'm using
the standard reader for those. However, there is a slight problem.
Suppose that ~ is an escape character that indicates that what follows
is an s-expression. Then the following works ok:
    foo bar baz ~(some-expression) quux quuux
    foo bar baz ~some-identifier quux quuux
    foo bar baz~(some-expression)quux quuux
However, if the s-expression is non-self-delimiting, e.g. an identifier,
and the following character in the input is also acceptable to the
reader, then things go wrong:
    foo bar baz~some-identifierquux quuux
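To make the failure concrete, here is a small sketch using string ports. It assumes only the standard `read` and `open-input-string`; the input strings are the text following the ~ in the examples above, and the port names are made up for illustration.

```scheme
;; Text that follows the ~ escape in the two cases.
(define self-delimiting (open-input-string "(some-expression)quux quuux"))
(define run-together    (open-input-string "some-identifierquux quuux"))

;; The closing parenthesis ends the datum, so the reader stops in the right place.
(read self-delimiting)        ; => (some-expression)
(read-char self-delimiting)   ; => #\q, the first character of "quux"

;; An identifier ends only at a delimiter, so the reader swallows "quux" too.
(read run-together)           ; => some-identifierquux
(read-char run-together)      ; => #\space
```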
Now, I could of course alter the syntax so that there is a terminator
character after the s-expression that gets consumed before the parser
proceeds normally. A semicolon (;) is a good choice, since the reader
will definitely stop on it:
    foo bar baz~some-identifier;quux quuux
However, this makes things ugly and verbose, and for consistency the ;
would have to be there even when the s-expression is self-delimiting
(which is the usual case).
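A minimal sketch of the terminator approach, assuming the parser hands over the port right after consuming the ~. The helper name `read-escaped/terminator` is hypothetical, and the terminator is treated as mandatory, as described above.

```scheme
;; Hypothetical helper: read one datum, then require and consume a ";".
(define (read-escaped/terminator in)
  (let* ((datum (read in))        ; stops in front of the ";" delimiter
         (next  (read-char in)))  ; consume the terminator itself
    (if (and (char? next) (char=? next #\;))
        datum
        (error "expected ; after escaped s-expression"))))

(define in (open-input-string "some-identifier;quux quuux"))
(read-escaped/terminator in)   ; => some-identifier
(read-char in)                 ; => #\q, so "quux" is left for the parser
```

The same helper accepts `(some-expression);quux`, which shows the cost: the semicolon is required even where the closing parenthesis already delimits the datum.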
It would be nice to have a reader mechanism by which to delimit
identifiers and other similar tokens. In fact, I know just the perfect
candidate. Many implementations of Scheme readers have a minor quirk in
their handling of dotted lists:
    guile> '( . x )
    x
This looks strange, but is fairly logical. It is actually a bit easier
to write a reader that accepts the above than one that rejects it.
MzScheme has always been strict in this regard and rejected it.
Until now I had always thought that the above form was only confusing
and could serve no possible use. But now I realize that it is the
perfect solution for making any s-expression self-delimiting:
    foo bar baz~(. some-identifier)quux quuux
It's concise, logical and in line with other implementations. I think
MzScheme's reader should support it, at least subject to a suitable
parameter: (read-accept-dot 'initial), perhaps?
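For illustration only, the behaviour being requested can be approximated on the parser side without any reader change. The sketch below is not the proposed reader extension: the helper names are hypothetical, it assumes MzScheme's `peek-string` and `read-string` port procedures, and it recognizes only the exact prefix "(. " (open parenthesis, dot, one space), so the looser Guile spelling `( . x )` is not handled.

```scheme
;; Skip any whitespace at the front of the port.
(define (skip-whitespace in)
  (let ((c (peek-char in)))
    (when (and (char? c) (char-whitespace? c))
      (read-char in)
      (skip-whitespace in))))

;; Hypothetical helper, called just after the ~ has been consumed.
;; "(. datum)" yields datum; anything else goes to the ordinary reader.
(define (read-escaped/dot in)
  (let ((prefix (peek-string 3 0 in)))   ; look ahead without consuming
    (if (and (string? prefix) (string=? prefix "(. "))
        (begin
          (read-string 2 in)             ; consume "(."
          (let ((datum (read in)))
            (skip-whitespace in)
            (if (eqv? (read-char in) #\))
                datum
                (error "expected ) to close (. datum)"))))
        (read in))))

(define in (open-input-string "(. some-identifier)quux quuux"))
(read-escaped/dot in)   ; => some-identifier
(read-char in)          ; => #\q, so "quux" is left for the parser
```

Ordinary forms such as `(some-expression)quux` fall through to `read` unchanged, so only the non-self-delimiting case pays for the extra wrapper.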
Lauri