[plt-scheme] Delimiting tokens

From: Lauri Alanko (la at iki.fi)
Date: Mon Jan 30 14:58:00 EST 2006

I'm working on a parser whose input includes s-expressions. I'm using
the standard reader for those. However, there is a slight problem.
Suppose that ~ is an escape character that indicates that what follows
is an s-expression. Then the following works ok:

foo bar baz ~(some-expression) quux quuux
foo bar baz ~some-identifier quux quuux
foo bar baz~(some-expression)quux quuux

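However, if the s-expression is not self-delimiting, e.g. an identifier,
and the next character in the input is also acceptable to the reader,
then the reader keeps going and swallows the text that follows (here it
would return the single identifier some-identifierquux):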

foo bar baz~some-identifierquux quuux
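
A minimal sketch of the problem, using only the standard read and
open-input-string on string ports. The variable names are made up for
illustration:

```scheme
;; An identifier ends only at a delimiter, so the reader runs on into
;; the text that follows it.
(define in (open-input-string "some-identifierquux quuux"))
(read in)   ; => some-identifierquux
(read in)   ; => quuux

;; A parenthesized form ends at its closing paren, so it is safe.
(define in2 (open-input-string "(some-expression)quux quuux"))
(read in2)  ; => (some-expression)
(read in2)  ; => quux
```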

Now, I could of course alter the syntax so that a terminator character
follows the s-expression and gets consumed before the parser proceeds
normally. A semicolon is a good choice, since it starts a comment and
the reader will definitely stop on it:

foo bar baz~some-identifier;quux quuux

However, this makes things ugly and verbose, and for consistency the ;
would have to be there even when the s-expression is self-delimiting
(which is the usual case).
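
The terminator approach could be sketched roughly as follows. The name
read-escaped is hypothetical, and the sketch assumes the ; comes
immediately after the s-expression, with no whitespace in between:

```scheme
;; Hypothetical helper: read one s-expression after the ~ escape, then
;; require and consume the ; terminator.
(define (read-escaped in)
  (let ((datum (read in)))
    (cond ((eqv? (peek-char in) #\;)
           (read-char in)   ; drop the terminator
           datum)
          (else
           (error 'read-escaped "expected ; after s-expression")))))

(define in (open-input-string "some-identifier;quux quuux"))
(read-escaped in)  ; => some-identifier
(read in)          ; => quux
```

The ; has to be removed with read-char rather than left for the next
read, which would treat it as the start of a comment and skip the rest
of the line.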

It would be nice to have a reader mechanism by which to delimit
identifiers and other similar tokens. In fact, I know just the perfect
candidate. Many implementations of Scheme readers have a minor quirk in
their handling of dotted lists:

guile> '( . x )

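This looks strange, but is fairly logical: (a . x) is the element a
followed by the tail x, so a dotted list with no elements before the
dot is just the tail, x. It is actually a bit easier to write a reader
that accepts the above than one that rejects it. MzScheme has always
been strict in this regard and rejected it.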
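
To illustrate why accepting the form is the easier option, here is a toy
reader for symbols and (possibly dotted) lists only. It is a sketch
under stated assumptions, not MzScheme's or Guile's actual reader: all
names are hypothetical, only whitespace and parentheses act as
delimiters, and comments, strings, numbers and quote are left out.

```scheme
;; Toy reader: symbols and (possibly dotted) lists only.
(define (delimiter? c)
  (or (eof-object? c)
      (char-whitespace? c)
      (memv c '(#\( #\)))))

(define (skip-whitespace in)
  (let loop ()
    (let ((c (peek-char in)))
      (cond ((and (char? c) (char-whitespace? c))
             (read-char in)
             (loop))))))

;; Collect characters up to (not including) the next delimiter.
(define (read-token in)
  (let loop ((chars '()))
    (if (delimiter? (peek-char in))
        (list->string (reverse chars))
        (loop (cons (read-char in) chars)))))

(define (toy-read in)
  (skip-whitespace in)
  (let ((c (peek-char in)))
    (cond ((eof-object? c) c)
          ((char=? c #\() (read-char in) (read-list-tail in))
          ((char=? c #\)) (error 'toy-read "unexpected )"))
          (else (string->symbol (read-token in))))))

;; Called after "(" has been consumed; reads the rest of the list.
(define (read-list-tail in)
  (skip-whitespace in)
  (let ((c (peek-char in)))
    (cond ((eof-object? c) (error 'toy-read "unexpected end of input"))
          ((char=? c #\)) (read-char in) '())
          ((char=? c #\()
           (read-char in)
           (let ((head (read-list-tail in)))
             (cons head (read-list-tail in))))
          (else
           (let ((tok (read-token in)))
             (if (string=? tok ".")
                 ;; Dot: the next datum is the tail. Nothing here checks
                 ;; that an element came before the dot; rejecting
                 ;; "( . x )" would take an extra test.
                 (let ((tail (toy-read in)))
                   (skip-whitespace in)
                   (if (eqv? (read-char in) #\))
                       tail
                       (error 'toy-read "expected ) after dotted tail")))
                 (cons (string->symbol tok) (read-list-tail in))))))))

(toy-read (open-input-string "( . x )"))    ; => x
(toy-read (open-input-string "(a b . c)"))  ; => (a b . c)
```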

Until now I had always thought that the above form was only confusing
and could serve no possible use. But now I realize that it is the
perfect solution for making any s-expression self-delimiting.

foo bar baz~(. some-identifier)quux quuux

It's concise, logical and in line with other implementations. I think
MzScheme's reader should support it, at least subject to a suitable
parameter: (read-accept-dot 'initial), perhaps?


Posted on the users mailing list.