[plt-scheme] abstracting parsers

From: Scott Owens (sowens at cs.utah.edu)
Date: Fri May 7 21:31:13 EDT 2004

On Friday 07 May 2004 06:54 pm, Daniel Silva wrote:
>   For list-related administrative tasks:
>   http://list.cs.brown.edu/mailman/listinfo/plt-scheme
>
> Is this a problem of macros in general?

No.  The parser macro treats the grammar literally, as a grammar, so you 
cannot put a macro call inside the grammar.  Most macros, by contrast, treat 
their sub-expressions as Scheme expressions or definitions, so the usual 
methods of Scheme abstraction remain available.


> I noticed that I've started writing too many rules like this one:
>
> (hex [(A) $1]
>      [(B) $1]
>      ...
>      [(ZERO) $1]
>      [(ONE) $1]
>      ...)
>
> and would rather write:
>
> (same-as hex
>          ZERO ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE A B C D E F)
>
> but can't define a same-as macro here.  I need to write some parser++
> syntax that includes same-as.

First, this sort of thing is usually handled in the lexer, so make sure you 
actually need it in the parser.  If you are writing many parsers that need 
this same non-terminal (and the multiple start symbol trick won't work), then 
you would need to write a parser++ macro that walks the grammar, translates 
your special forms (such as same-as) into ordinary productions, and emits a 
(parser ...) form.  In most circumstances, this technique is more trouble 
than it is worth.
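
As a minimal sketch of the translation step only, the rewrite can be tried 
out on the grammar as quoted data.  This is not a macro and not part of the 
parser tools; expand-same-as and expand-grammar are names made up here, and 
same-as is the hypothetical shorthand from your message.

```scheme
;; Sketch only: operates on the grammar as quoted data, not as syntax.
;; `same-as' is the hypothetical shorthand, not a real parser feature.

;; Turn (same-as NAME TOK ...) into (NAME ((TOK) $1) ...);
;; any other rule is returned unchanged.
(define (expand-same-as rule)
  (if (and (pair? rule) (eq? (car rule) 'same-as))
      (cons (cadr rule)
            (map (lambda (tok) (list (list tok) '$1))
                 (cddr rule)))
      rule))

;; Apply the translation to every rule of a grammar.
(define (expand-grammar rules)
  (map expand-same-as rules))

(expand-grammar '((same-as hex ZERO ONE A)
                  (byte ((hex hex) (list $1 $2)))))
;; => ((hex ((ZERO) $1) ((ONE) $1) ((A) $1))
;;     (byte ((hex hex) (list $1 $2))))
```

A real parser++ would have to perform the same rewrite at expansion time and 
wrap the result in a (parser ...) form.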

I might be able to modify the parser so that you could define abbreviations 
for sequences of grammar symbols (like the lex-abbrev for regular 
expressions) and use the abbreviations on the right-hand side of productions.
Would this help you much?  It would only be of use when defining multiple 
parsers, since the grammar itself can express this abstraction inside a 
single parser.
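
To make the idea concrete, here is a rough sketch of the substitution such 
abbreviations might perform on one right-hand side, again on quoted data.  
This is not an existing feature of the parser form; expand-abbrevs, the ws 
abbreviation, and the TAB token are invented for illustration.

```scheme
;; Hypothetical: ABBREVS is an association list mapping an abbreviation
;; name to the sequence of grammar symbols it stands for.
;; Each abbreviation found in RHS is replaced by its sequence; every
;; other symbol is kept as-is.
(define (expand-abbrevs abbrevs rhs)
  (apply append
         (map (lambda (sym)
                (let ((hit (assq sym abbrevs)))
                  (if hit (cdr hit) (list sym))))
              rhs)))

(expand-abbrevs '((ws SP TAB)) '(Method ws Request-URI))
;; => (Method SP TAB Request-URI)
```

One consequence of plain substitution is that the $n positions shift: in the 
expanded right-hand side above, Request-URI is the fourth symbol, not the 
third.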

>
> Or do I (hopefully) misunderstand the macro system?

I don't have enough information to make that judgment.

-Scott

> On Fri, 7 May 2004, Scott Owens wrote:
> > That is not directly possible.  The same grammar can define multiple
> > parsers by using multiple start non-terminals, but grammars cannot use
> > non-terminals from other grammars.  You could define your own syntactic
> > abstraction for parsers that expands into the `parser' form.
> >
> > -Scott
> >
> > On Friday 07 May 2004 04:50 pm, Daniel Silva wrote:
> > > I'm working on a message parser for some protocol defined in an RFC.
> > > Parts of those messages (some non-terminals) are defined in other RFCs.
> > > It would be nice to be able to refer to non-terminals from other
> > > grammars. Is this possible?
> > >
> > > For example, something like:
> > >
> > > ;; RFC 2396 defines Uniform Resource Identifiers (URIs)
> > >
> > > (require (prefix rfc2396: (lib "RFC2396.ss"
> > >                                "IETF" "RFCs")))
> > >
> > > (parser
> > >   (grammar
> > >      ...
> > >      (Request-Line [(Method SP Request-URI SP HTTP-Version CRLF)
> > >                     (make-request-line $1 $3 $5)])
> > >
> > >      (Request-URI [(ASTERISK) $1]
> > >                   [(rfc2396:absolute-uri) $1]
> > >                   [(rfc2396:abs_path) $1]
> > >                   [(rfc2396:authority) $1])
> > >
> > >      ...))


Posted on the users mailing list.