[plt-scheme] Natural Language parsing in CS1

From: Gary Baumgartner (gfb at cs.toronto.edu)
Date: Mon Jun 1 21:22:32 EDT 2009

On Mon, Jun 01, 2009 at 02:42:58PM -0400, Todd O'Bryan wrote:
> I'm sorry I wasn't as clear as I should have been...
> So I'd write the parser and such and let the kids just play with the grammars.

Earlier you wondered about how to write it in PLT Scheme.
 Does this get you off the ground:

#lang scheme

#| Example parser using cfg defined below,
    for a grammar at the start of CSC324 (one of our courses).

   The resulting p takes a string s, parses all prefixes of s,
    returning the suffix of each prefix that's in the language,
    repeated the number of times it parses (only once for this p,
    since unambiguous).

   E.g. (p "011")  => ("1" "011"), so no complete parse of s,
        (p "1011") => ("" "11" "1011"), so s is in the language,

(define p (cfg <start>
               (<start> (<digit> <digit> <start>)
               (<digit> ("0") ("1"))))

#| A syntax for cfgs, producing a recursive descent parser by expanding
    into definitions of recursive functions for the productions
    (and keeping the macro simple by delegating most of the work except
     notation translation to match-concatenation).
(require (planet "amb.ss" ("murphy" "amb.plt" 1 1)))
(define-syntax-rule (cfg <<start>>
                         (<<rule>> (<component> ...)
  (letrec ([<<rule>> (lambda (string)
        	       (amb (match-concatenation string <component> ...)
    (lambda (string) (amb-collect (<<start>> string)))))
(require (only-in srfi/13 string-null? string-prefix? string-drop))
(define (match-concatenation string . components)
  (define (match-component component string)
    (cond [(equal? component 'e) string]
          [(string? component)
           (amb-assert (string-prefix? component string))
           (string-drop string (string-length component))]
          [else (component string)]))
  (foldl match-component string components))

Posted on the users mailing list.