[plt-scheme] Natural Language parsing in CS1

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Mon Jun 1 10:02:30 EDT 2009

Let me recommend an alternative.

Provide a data definition for parse trees to your students
and just ask them to write functions that process these parse
trees. You may/should need a teachpack that helps you represent
parse trees, but it is much less work than what you propose.
For a decent experience, your teachpack may also provide some
large examples of parse trees.

One of the functions you can assign then is the sentence diagramming
thing you sketch below. (I have done something like that once.)

If the project is a success, you can expand it in a second semester
(year) with the parsing function that you propose here.

-- Matthias

On Jun 1, 2009, at 9:14 AM, Todd O'Bryan wrote:

>  I'm starting to think about my courses for next year, and one thing
> that I really want to do with my intro classes is introduce topics
> that require the same kinds of thinking as programming, but feel
> different enough that students will see them as something different.
> For example, I think I'm going to do a short unit on finite state
> automata and regular expressions (and I'll use something like the
> macros Shriram presents in his "The Swine Before Perl" talk
> http://www.cs.brown.edu/~sk/Publications/Talks/SwineBeforePerl/ ).
> I also thought I could do a short unit on parsing, in order to
> introduce the idea of context-free grammars. What I'd like to do is
> create a teachpack that would take code like this:
> (define-grammar G
>  ((S -> NP VP)
>   (NP -> "Bob")
>   (VP -> "ate")))
> and then
> (parse G S "Bob ate")
> would produce a parse tree
> (S (NP ("Bob"))
>    (VP ("ate")))
> and ideally, I could write a function that would display this as text
> (this looks good in monospace):
>      S
> +----+----+
> NP        VP
> |         |
> "Bob"    "ate"
> and maybe even display the tree graphically.
> Eventually, I'd like to do some kind of features:
> (define-grammar G
>   ((S -> NP[number: X] VP[number: X])
>    (NP[number: sing] -> "he")
>    (NP[number: plur] -> "they")
>    (VP[number: sing] -> "sleeps")
>    (VP[number: plur] -> "sleep")))
> So that "He sleeps" and "They sleep" parse, but *"They sleeps" and
> *"He sleep" don't.
> So, the question...
> Where do I start? I can write a parser that produces a list of all
> possible parses given a grammar and a list of terminals, but I'm not
> sure how to approach this in Scheme. After the recent comment that OO
> doesn't scale down, I'm wondering if Grammar and Tree are worth my
> first attempt at classes in Scheme. I think I should probably use
> strings for the terminals, but should I use symbols for the symbols of
> the grammar or is there something built in that would handle
> attribute-value pairs nicely if I get that far?
> Any suggestions appreciated. I'll try to publish back what I come up
> with, along with whatever assignments I come up with.
> Thanks,
> Todd
> _________________________________________________
>   For list-related administrative tasks:
>   http://list.cs.brown.edu/mailman/listinfo/plt-scheme

Posted on the users mailing list.