[racket] Need some help for my first real experiment with scheme
On Wed, Apr 18, 2012 at 1:15 PM, Danny Yoo <dyoo at cs.wpi.edu> wrote:
>>
>> I think the subfield you're looking for is called "information retrieval",
>> and there are textbooks on it.
>
> Managing Gigabytes, for example:
>
> http://ww2.cs.mu.oz.au/mg/
Another book that came out recently and looks good is Introduction to
Information Retrieval:
http://nlp.stanford.edu/IR-book/
It's awesome that the authors have put the book online.
Aside: the suffix-tree approach I proposed earlier might be
inappropriate for the task you're exploring, because a suffix tree
represents every suffix of the text. Just a heads up: keep an eye on
memory usage, because suffix trees are heavy.
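
To make the memory concern a bit more concrete, here is a minimal
sketch. The helper names `suffixes` and `naive-suffix-chars` are made
up for illustration and do not come from any library, and this is not
a suffix tree implementation. It only counts how many characters all
the suffixes of a text add up to: a text of length n has n suffixes
totaling n(n+1)/2 characters. A real suffix tree shares structure
between suffixes, so it does much better than this naive count, but it
still pays for nodes and pointers on top of the text itself.

```racket
#lang racket

;; All non-empty suffixes of TEXT, longest first.
;; (Hypothetical helper, for illustration only.)
(define (suffixes text)
  (for/list ([i (in-range (string-length text))])
    (substring text i)))

;; Total number of characters if every suffix were stored
;; as its own separate string (the naive worst case).
(define (naive-suffix-chars text)
  (for/sum ([s (in-list (suffixes text))])
    (string-length s)))

(suffixes "banana")
;; six strings: "banana" "anana" "nana" "ana" "na" "a"

(naive-suffix-chars "banana")
;; 21, i.e. 6 * 7 / 2
```

For a text of a few megabytes the naive count is already far too large
to store, so it's worth measuring actual memory use on a small sample
before committing to any suffix-based structure.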