[racket] Fwd: Performance help

From: Jyotirmoy Bhattacharya (jyotirmoy at jyotirmoy.net)
Date: Thu Jan 1 23:13:21 EST 2015

Dear Greg,

Thanks for your suggestions.

On Fri, Jan 2, 2015 at 12:42 AM, Greg Hendershott <greghendershott at gmail.com> wrote:

>
> 2. The output differs from your repo's output.txt on one item: It
> corrects "accesing" to "accusing" rather than "acceding".
>

The two words have the same frequency in the training set and are at the
same edit distance from the test word, so either is an equally valid
correction. I think this is not a bug.
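
To illustrate why a tie like this can resolve either way, here is a minimal
sketch. The `counts` table and the `best` helper are hypothetical and are not
taken from the repository. The sketch assumes the winner is picked with
`argmax`, which returns the first maximal element of its list, so the result
depends only on the order in which the candidates are enumerated.

```racket
#lang racket

;; Hypothetical training counts: both candidates are equally frequent.
(define counts (hash "accusing" 3 "acceding" 3))

;; Pick the candidate with the highest count. `argmax` returns the
;; first element that maximizes the key, so ties go to whichever
;; candidate happens to come first in the list.
(define (best candidates)
  (argmax (lambda (w) (hash-ref counts w 0)) candidates))

(best '("accusing" "acceding")) ; "accusing"
(best '("acceding" "accusing")) ; "acceding"
```

If the candidates come from a set or hash table, their enumeration order is
not something the program controls, so two correct implementations can
disagree on this word.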

>
> Next:
>
> (append-map edits1 (edits1 s))
>
> is very large -- 40,000+ items for "cat". But it looks like Norvig
> prunes the edit distance 2 list to known words? Say where `ht` is the
> training dict:
>
> (define (edits2 ht s)
>   (for*/list ([x (in-list (edits1 s))]
>               [y (in-list (edits1 x))]
>               #:when (hash-ref ht y #f))
>     y))
>
> That yields 2000+ items for "cat".
>
> When I make that change, my run time decreases from ~16s to ~10s, and
> produces the same output (which differs from output.txt in the same
> way I mentioned above).
>
> In relative terms this would probably get it close to the Python version?
>

I merged a pull request from Matthias Felleisen, who made the same change.
It definitely improves performance, but the Racket program is still about
2x slower than the Python version.
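
For readers who want to try the change discussed above, here is a minimal,
self-contained sketch. The `edits1` body is an assumption written in the style
of Norvig's spelling corrector (deletes, transposes, replaces, inserts) and is
not the code from the repository. The dictionary `ht` is a toy stand-in for
the training data.

```racket
#lang racket

(define alphabet (string->list "abcdefghijklmnopqrstuvwxyz"))

;; All strings one edit away from s (duplicates are kept).
(define (edits1 s)
  (define n (string-length s))
  (append
   ;; deletes: drop the character at position i
   (for/list ([i (in-range n)])
     (string-append (substring s 0 i) (substring s (add1 i))))
   ;; transposes: swap the characters at positions i and i+1
   (for/list ([i (in-range (sub1 n))])
     (string-append (substring s 0 i)
                    (string (string-ref s (add1 i)) (string-ref s i))
                    (substring s (+ i 2))))
   ;; replaces: overwrite position i with each letter
   (for*/list ([i (in-range n)] [c (in-list alphabet)])
     (string-append (substring s 0 i) (string c) (substring s (add1 i))))
   ;; inserts: put each letter in front of position i
   (for*/list ([i (in-range (add1 n))] [c (in-list alphabet)])
     (string-append (substring s 0 i) (string c) (substring s i)))))

;; Distance-2 candidates, keeping only words present in the dictionary.
(define (edits2 ht s)
  (for*/list ([x (in-list (edits1 s))]
              [y (in-list (edits1 x))]
              #:when (hash-ref ht y #f))
    y))

;; Toy dictionary (hypothetical); "dog" is three edits away from "cat".
(define ht (hash "cat" 1 "cart" 1 "at" 1 "dog" 1))

(length (edits1 "cat"))                         ; 187
(sort (remove-duplicates (edits2 ht "cat")) string<?)
```

The filter sits inside the comprehension, so the full distance-2 list is
never built as a single list. Only the known words are kept and passed on to
the scoring step.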

Regards,
Jyotirmoy Bhattacharya

Posted on the users mailing list.