[racket] Performance help
I've just submitted a pull request with two changes
https://github.com/jmoy/norvig-spell/pull/3 . In my machine, these
changes reduce the time of the tests from 17.7 seconds, to 16.2
seconds, and finally to 14.5 seconds.
The original code is something like:
(define ALPHABET (in-list (string->list "abc...xyz")))
(for/list ([c ALPHABET]))
(string-append "?" (string c) "?"))
The first problem is that outside a clause of a `for` the `in-list`
creates a sequence. But inside a clause of a `for` the `in-list` is
spliced and generates more efficient code.
(define ALPHABET (string->list "abc...xyz"))
(for/list ([c (in-list ALPHABET)]))
(string-append "?" (string c) "?"))
The second problem is that for each word `(string c)` creates a new
temporary string, so I mapped `string` at the definition site.
(define ALPHABET (map string (string->list "abc...xyz")))
(for/list ([c (in-list ALPHABET)]))
(string-append "?" c "?"))
---
Following the idea of the second change I tried to remove more of the
temporary strings in the code ( for example, `(substring x 1)` looks
like a bad idea). But my tries were unsuccessful to make the code
faster :(.
I also read the code in the other language, and if I understood
correctly they are using strings that only contain "ascii" characters
instead of strings for "unicode" characters. I think the closest
equivalent in Racket is to use `bytes` instead of `strings`. It should
be slightly faster, because they are easier to handle at the low
level. I tried this and a few tricks, but the improvement was minimal
(something like a reduction from 14.5 seconds to ~14 seconds?).
Gustavo
On Fri, Jan 2, 2015 at 11:18 PM, Matthias Felleisen
<matthias at ccs.neu.edu> wrote:
>
> On Jan 2, 2015, at 11:38 AM, Jens Axel Søgaard wrote:
>
>> 2015-01-02 17:10 GMT+01:00 Matthias Felleisen <matthias at ccs.neu.edu>:
>>>
>>> Jens -- thanks for re-directing us to your article. This saved me quite
>>> some time. I noticed the lacking abstractions (slicing, comprehension)
>>> and considered sketching a prototype to use with this program. I decided
>>> to wait till today because it didn't address what bothered me most:
>>>
>>> speed.
>>
>> One trick: change the regular expression in words in order to
>> get rid of the call to string-downcase:
>>
>> (regexp-split #rx"[^a-zA-Z]" buf)
>>
>> Another trick: the regular expression matchers in Racket works both
>> on strings and ports, so train can be written
>>
>> (define (train fname)
>> (freqs (words (open-input-file fname))))
>>
>> I am not sure whether it will give a speedup though.
>
>
> Both of these caused a serious slowdown. -- Matthias
>
>
> ____________________
> Racket Users list:
> http://lists.racket-lang.org/users