<div dir="ltr">Also on more than one occasion I've had to work around issues with entry scaling with even mutable hashtables, A few 100K entries, they are fine but rapidly grind to a halt. The Python folks have hash tables nailed, über performance and scaling.</div>
<div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Feb 14, 2013 at 2:20 PM, James Bergstra <span dir="ltr"><<a href="mailto:james.bergstra@gmail.com" target="_blank">james.bergstra@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I was actually trying to understand the performance bottlenecks of the<br>
"knucleotide" challenge of the language shootout:<br>
<br>
<a href="http://benchmarksgame.alioth.debian.org/u64/program.php?test=knucleotide&lang=racket&id=4" target="_blank">http://benchmarksgame.alioth.debian.org/u64/program.php?test=knucleotide&lang=racket&id=4</a><br>
<br>
Most of the time seems to be spent in the loop that builds the hash table.<br>
<br>
(Incidentally, in that same loop, I tried replacing the set! key with<br>
a for/fold and that helped on my home computer but not the one at<br>
work. Maybe I had different versions of racket, I'm not sure)<br>
<span class="HOEnZb"><font color="#888888"><br>
- James<br>
</font></span><div class="im HOEnZb"><br>
On Thu, Feb 14, 2013 at 2:15 PM, Robby Findler<br>
<<a href="mailto:robby@eecs.northwestern.edu">robby@eecs.northwestern.edu</a>> wrote:<br>
</div><div class="HOEnZb"><div class="h5">> hash-update is defined in Racket and, besides the error checking, is<br>
> basically the composition of those two functions. I expect it was written as<br>
> something to abstract over a common pattern, not speed things up.<br>
><br>
> Is this a bottleneck in some real program? Perhaps you'd share that (or some<br>
> inner loop of it)? I wonder if there is a better way to speed it up than<br>
> this.<br>
><br>
> Robby<br>
><br>
><br>
> On Thu, Feb 14, 2013 at 1:09 PM, J. Ian Johnson <<a href="mailto:ianj@ccs.neu.edu">ianj@ccs.neu.edu</a>> wrote:<br>
>><br>
>> If I am right, then it would make sense to use hash-update when the hash<br>
>> is big enough that a log-time traversal takes longer than a general<br>
>> procedure call's set-up and tear-down.<br>
>> -Ian<br>
>> ----- Original Message -----<br>
>> From: James Bergstra <<a href="mailto:james.bergstra@gmail.com">james.bergstra@gmail.com</a>><br>
>> To: J. Ian Johnson <<a href="mailto:ianj@ccs.neu.edu">ianj@ccs.neu.edu</a>><br>
>> Cc: <a href="mailto:users@racket-lang.org">users@racket-lang.org</a><br>
>> Sent: Thu, 14 Feb 2013 13:47:51 -0500 (EST)<br>
>> Subject: Re: [racket] Question about hash table efficiency<br>
>><br>
>> Thanks, I suppose I just had my hopes up. I thought the more compact<br>
>> form might involve fewer expression evaluations and require just one<br>
>> hash lookup instead of two, and therefore might run something like<br>
>> twice as fast, rather than slightly slower.<br>
>><br>
>> On Thu, Feb 14, 2013 at 1:36 PM, J. Ian Johnson <<a href="mailto:ianj@ccs.neu.edu">ianj@ccs.neu.edu</a>> wrote:<br>
>> > My guess would be the higher-orderness on hash-update causes the (minor)<br>
>> > slowdown.<br>
>> > -Ian<br>
>> > ----- Original Message -----<br>
>> > From: James Bergstra <<a href="mailto:james.bergstra@gmail.com">james.bergstra@gmail.com</a>><br>
>> > To: <a href="mailto:users@racket-lang.org">users@racket-lang.org</a><br>
>> > Sent: Thu, 14 Feb 2013 13:16:47 -0500 (EST)<br>
>> > Subject: [racket] Question about hash table efficiency<br>
>> ><br>
>> > Hi list, just discovered racket and having fun learning about scheme<br>
>> > again. Haven't used it since undergrad. Thanks!<br>
>> ><br>
>> > I was wondering about the efficiency of hash-update vs. hash-set and<br>
>> > hash-ref. I thought the former would be faster, otherwise why have it?<br>
>> > A little benchmark script reveals the same surprising result as my<br>
>> > actual application though:<br>
>> ><br>
>> > """<br>
>> > #lang racket<br>
>> ><br>
>> > ;; warm-up<br>
>> > (time (void<br>
>> > (for/fold ([table (make-immutable-hasheq '())])<br>
>> > ([ii (in-range 100000)])<br>
>> > (hash-update table (modulo ii 100) add1 ii))))<br>
>> ><br>
>> > ;; update<br>
>> > (newline)<br>
>> > (time (void<br>
>> > (for/fold ([table (make-immutable-hasheq '())])<br>
>> > ([ii (in-range 100000)])<br>
>> > (hash-update table (modulo ii 100) add1 ii))))<br>
>> ><br>
>> > ;; set/ref<br>
>> > (newline)<br>
>> > (time (void<br>
>> > (for/fold ([table (make-immutable-hasheq '())])<br>
>> > ([ii (in-range 100000)])<br>
>> > (hash-set table (modulo ii 100) (add1 (hash-ref table (modulo ii 100)<br>
>> > ii))))))<br>
>> ><br>
>> > """<br>
>> ><br>
>> > This prints out:<br>
>> ><br>
>> > cpu time: 112 real time: 114 gc time: 32<br>
>> ><br>
>> > cpu time: 80 real time: 82 gc time: 0<br>
>> ><br>
>> > cpu time: 76 real time: 75 gc time: 4<br>
>> ><br>
>> ><br>
>> > My original application used mutable hash tables and the same effect<br>
>> > was observed (if not more dramatically). Any thoughts on why the<br>
>> > update form is slower? I'm running this in Racket v5.1.1.<br>
>> ><br>
>> > - James<br>
>> > ____________________<br>
>> > Racket Users list:<br>
>> > <a href="http://lists.racket-lang.org/users" target="_blank">http://lists.racket-lang.org/users</a><br>
>> ><br>
>><br>
>> ____________________<br>
>> Racket Users list:<br>
>> <a href="http://lists.racket-lang.org/users" target="_blank">http://lists.racket-lang.org/users</a><br>
><br>
><br>
____________________<br>
Racket Users list:<br>
<a href="http://lists.racket-lang.org/users" target="_blank">http://lists.racket-lang.org/users</a><br>
</div></div></blockquote></div><br></div>