[racket] Question about hash table efficiency

From: James Bergstra (james.bergstra at gmail.com)
Date: Thu Feb 14 14:20:35 EST 2013

I was actually trying to understand the performance bottlenecks of the
"knucleotide" challenge of the language shootout:

http://benchmarksgame.alioth.debian.org/u64/program.php?test=knucleotide&lang=racket&id=4

Most of the time seems to be spent in the loop that builds the hash table.

(Incidentally, in that same loop, I tried replacing the set! key with
a for/fold and that helped on my home computer but not the one at
work. Maybe I had different versions of racket, I'm not sure)

- James

On Thu, Feb 14, 2013 at 2:15 PM, Robby Findler
<robby at eecs.northwestern.edu> wrote:
> hash-update is defined in Racket and, besides the error checking, is
> basically the composition of those two functions. I expect it was written as
> something to abstract over a common pattern, not speed things up.
>
> Is this a bottleneck in some real program? Perhaps you'd share that (or some
> inner loop of it)? I wonder if there is a better way to speed it up than
> this.
>
> Robby
>
>
> On Thu, Feb 14, 2013 at 1:09 PM, J. Ian Johnson <ianj at ccs.neu.edu> wrote:
>>
>> If I am right, then it would make sense to use hash-update when the hash
>> is big enough that a log-time traversal takes longer than a general
>> procedure call's set-up and tear-down.
>> -Ian
>> ----- Original Message -----
>> From: James Bergstra <james.bergstra at gmail.com>
>> To: J. Ian Johnson <ianj at ccs.neu.edu>
>> Cc: users at racket-lang.org
>> Sent: Thu, 14 Feb 2013 13:47:51 -0500 (EST)
>> Subject: Re: [racket] Question about hash table efficiency
>>
>> Thanks, I suppose I just had my hopes up. I thought the more compact
>> form might involve fewer expression evaluations and require just one
>> hash lookup instead of two, and therefore might run something like
>> twice as fast, rather than slightly slower.
>>
>> On Thu, Feb 14, 2013 at 1:36 PM, J. Ian Johnson <ianj at ccs.neu.edu> wrote:
>> > My guess would be the higher-orderness on hash-update causes the (minor)
>> > slowdown.
>> > -Ian
>> > ----- Original Message -----
>> > From: James Bergstra <james.bergstra at gmail.com>
>> > To: users at racket-lang.org
>> > Sent: Thu, 14 Feb 2013 13:16:47 -0500 (EST)
>> > Subject: [racket] Question about hash table efficiency
>> >
>> > Hi list, just discovered racket and having fun learning about scheme
>> > again. Haven't used it since undergrad. Thanks!
>> >
>> > I was wondering about the efficiency of hash-update vs. hash-set and
>> > hash-ref. I thought the former would be faster, otherwise why have it?
>> > A little benchmark script reveals the same surprising result as my
>> > actual application though:
>> >
>> > """
>> > #lang racket
>> >
>> > ;; warm-up
>> > (time (void
>> >  (for/fold ([table (make-immutable-hasheq '())])
>> >   ([ii (in-range 100000)])
>> >   (hash-update table (modulo ii 100) add1 ii))))
>> >
>> > ;; update
>> > (newline)
>> > (time (void
>> >  (for/fold ([table (make-immutable-hasheq '())])
>> >   ([ii (in-range 100000)])
>> >   (hash-update table (modulo ii 100) add1 ii))))
>> >
>> > ;; set/ref
>> > (newline)
>> > (time (void
>> >  (for/fold ([table (make-immutable-hasheq '())])
>> >   ([ii (in-range 100000)])
>> >   (hash-set table (modulo ii 100) (add1 (hash-ref table (modulo ii 100)
>> > ii))))))
>> >
>> > """
>> >
>> > This prints out:
>> >
>> > cpu time: 112 real time: 114 gc time: 32
>> >
>> > cpu time: 80 real time: 82 gc time: 0
>> >
>> > cpu time: 76 real time: 75 gc time: 4
>> >
>> >
>> > My original application used mutable hash tables and the same effect
>> > was observed (if not more dramatically). Any thoughts on why the
>> > update form is slower? I'm running this in Racket v5.1.1.
>> >
>> > - James
>> > ____________________
>> >   Racket Users list:
>> >   http://lists.racket-lang.org/users
>> >
>>
>> ____________________
>>   Racket Users list:
>>   http://lists.racket-lang.org/users
>
>

Posted on the users mailing list.