[racket] Question about hash table efficiency

From: Ray Racine (ray.racine at gmail.com)
Date: Thu Feb 14 14:35:52 EST 2013

Also on more than one occasion I've had to work around issues with entry
scaling with even mutable hashtables,  A few 100K entries, they are fine
but rapidly grind to a halt.  The Python folks have hash tables nailed,
über performance and scaling.


On Thu, Feb 14, 2013 at 2:20 PM, James Bergstra <james.bergstra at gmail.com>wrote:

> I was actually trying to understand the performance bottlenecks of the
> "knucleotide" challenge of the language shootout:
>
>
> http://benchmarksgame.alioth.debian.org/u64/program.php?test=knucleotide&lang=racket&id=4
>
> Most of the time seems to be spent in the loop that builds the hash table.
>
> (Incidentally, in that same loop, I tried replacing the set! key with
> a for/fold and that helped on my home computer but not the one at
> work. Maybe I had different versions of racket, I'm not sure)
>
> - James
>
> On Thu, Feb 14, 2013 at 2:15 PM, Robby Findler
> <robby at eecs.northwestern.edu> wrote:
> > hash-update is defined in Racket and, besides the error checking, is
> > basically the composition of those two functions. I expect it was
> written as
> > something to abstract over a common pattern, not speed things up.
> >
> > Is this a bottleneck in some real program? Perhaps you'd share that (or
> some
> > inner loop of it)? I wonder if there is a better way to speed it up than
> > this.
> >
> > Robby
> >
> >
> > On Thu, Feb 14, 2013 at 1:09 PM, J. Ian Johnson <ianj at ccs.neu.edu>
> wrote:
> >>
> >> If I am right, then it would make sense to use hash-update when the hash
> >> is big enough that a log-time traversal takes longer than a general
> >> procedure call's set-up and tear-down.
> >> -Ian
> >> ----- Original Message -----
> >> From: James Bergstra <james.bergstra at gmail.com>
> >> To: J. Ian Johnson <ianj at ccs.neu.edu>
> >> Cc: users at racket-lang.org
> >> Sent: Thu, 14 Feb 2013 13:47:51 -0500 (EST)
> >> Subject: Re: [racket] Question about hash table efficiency
> >>
> >> Thanks, I suppose I just had my hopes up. I thought the more compact
> >> form might involve fewer expression evaluations and require just one
> >> hash lookup instead of two, and therefore might run something like
> >> twice as fast, rather than slightly slower.
> >>
> >> On Thu, Feb 14, 2013 at 1:36 PM, J. Ian Johnson <ianj at ccs.neu.edu>
> wrote:
> >> > My guess would be the higher-orderness on hash-update causes the
> (minor)
> >> > slowdown.
> >> > -Ian
> >> > ----- Original Message -----
> >> > From: James Bergstra <james.bergstra at gmail.com>
> >> > To: users at racket-lang.org
> >> > Sent: Thu, 14 Feb 2013 13:16:47 -0500 (EST)
> >> > Subject: [racket] Question about hash table efficiency
> >> >
> >> > Hi list, just discovered racket and having fun learning about scheme
> >> > again. Haven't used it since undergrad. Thanks!
> >> >
> >> > I was wondering about the efficiency of hash-update vs. hash-set and
> >> > hash-ref. I thought the former would be faster, otherwise why have it?
> >> > A little benchmark script reveals the same surprising result as my
> >> > actual application though:
> >> >
> >> > """
> >> > #lang racket
> >> >
> >> > ;; warm-up
> >> > (time (void
> >> >  (for/fold ([table (make-immutable-hasheq '())])
> >> >   ([ii (in-range 100000)])
> >> >   (hash-update table (modulo ii 100) add1 ii))))
> >> >
> >> > ;; update
> >> > (newline)
> >> > (time (void
> >> >  (for/fold ([table (make-immutable-hasheq '())])
> >> >   ([ii (in-range 100000)])
> >> >   (hash-update table (modulo ii 100) add1 ii))))
> >> >
> >> > ;; set/ref
> >> > (newline)
> >> > (time (void
> >> >  (for/fold ([table (make-immutable-hasheq '())])
> >> >   ([ii (in-range 100000)])
> >> >   (hash-set table (modulo ii 100) (add1 (hash-ref table (modulo ii
> 100)
> >> > ii))))))
> >> >
> >> > """
> >> >
> >> > This prints out:
> >> >
> >> > cpu time: 112 real time: 114 gc time: 32
> >> >
> >> > cpu time: 80 real time: 82 gc time: 0
> >> >
> >> > cpu time: 76 real time: 75 gc time: 4
> >> >
> >> >
> >> > My original application used mutable hash tables and the same effect
> >> > was observed (if not more dramatically). Any thoughts on why the
> >> > update form is slower? I'm running this in Racket v5.1.1.
> >> >
> >> > - James
> >> > ____________________
> >> >   Racket Users list:
> >> >   http://lists.racket-lang.org/users
> >> >
> >>
> >> ____________________
> >>   Racket Users list:
> >>   http://lists.racket-lang.org/users
> >
> >
> ____________________
>   Racket Users list:
>   http://lists.racket-lang.org/users
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/users/archive/attachments/20130214/c781bc2a/attachment-0001.html>

Posted on the users mailing list.