[racket-dev] Racket hash tables vs. Python dicts - Performance

From: Laurent (laurent.orseau at gmail.com)
Date: Fri Jul 25 04:34:19 EDT 2014

Even more idiomatic:
(time (for ([w words])
        (hash-update! d w add1 0)))

But it seems to be slightly slower that Robby's version.

Laurent


On Thu, Jul 24, 2014 at 11:26 AM, Robby Findler <robby at eecs.northwestern.edu
> wrote:

> Not an answer to your direction question, but this is the more
> idiomatic way to write that and it seems to be a bit faster:
>
> (time (for ([w (in-list words)])
>         (hash-set! d w (add1 (hash-ref d w 0)))))
>
> On Wed, Jul 23, 2014 at 10:54 AM, Pedro Ramos
> <pedropramos at tecnico.ulisboa.pt> wrote:
> > Hi,
> >
> > I've been developing an implementation of Python in Racket, where I'm
> > implementing Python's dictionaries over Racket custom hash tables.
> >
> > While my occasional benchmarks typically show better performance on
> Racket
> > programs than their Python equivalents, Racket's hash tables generally
> seem
> > to be slower than Python's dicts.
> >
> > I've set up this benchmark in Racket:
> >
> >
> > #lang racket
> >
> > (define alphabet "abcdefghijklmnopqrstuvwxyz")
> > (define (random-word n)
> >   (build-string n (lambda (x) (string-ref alphabet (random 23)))))
> >
> > (define words (for/list ([k 1000000])
> >                 (random-word 3)))
> > (define d (make-hash))
> >
> > (time (for ([w words])
> >         (if (hash-has-key? d w)
> >             (hash-set! d w (add1 (hash-ref d w)))
> >             (hash-set! d w 1))))
> >
> >
> > And its equivalent in Python:
> >
> >
> > import random
> > import time
> >
> > alphabet = "abcdefghijklmnopqrstuvwxyz"
> > def random_word(n):
> >   return ''.join([random.choice(alphabet) for i in range(n)])
> >
> > words = [random_word(3) for k in xrange(1000000)]
> > d = {}
> >
> > a = time.time()
> > for w in words:
> >   if w in d:
> >     d[w] = d[w] + 1
> >   else:
> >     d[w] = 1
> > b = time.time()
> > print b-a, 'seconds'
> >
> >
> > The Racket example yields running times of around 500 ms (running on
> Racket
> > v6.0.1) while the Python example yields running times of around 330 ms
> > (running on Python 2.7.3).
> >
> > I find this unusual because Python is somewhat more dynamic than Racket,
> > since
> > (a) Python's equality and hashing functions have to dispatched at runtime
> > for each key;
> > (b) referencing and setting values in a Python dict is done using a very
> > general operator, [], whose behaviour also has to be dispatched at
> runtime,
> > unlike the more specific hash-ref and hash-set! Racket functions.
> >
> > Is there something I'm missing about Racket's hash tables which explains
> > this slower speed?
> >
> > Thanks,
> > Pedro Ramos
> >
> > _________________________
> >   Racket Developers list:
> >   http://lists.racket-lang.org/dev
> >
> _________________________
>   Racket Developers list:
>   http://lists.racket-lang.org/dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/dev/archive/attachments/20140725/ae47adb1/attachment-0001.html>

Posted on the dev mailing list.