[racket-dev] Racket hash tables vs. Python dicts - Performance

From: Pedro Ramos (pedropramos at tecnico.ulisboa.pt)
Date: Fri Jul 25 08:51:22 EDT 2014

Thanks for pointing that out. It was a mistake indeed.
Interestingly enough, changing it from (random 23) to (random 26) resulted
in a slight increase in running time from 500 ms to 530 ms.


On Fri, Jul 25, 2014 at 1:39 PM, Jake Eakle <jseakle at gmail.com> wrote:

> This is probably not responsible for much of the performance difference,
> but the racket code has (random 23) where it looks like it ought to be
> (random 26), resulting in a smaller space of possible random keys. This
> will cause the branch with the lookup to happen somewhat more often than in
> the python version.
>
>
> On Fri, Jul 25, 2014 at 1:34 AM, Laurent <laurent.orseau at gmail.com> wrote:
>
>> Even more idiomatic:
>> (time (for ([w words])
>>         (hash-update! d w add1 0)))
>>
>> But it seems to be slightly slower that Robby's version.
>>
>> Laurent
>>
>>
>> On Thu, Jul 24, 2014 at 11:26 AM, Robby Findler <
>> robby at eecs.northwestern.edu> wrote:
>>
>>> Not an answer to your direction question, but this is the more
>>> idiomatic way to write that and it seems to be a bit faster:
>>>
>>> (time (for ([w (in-list words)])
>>>         (hash-set! d w (add1 (hash-ref d w 0)))))
>>>
>>> On Wed, Jul 23, 2014 at 10:54 AM, Pedro Ramos
>>> <pedropramos at tecnico.ulisboa.pt> wrote:
>>> > Hi,
>>> >
>>> > I've been developing an implementation of Python in Racket, where I'm
>>> > implementing Python's dictionaries over Racket custom hash tables.
>>> >
>>> > While my occasional benchmarks typically show better performance on
>>> Racket
>>> > programs than their Python equivalents, Racket's hash tables generally
>>> seem
>>> > to be slower than Python's dicts.
>>> >
>>> > I've set up this benchmark in Racket:
>>> >
>>> >
>>> > #lang racket
>>> >
>>> > (define alphabet "abcdefghijklmnopqrstuvwxyz")
>>> > (define (random-word n)
>>> >   (build-string n (lambda (x) (string-ref alphabet (random 23)))))
>>> >
>>> > (define words (for/list ([k 1000000])
>>> >                 (random-word 3)))
>>> > (define d (make-hash))
>>> >
>>> > (time (for ([w words])
>>> >         (if (hash-has-key? d w)
>>> >             (hash-set! d w (add1 (hash-ref d w)))
>>> >             (hash-set! d w 1))))
>>> >
>>> >
>>> > And its equivalent in Python:
>>> >
>>> >
>>> > import random
>>> > import time
>>> >
>>> > alphabet = "abcdefghijklmnopqrstuvwxyz"
>>> > def random_word(n):
>>> >   return ''.join([random.choice(alphabet) for i in range(n)])
>>> >
>>> > words = [random_word(3) for k in xrange(1000000)]
>>> > d = {}
>>> >
>>> > a = time.time()
>>> > for w in words:
>>> >   if w in d:
>>> >     d[w] = d[w] + 1
>>> >   else:
>>> >     d[w] = 1
>>> > b = time.time()
>>> > print b-a, 'seconds'
>>> >
>>> >
>>> > The Racket example yields running times of around 500 ms (running on
>>> Racket
>>> > v6.0.1) while the Python example yields running times of around 330 ms
>>> > (running on Python 2.7.3).
>>> >
>>> > I find this unusual because Python is somewhat more dynamic than
>>> Racket,
>>> > since
>>> > (a) Python's equality and hashing functions have to dispatched at
>>> runtime
>>> > for each key;
>>> > (b) referencing and setting values in a Python dict is done using a
>>> very
>>> > general operator, [], whose behaviour also has to be dispatched at
>>> runtime,
>>> > unlike the more specific hash-ref and hash-set! Racket functions.
>>> >
>>> > Is there something I'm missing about Racket's hash tables which
>>> explains
>>> > this slower speed?
>>> >
>>> > Thanks,
>>> > Pedro Ramos
>>> >
>>> > _________________________
>>> >   Racket Developers list:
>>> >   http://lists.racket-lang.org/dev
>>> >
>>> _________________________
>>>   Racket Developers list:
>>>   http://lists.racket-lang.org/dev
>>>
>>
>>
>> _________________________
>>   Racket Developers list:
>>   http://lists.racket-lang.org/dev
>>
>>
>
>
> --
> A warb degombs the brangy. Your gitch zanks and leils the warb.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/dev/archive/attachments/20140725/2b94bccd/attachment.html>

Posted on the dev mailing list.