[racket-dev] Racket hash tables vs. Python dicts - Performance
Hi,
I've been developing an implementation of Python in Racket, where I'm
implementing Python's dictionaries over Racket custom hash tables.
While my occasional benchmarks typically show better performance on Racket
programs than their Python equivalents, Racket's hash tables generally seem
to be slower than Python's dicts.
I've set up this benchmark in Racket:
#lang racket
(define alphabet "abcdefghijklmnopqrstuvwxyz")
(define (random-word n)
  (build-string n (lambda (x) (string-ref alphabet (random 23)))))
(define words (for/list ([k 1000000])
                (random-word 3)))
(define d (make-hash))
(time (for ([w words])
        (if (hash-has-key? d w)
            (hash-set! d w (add1 (hash-ref d w)))
            (hash-set! d w 1))))
And its equivalent in Python:
import random
import time
alphabet = "abcdefghijklmnopqrstuvwxyz"
def random_word(n):
  return ''.join([random.choice(alphabet) for i in range(n)])
words = [random_word(3) for k in xrange(1000000)]
d = {}
a = time.time()
for w in words:
  if w in d:
    d[w] = d[w] + 1
  else:
    d[w] = 1
b = time.time()
print b-a, 'seconds'
The Racket example yields running times of around 500 ms (running on Racket
v6.0.1) while the Python example yields running times of around 330 ms
(running on Python 2.7.3).
I find this unusual because Python is somewhat more dynamic than Racket,
since
(a) Python's equality and hashing functions have to dispatched at runtime
for each key;
(b) referencing and setting values in a Python dict is done using a very
general operator, [], whose behaviour also has to be dispatched at runtime,
unlike the more specific hash-ref and hash-set! Racket functions.
Is there something I'm missing about Racket's hash tables which explains
this slower speed?
Thanks,
Pedro Ramos
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/dev/archive/attachments/20140723/e427a2c0/attachment.html>