[racket-dev] Wow; racket master at least 2x faster than 5.3.1 on my rb tree benchmark?

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Mon Nov 19 08:12:40 EST 2012

Score another one for random testing! :)

On Sun, Nov 18, 2012 at 10:26 PM, Danny Yoo <dyoo at hashcollision.org> wrote:
>
>
> On Sun, Nov 18, 2012 at 4:24 PM, Pierpaolo Bernardi <olopierpa at gmail.com>
> wrote:
>>
>> How does compare to builtin mutable hashes?
>
>
>
> The following code represents a rough hashtable equivalent of what my rb
> code would be enabling (quick search for word by position):
>
>
>       ;; We might be curious as to the overhead of the tree structure.
>       ;; (of course, it's worth it because we are dealing with a dynamic set
> here.)
>       ;; Still, let's compare with inserting into a native hash:
>       (printf "just for comparison: inserting in a native hash:\n")
>       (let ([ht (make-hash)])
>         (time
>           (for/fold ([acc 0]) ([word (in-list (force all-words))])
>             (hash-set! ht acc word)
>             (+ acc (string-length word)))))
>
>
> It's also useful to compare this vs the existing splay tree approach in
> syntax-color/token-tree:
>
>       (printf "just for comparison: inserting in the original token
> tree:\n")
>       (let ([t (new token-tree%)])
>         (time
>           (for ([word (in-list (force all-words))])
>             (insert-last-spec! t (string-length word) word))))
>
>
> Here's the output of the insertion benchmark:
>
>     ;; (from the rb-tree insertion)
>     inserting 235886 words at the end...
>     cpu time: 204 real time: 205 gc time: 0
>
>     just for comparison: inserting in a native hash:
>     cpu time: 108 real time: 107 gc time: 0
>
>     just for comparison: inserting in the original token tree:
>     cpu time: 51 real time: 51 gc time: 0
>
>
> So compared to the rb-tree version, insertions into the hashtable are about
> twice as fast.  And as one might expect, the splay tree bulk insertion is
> the fastest: it doesn't deal with balance at insertion time and can it delay
> that work until we start searching the structure.
>
> The rb-tree (as well as the original splay code) allows for much more
> flexible searches and dynamic updates into the sequence structure than the
> hash, so it's definitely worth the extra complexity.  My conjecture is that
> the non-allocating nature of the rb-tree, as well as its always-balanced
> structure, will be worth the cost of the extra insertion time vs splays.  I
> just hope I can show it!  :)  I'll see tomorrow when I code up a token-tree%
> implementation and start measuring times in DrRacket.
>
> I just got done with the fundamental rb-tree data structures this evening.
> Thank goodness Robby strongly recommended me to write a randomized testing
> approach.  It caught a lot of things I hadn't even considered.
>
> _________________________
>   Racket Developers list:
>   http://lists.racket-lang.org/dev
>

Posted on the dev mailing list.