[racket-dev] Wow; racket master at least 2x faster than 5.3.1 on my rb tree benchmark?
Score another one for random testing! :)
On Sun, Nov 18, 2012 at 10:26 PM, Danny Yoo <dyoo at hashcollision.org> wrote:
>
>
> On Sun, Nov 18, 2012 at 4:24 PM, Pierpaolo Bernardi <olopierpa at gmail.com>
> wrote:
>>
>> How does compare to builtin mutable hashes?
>
>
>
> The following code represents a rough hashtable equivalent of what my rb
> code would be enabling (quick search for word by position):
>
>
> ;; We might be curious as to the overhead of the tree structure.
> ;; (of course, it's worth it because we are dealing with a dynamic set
> here.)
> ;; Still, let's compare with inserting into a native hash:
> (printf "just for comparison: inserting in a native hash:\n")
> (let ([ht (make-hash)])
> (time
> (for/fold ([acc 0]) ([word (in-list (force all-words))])
> (hash-set! ht acc word)
> (+ acc (string-length word)))))
>
>
> It's also useful to compare this vs the existing splay tree approach in
> syntax-color/token-tree:
>
> (printf "just for comparison: inserting in the original token
> tree:\n")
> (let ([t (new token-tree%)])
> (time
> (for ([word (in-list (force all-words))])
> (insert-last-spec! t (string-length word) word))))
>
>
> Here's the output of the insertion benchmark:
>
> ;; (from the rb-tree insertion)
> inserting 235886 words at the end...
> cpu time: 204 real time: 205 gc time: 0
>
> just for comparison: inserting in a native hash:
> cpu time: 108 real time: 107 gc time: 0
>
> just for comparison: inserting in the original token tree:
> cpu time: 51 real time: 51 gc time: 0
>
>
> So compared to the rb-tree version, insertions into the hashtable are about
> twice as fast. And as one might expect, the splay tree bulk insertion is
> the fastest: it doesn't deal with balance at insertion time and can it delay
> that work until we start searching the structure.
>
> The rb-tree (as well as the original splay code) allows for much more
> flexible searches and dynamic updates into the sequence structure than the
> hash, so it's definitely worth the extra complexity. My conjecture is that
> the non-allocating nature of the rb-tree, as well as its always-balanced
> structure, will be worth the cost of the extra insertion time vs splays. I
> just hope I can show it! :) I'll see tomorrow when I code up a token-tree%
> implementation and start measuring times in DrRacket.
>
> I just got done with the fundamental rb-tree data structures this evening.
> Thank goodness Robby strongly recommended me to write a randomized testing
> approach. It caught a lot of things I hadn't even considered.
>
> _________________________
> Racket Developers list:
> http://lists.racket-lang.org/dev
>