[plt-scheme] help on how to write a frequency-counting function in a more functional way

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Sun May 3 15:15:07 EDT 2009

At Fri, 1 May 2009 13:20:39 -0400, Doug Orleans wrote:
> On Wed, Apr 29, 2009 at 5:53 PM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> > At Tue, 28 Apr 2009 23:02:37 -0400, Doug Orleans wrote:
> >> On Sun, Apr 19, 2009 at 11:56 AM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> >> > The performance is close to the binary-tree performance, probably
> >> > because that's how functional hash tables are implemented internally
> >> > (but as red-black trees, so they're always balanced).
> >>
> >> I had been naively assuming that immutable hash tables were as
> >> efficient as mutable ones.  It would be good to say more about their
> >> relative performance in the reference manual than just "immutable
> >> tables support constant-time functional update."
> >
> > Ok, I extended the note there.
> (There's a "the the" in the extended note.)  I guess my main question
> was, how do hash-remove and hash-remove! compare? 

The difference would be similar to adding to the table: the functional
one will be slower.

> And, come to think
> of it, I don't understand how hash-ref and hash-set are constant-time
> if they use red-black trees.

Ok, it's actually O(log N) for N elements. But since all machines
support much less than 2^64 objects, we just call it constant.

Here are the times on my machine for building a table of 32 fixnums
50000 times through building a table of 320000 fixnums 5 times:

 cpu time: 1539 real time: 1548 gc time: 111
 cpu time: 2442 real time: 2457 gc time: 197
 cpu time: 3535 real time: 3554 gc time: 387
 cpu time: 4851 real time: 4865 gc time: 785
 cpu time: 5872 real time: 5884 gc time: 902

Compared to mutable hash tables:

 cpu time: 563 real time: 565 gc time: 18
 cpu time: 507 real time: 509 gc time: 19
 cpu time: 664 real time: 666 gc time: 27
 cpu time: 640 real time: 642 gc time: 80
 cpu time: 725 real time: 726 gc time: 209

In the immutable case, it clearly takes longer for the same number of
addition with more items in each table, and the difference is bigger
than GC and collision effects. But the difference over x10,000 items is
still at a scale that I would could call an "optimization" for small
tables. :)

Posted on the users mailing list.