[plt-scheme] help on how to write a frequency-counting function in a more functional way

From: David Van Horn (dvanhorn at ccs.neu.edu)
Date: Mon Apr 20 13:13:10 EDT 2009

Matthias Felleisen wrote:
> As soon as you have 100,000 frequencies in an approximate range of 
> 20,000 steps, the functional solutions don't look that good compared to 
> vectors. With 1,000,000 it's unquestionable. I bet you can repeat this 
> experiment in C and get similar results. In a sparse world, the results 
> are indifferent.
> 
> I will leave it to Sigrid to play with imperative hashes and balanced 
> trees and who knows what.
> 
> The code is attached. It should be easy to play with other parameters of 
> the experiment

Here's one more using functional random-access lists:

(require (prefix-in ra: (planet dvanhorn/ralist:1:7)))

(define (cnt-ra f g)
   (cnt-fp
    f g
    (lambda () (ra:make-list (- HIGH LOW) 0))
    (lambda (ls freq)
      (ra:list-update ls (- freq LOW) add1))
    (lambda (ls)
      (out
       g (lambda () (for ((v (ra:in-list ls))
                          (i (in-range (- HIGH LOW))))
                      (unless (= v 0) (printf F (+ 80 i) v))))))))

[Add (test "ralist:" cnt-ra "tmp-ra") to `experement'.]

For large sizes, the performance is worse than hash tables but better 
than BSTs.

David

1000 @ vector: cpu time: 18 real time: 29 gc time: 0
1000 @ a list: cpu time: 16 real time: 19 gc time: 0
1000 @ bst   : cpu time: 25 real time: 27 gc time: 0
1000 @ hash  : cpu time: 13 real time: 13 gc time: 0
1000 @ ralist: cpu time: 30 real time: 31 gc time: 0

10000 @ vector: cpu time: 94 real time: 102 gc time: 0
10000 @ a list: cpu time: 93 real time: 96 gc time: 0
10000 @ bst   : cpu time: 196 real time: 200 gc time: 0
10000 @ hash  : cpu time: 108 real time: 111 gc time: 0
10000 @ ralist: cpu time: 147 real time: 151 gc time: 0

100000 @ vector: cpu time: 540 real time: 560 gc time: 0
100000 @ a list: cpu time: 740 real time: 763 gc time: 0
100000 @ bst   : cpu time: 1967 real time: 2014 gc time: 51
100000 @ hash  : cpu time: 1036 real time: 1160 gc time: 96
100000 @ ralist: cpu time: 1217 real time: 1255 gc time: 48

1000000 @ vector: cpu time: 4373 real time: 4475 gc time: 0
1000000 @ a list: cpu time: 9155 real time: 9357 gc time: 1768
1000000 @ bst   : cpu time: 19721 real time: 20158 gc time: 967
1000000 @ hash  : cpu time: 9978 real time: 10271 gc time: 1269
1000000 @ ralist: cpu time: 11272 real time: 11537 gc time: 635


Posted on the users mailing list.