[racket] Number-crunching...

From: William Cushing (william.cushing at gmail.com)
Date: Tue Nov 12 07:52:09 EST 2013

I'm in the process of benchmarking various systems for their ability to
handle the inner loop of compilations of MCMC from a higher level
probabilistic modeling language (BLOG).

Attached are examples of the kind of output (instantiations of MCMC) we'd
like to be able to generate given the tried-and-true Bayes Net linking the
probabilities of earthquake, burglary, john calling, and mary calling.
Classic AI fare.

Predictably, C is fastest.  But Stalin and SBCL also put in a very fine
showing.

For 50,000,000 iterations on my machine, I get the following semi-serious,
semi-bogus timings for a random assortment of systems (petite chez just for
fun):

1 SBCL 2.20
2 Bigloo 5.25
3 Chicken 11.49
4 Stalin 2.15
5 Racket 16.67
6 Petite-Chez 30.42
7 GCC 0.82
8 Clang 0.92

There is much about the comparison that isn't fair.  Chiefly, this is
primarily a test of speed of random number generation, but I haven't
controlled for the algorithm being used to generate random numbers.  Stalin
may very well be using GLIBC random number generation, for example.  I know
SBCL uses MT; in fact, the SBCL experts responded to my little encoding by
halving the shown runtime figure via dynamically linking in a new and
improved (SIMD-aware) implementation of MT.  Which is quite impressive, as
that puts it within 30% of my C version...and my C version is "totally
cheating" by employing a simple little LCPRNG.

Porting the same generator to Racket in the obvious way more than triples
its running time. (!)
(Is it the global var?  Is the conversion to double being compiled as a
real division?)

In any case, even just sticking with Racket's default PRNG, it seems the
performance of a simple number crunching loop leaves something to be
desired.  I have done what I can to try and shave off time here and there.
Using unsafe ops, and threading most of the state through the parameters of
a tail-recursive loop (thus avoiding set!), I managed to shave off 3
seconds from my original encoding (so about 15%), which isn't too shabby.
Of course, for even less effort, Stalin is already 800% times faster.

I couldn't get typed/racket to type check, and /no-check doesn't seem to
help performance at all (not surprising).  I suspect that getting
typed/racket to be able to process this file might help substantially...but
I don't really understand its type theory enough to help it figure this
benchmark out.

I though Racket experts might have some suggestions for how to convince
Racket to run a simple number crunching loop without quite so much in the
way of constant-factor slowdown.

-Will
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/users/archive/attachments/20131112/9d6e7070/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcmc.rkt
Type: application/octet-stream
Size: 5251 bytes
Desc: not available
URL: <http://lists.racket-lang.org/users/archive/attachments/20131112/9d6e7070/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcmc.stalin.scm
Type: text/x-scheme
Size: 2307 bytes
Desc: not available
URL: <http://lists.racket-lang.org/users/archive/attachments/20131112/9d6e7070/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcmc.c
Type: text/x-csrc
Size: 5258 bytes
Desc: not available
URL: <http://lists.racket-lang.org/users/archive/attachments/20131112/9d6e7070/attachment-0003.bin>

Posted on the users mailing list.