[racket] Number-crunching...

From: Sam Tobin-Hochstadt (samth at cs.indiana.edu)
Date: Tue Nov 12 09:05:51 EST 2013

I now know why your custom RNG was slow.  Racket fixnums are (a)
signed and (b) have a width 1 less than the width of the host
architecture: [1]

-> (fixnum? (- (expt 2 64) 1))
#f
-> (fixnum? (- (expt 2 63) 1))
#f
-> (fixnum? (- (expt 2 62) 1))
#t

Therefore, your random number generation calculation is constantly
overflowing into bignums and allocating.

[1] http://www.cs.utah.edu/plt/snapshots/current/doc/reference/numbers.html?q=fixnum#%28tech._fixnum%29

Sam

On Tue, Nov 12, 2013 at 8:59 AM, Sam Tobin-Hochstadt
<samth at cs.indiana.edu> wrote:
> Here's a start: https://gist.github.com/samth/7431213
>
> This is a Typed Racket version of your program, converted to use
> flvectors instead of f64vectors. I haven't taken a look at the code
> for your random number generator yet, but you could try doing the
> timing by creating a large array of random numbers ahead of time, and
> then timing the loop just referencing into that array.
>
> Sam
>
> On Tue, Nov 12, 2013 at 7:52 AM, William Cushing
> <william.cushing at gmail.com> wrote:
>> I'm in the process of benchmarking various systems for their ability to
>> handle the inner loop of compilations of MCMC from a higher level
>> probabilistic modeling language (BLOG).
>>
>> Attached are examples of the kind of output (instantiations of MCMC) we'd
>> like to be able to generate given the tried-and-true Bayes Net linking the
>> probabilities of earthquake, burglary, john calling, and mary calling.
>> Classic AI fare.
>>
>> Predictably, C is fastest.  But Stalin and SBCL also put in a very fine
>> showing.
>>
>> For 50,000,000 iterations on my machine, I get the following semi-serious,
>> semi-bogus timings for a random assortment of systems (petite chez just for
>> fun):
>>
>> 1 SBCL 2.20
>> 2 Bigloo 5.25
>> 3 Chicken 11.49
>> 4 Stalin 2.15
>> 5 Racket 16.67
>> 6 Petite-Chez 30.42
>> 7 GCC 0.82
>> 8 Clang 0.92
>>
>> There is much about the comparison that isn't fair.  Chiefly, this is
>> primarily a test of speed of random number generation, but I haven't
>> controlled for the algorithm being used to generate random numbers.  Stalin
>> may very well be using GLIBC random number generation, for example.  I know
>> SBCL uses MT; in fact, the SBCL experts responded to my little encoding by
>> halving the shown runtime figure via dynamically linking in a new and
>> improved (SIMD-aware) implementation of MT.  Which is quite impressive, as
>> that puts it within 30% of my C version...and my C version is "totally
>> cheating" by employing a simple little LCPRNG.
>>
>> Porting the same generator to Racket in the obvious way more than triples
>> its running time. (!)
>> (Is it the global var?  Is the conversion to double being compiled as a real
>> division?)
>>
>> In any case, even just sticking with Racket's default PRNG, it seems the
>> performance of a simple number crunching loop leaves something to be
>> desired.  I have done what I can to try and shave off time here and there.
>> Using unsafe ops, and threading most of the state through the parameters of
>> a tail-recursive loop (thus avoiding set!), I managed to shave off 3 seconds
>> from my original encoding (so about 15%), which isn't too shabby.  Of
>> course, for even less effort, Stalin is already 800% times faster.
>>
>> I couldn't get typed/racket to type check, and /no-check doesn't seem to
>> help performance at all (not surprising).  I suspect that getting
>> typed/racket to be able to process this file might help substantially...but
>> I don't really understand its type theory enough to help it figure this
>> benchmark out.
>>
>> I though Racket experts might have some suggestions for how to convince
>> Racket to run a simple number crunching loop without quite so much in the
>> way of constant-factor slowdown.
>>
>> -Will
>>
>>
>> ____________________
>>   Racket Users list:
>>   http://lists.racket-lang.org/users
>>

Posted on the users mailing list.