[plt-scheme] please help with generating random permutation

From: Will M Farr (farr at MIT.EDU)
Date: Thu Sep 10 16:54:58 EDT 2009

Sigrid,

Addressing your question

On Sep 10, 2009, at 4:31 PM, keydana at gmx.de wrote:

> Will, thanks for explaining the statistics. It's a stupid thing to  
> ask, but... to check with an appropriate sample I really have to re- 
> execute the script independently each time, correct? Because if I do  
> it in a loop like I tried out quickly, the pseudo random number  
> generator will use it's internal state, whereas what I need to know  
> is the outcome on each first shuffle only? Sorry for asking such a  
> simple thing, really...

No, you do not have to restart the script for each sample.  The pseudo- 
random number generation used by (random) and friends in PLT Scheme is  
quite good (google "L’Ecuyer’s MRG32k3a" for more information).  One  
of the (many) properties that good generators have is that their  
outputs after each call to (random) are statistically un-correlated.   
In general, it is impossible to tell via statistical tests whether a  
generator has been randomly re-seeded between outputs, or is  
continuing the same stream of random numbers (at least until you are  
approaching the "period" of the generator, at which point the stream  
repeats---the period of MRG32k3a is approximately 2^91, so this is not  
an issue for you).  So, you should have no problem with looping within  
the script, instead of re-running it for each sample.

In fact, it could be worse to re-run the script for each sample: note  
that I say above *randomly* re-seeded (e.g. by reading some bytes  
from /dev/random or other true random source to generate an integer  
seed).  If you run the script many times in a row, supposedly  
independently, it is possible (though unlikely) that several of the  
runs will take place within one millisecond, in which case the default  
seed for the random number generator of (current-milliseconds) will  
result in the *same* stream of random numbers across the different  
runs---not what you want.  I usually re-seed the generator using code  
like:

(with-input-from-file "/dev/random"
   (lambda ()
     (let loop ()
       (let ((n (integer-bytes->integer
                 (read-bytes 4)
                 #f))) ; #f = unsigned
         (if (not (and (>= n 1)
                       (<= n (sub1 (expt 2 31)))))
             (loop) ; Invalid argument for (random-seed n)
             (random-seed n))))))

before I begin sampling from (random).  (If you're not familiar with  
unix/macosx systems, /dev/random is a "file" that uses things like the  
precise number of clock ticks between keystrokes, I/O interrupts,  
network interrupts, etc, to provide a physical source of randomness.)

By the way, one of the tests in the classic test suite for pseudo- 
random number generators called DIEHARD involves shuffling cards and  
looking for correlations in the resulting decks, which is essentially  
what you are doing.  The PRNG used by PLT Scheme passes this test (and  
the others in the DIEHARD) suite with flying colors.

Good luck!

Will
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 203 bytes
Desc: This is a digitally signed message part
URL: <http://lists.racket-lang.org/users/archive/attachments/20090910/5c23e984/attachment.sig>

Posted on the users mailing list.