[racket] random big nat, please

From: Joe Gilray (jgilray at gmail.com)
Date: Thu Apr 11 13:53:48 EDT 2013

Really fun stuff.  Thanks Neil.

-Joe


On Mon, Apr 8, 2013 at 2:10 PM, Robby Findler
<robby at eecs.northwestern.edu>wrote:

> Thanks! What a great message!
>
> Robby
>
>
> On Mon, Apr 8, 2013 at 2:47 PM, Neil Toronto <neil.toronto at gmail.com>wrote:
>
>> Moving this to the list because it seems general-interest-y.
>>
>>
>> On 04/02/2013 11:08 AM, Robby Findler wrote:
>>
>>> Hi Neil; any advice on how to pick, at random, an natural number less
>>> than 604707155844263340879799987806**365649019991479003765974057968**
>>> 835437533310870017962569786982**4,
>>> or other large integers?
>>>
>>> I wouldn't mind uniformly random (but if I get to make wishes, I'd like
>>> to avoid small numbers, large numbers, and numbers near n/2).
>>>
>>> I tried looking in the finite distribution and integer distribution
>>> sections of the math library but didn't see anything.
>>>
>>
>> It's not in `math/distributions' because that module doesn't have integer
>> distributions yet. (Well, it does, in the style of R, where the "integers"
>> are actually flonums. They're fast, but probably not exact enough, or may
>> not return random numbers big enough, for what you want them for.)
>>
>> The random big integer functions are sort of hidden in `math/base', as
>> `random-natural', `random-integer' and `random-bits'. The last is the
>> fastest; use it if you want a natural number less than some power of 2.
>>
>> The distributions for those are uniform, so to get your wishes, you'll
>> have to be a bit tricky. Here's one way: add them. If you add some number p
>> of uniformly distributed numbers, as p increases, the distribution of the
>> sum approaches a bell shape, which sounds like what you want. For example,
>> adding two results in a triangular distribution:
>>
>> #lang racket
>>
>> (require plot)
>>
>> (define is (build-list 20000 (λ (_) (random 100))))
>> (define js (build-list 20000 (λ (_) (random 100))))
>> (plot (density (map + is js)))
>>
>>
>> Here's a more general function that chooses large natural numbers < n by
>> adding some number of uniformly random naturals:
>>
>> (require math)
>>
>> (define random-natural-parts 3)
>>
>> (define (my-random-natural n)
>>   (define p (min n random-natural-parts))
>>   (define m (quotient n p))
>>   (define m-last (- n (* m (- p 1))))
>>   (+ (apply + (build-list (- p 1) (λ (_) (random-natural (+ m 1)))))
>>      (random-natural m-last)))
>>
>> (define xs
>>   (build-list 20000 (λ (_) (my-random-natural
>>                             151223462345987123498759238861**3))))
>> (plot (density xs))
>> (printf "max xs = ~v~n" (apply max xs))
>> (printf "min xs = ~v~n" (apply min xs))
>>
>>
>> It should work for all n >= 1. Use `random-natural-parts' to control the
>> shape: 1 for uniform, 2 for triangular, 3 for piecewise quadratic, etc.,
>> etc. The higher this number is, the less probable the numbers near 0 and n
>> will be.
>>
>> --
>>
>> To probabilistically avoid some kinds of numbers, you could use rejection
>> sampling. If you wanted all powers of two to be 1/4 the probability they
>> normally would be, you could accept them with probability 1/4 like this:
>>
>> (define (my-random-natural* n)
>>   (define x (my-random-natural n))
>>   (cond [(or (not (power-of-two? x)) ((random) . < . 1/4))  x]
>>         [else  (my-random-natural* n)]))
>>
>> ;; Plot a histogram
>> (define-values (ys cs)
>>   (let*-values
>>       ([(ys)     (build-list 20000 (λ (_) (my-random-natural* 40)))]
>>        [(ys cs)  (count-samples ys)])
>>     (sort-samples < ys cs)))
>>
>> (plot (discrete-histogram (map list ys cs)))
>>
>>
>> Here, "rejection" means trying again, since you always want an answer.
>>
>> To avoid numbers *near* a power of two, make the acceptance probability a
>> function of the distance to the nearest power of two. We'll need a "nearest
>> power of two" function first:
>>
>> (define (floor-log2 x)
>>   (- (integer-length x) 1))
>>
>> (define (ceiling-log2 x)
>>   (integer-length (- x 1)))
>>
>> (define (nearest-pow2 x)
>>   (cond [(zero? x)  1]
>>         [else
>>          (define x0 (expt 2 (floor-log2 x)))
>>          (define x1 (expt 2 (ceiling-log2 x)))
>>          (if ((abs (- x x0)) . < . (abs (- x x1))) x0 x1)]))
>>
>>
>> Then make and use a function that returns acceptance probabilities:
>>
>> (define pow2-prob 1/4)
>> (define pow2-spread 4)
>>
>> (define (accept-prob x)
>>   (define pow2 (nearest-pow2 x))
>>   (define d (min 1 (/ (abs (- x pow2)) pow2-spread)))
>>   ;; A convex combination of 1 and pow2-prob, with fraction d
>>   (+ d (* (- 1 d) pow2-prob)))
>>
>> (define (my-random-natural* n)
>>   (define x (my-random-natural n))
>>   (cond [((random) . < . (accept-prob x))  x]
>>         [else  (my-random-natural* n)]))
>>
>>
>> Here, `pow2-prob' is the probability a power of two is accepted, and
>> `pow2-spread' is the minimum distance an x can be from a power of two to
>> always be accepted.
>>
>> > (map accept-prob '(16 17 18 19 20 21))
>> '(1/4 7/16 5/8 13/16 1 1)
>>
>> The `accept-prob' function can be anything, as long as it returns
>> probabilities, and doesn't return small probabilities so often that
>> `my-random-natural*' spends a long time looping.
>>
>> Neil ⊥
>>
>>
>
> ____________________
>   Racket Users list:
>   http://lists.racket-lang.org/users
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/users/archive/attachments/20130411/f61e2d8b/attachment-0001.html>

Posted on the users mailing list.