[racket] random big nat, please

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Thu Apr 11 15:18:08 EDT 2013

FWIW, I ended up with this function:

(define (pick-a-maze maze-count)
  (+ (if (zero? (random 2))
         (/ maze-count 2)
         0)
     (random-natural (/ maze-count 4))
     (random-natural (/ maze-count 4))))

which seems to be working out great.

Robby


On Thu, Apr 11, 2013 at 12:53 PM, Joe Gilray <jgilray at gmail.com> wrote:

> Really fun stuff.  Thanks Neil.
>
> -Joe
>
>
> On Mon, Apr 8, 2013 at 2:10 PM, Robby Findler <robby at eecs.northwestern.edu
> > wrote:
>
>> Thanks! What a great message!
>>
>> Robby
>>
>>
>> On Mon, Apr 8, 2013 at 2:47 PM, Neil Toronto <neil.toronto at gmail.com>wrote:
>>
>>> Moving this to the list because it seems general-interest-y.
>>>
>>>
>>> On 04/02/2013 11:08 AM, Robby Findler wrote:
>>>
>>>> Hi Neil; any advice on how to pick, at random, an natural number less
>>>> than 604707155844263340879799987806**365649019991479003765974057968**
>>>> 835437533310870017962569786982**4,
>>>> or other large integers?
>>>>
>>>> I wouldn't mind uniformly random (but if I get to make wishes, I'd like
>>>> to avoid small numbers, large numbers, and numbers near n/2).
>>>>
>>>> I tried looking in the finite distribution and integer distribution
>>>> sections of the math library but didn't see anything.
>>>>
>>>
>>> It's not in `math/distributions' because that module doesn't have
>>> integer distributions yet. (Well, it does, in the style of R, where the
>>> "integers" are actually flonums. They're fast, but probably not exact
>>> enough, or may not return random numbers big enough, for what you want them
>>> for.)
>>>
>>> The random big integer functions are sort of hidden in `math/base', as
>>> `random-natural', `random-integer' and `random-bits'. The last is the
>>> fastest; use it if you want a natural number less than some power of 2.
>>>
>>> The distributions for those are uniform, so to get your wishes, you'll
>>> have to be a bit tricky. Here's one way: add them. If you add some number p
>>> of uniformly distributed numbers, as p increases, the distribution of the
>>> sum approaches a bell shape, which sounds like what you want. For example,
>>> adding two results in a triangular distribution:
>>>
>>> #lang racket
>>>
>>> (require plot)
>>>
>>> (define is (build-list 20000 (λ (_) (random 100))))
>>> (define js (build-list 20000 (λ (_) (random 100))))
>>> (plot (density (map + is js)))
>>>
>>>
>>> Here's a more general function that chooses large natural numbers < n by
>>> adding some number of uniformly random naturals:
>>>
>>> (require math)
>>>
>>> (define random-natural-parts 3)
>>>
>>> (define (my-random-natural n)
>>>   (define p (min n random-natural-parts))
>>>   (define m (quotient n p))
>>>   (define m-last (- n (* m (- p 1))))
>>>   (+ (apply + (build-list (- p 1) (λ (_) (random-natural (+ m 1)))))
>>>      (random-natural m-last)))
>>>
>>> (define xs
>>>   (build-list 20000 (λ (_) (my-random-natural
>>>                             151223462345987123498759238861**3))))
>>> (plot (density xs))
>>> (printf "max xs = ~v~n" (apply max xs))
>>> (printf "min xs = ~v~n" (apply min xs))
>>>
>>>
>>> It should work for all n >= 1. Use `random-natural-parts' to control the
>>> shape: 1 for uniform, 2 for triangular, 3 for piecewise quadratic, etc.,
>>> etc. The higher this number is, the less probable the numbers near 0 and n
>>> will be.
>>>
>>> --
>>>
>>> To probabilistically avoid some kinds of numbers, you could use
>>> rejection sampling. If you wanted all powers of two to be 1/4 the
>>> probability they normally would be, you could accept them with probability
>>> 1/4 like this:
>>>
>>> (define (my-random-natural* n)
>>>   (define x (my-random-natural n))
>>>   (cond [(or (not (power-of-two? x)) ((random) . < . 1/4))  x]
>>>         [else  (my-random-natural* n)]))
>>>
>>> ;; Plot a histogram
>>> (define-values (ys cs)
>>>   (let*-values
>>>       ([(ys)     (build-list 20000 (λ (_) (my-random-natural* 40)))]
>>>        [(ys cs)  (count-samples ys)])
>>>     (sort-samples < ys cs)))
>>>
>>> (plot (discrete-histogram (map list ys cs)))
>>>
>>>
>>> Here, "rejection" means trying again, since you always want an answer.
>>>
>>> To avoid numbers *near* a power of two, make the acceptance probability
>>> a function of the distance to the nearest power of two. We'll need a
>>> "nearest power of two" function first:
>>>
>>> (define (floor-log2 x)
>>>   (- (integer-length x) 1))
>>>
>>> (define (ceiling-log2 x)
>>>   (integer-length (- x 1)))
>>>
>>> (define (nearest-pow2 x)
>>>   (cond [(zero? x)  1]
>>>         [else
>>>          (define x0 (expt 2 (floor-log2 x)))
>>>          (define x1 (expt 2 (ceiling-log2 x)))
>>>          (if ((abs (- x x0)) . < . (abs (- x x1))) x0 x1)]))
>>>
>>>
>>> Then make and use a function that returns acceptance probabilities:
>>>
>>> (define pow2-prob 1/4)
>>> (define pow2-spread 4)
>>>
>>> (define (accept-prob x)
>>>   (define pow2 (nearest-pow2 x))
>>>   (define d (min 1 (/ (abs (- x pow2)) pow2-spread)))
>>>   ;; A convex combination of 1 and pow2-prob, with fraction d
>>>   (+ d (* (- 1 d) pow2-prob)))
>>>
>>> (define (my-random-natural* n)
>>>   (define x (my-random-natural n))
>>>   (cond [((random) . < . (accept-prob x))  x]
>>>         [else  (my-random-natural* n)]))
>>>
>>>
>>> Here, `pow2-prob' is the probability a power of two is accepted, and
>>> `pow2-spread' is the minimum distance an x can be from a power of two to
>>> always be accepted.
>>>
>>> > (map accept-prob '(16 17 18 19 20 21))
>>> '(1/4 7/16 5/8 13/16 1 1)
>>>
>>> The `accept-prob' function can be anything, as long as it returns
>>> probabilities, and doesn't return small probabilities so often that
>>> `my-random-natural*' spends a long time looping.
>>>
>>> Neil ⊥
>>>
>>>
>>
>> ____________________
>>   Racket Users list:
>>   http://lists.racket-lang.org/users
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/users/archive/attachments/20130411/2f62dd56/attachment.html>

Posted on the users mailing list.