[racket] random big nat, please

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Mon Apr 8 17:10:51 EDT 2013

Thanks! What a great message!

Robby


On Mon, Apr 8, 2013 at 2:47 PM, Neil Toronto <neil.toronto at gmail.com> wrote:

> Moving this to the list because it seems general-interest-y.
>
>
> On 04/02/2013 11:08 AM, Robby Findler wrote:
>
>> Hi Neil; any advice on how to pick, at random, an natural number less
>> than 604707155844263340879799987806**365649019991479003765974057968**
>> 835437533310870017962569786982**4,
>> or other large integers?
>>
>> I wouldn't mind uniformly random (but if I get to make wishes, I'd like
>> to avoid small numbers, large numbers, and numbers near n/2).
>>
>> I tried looking in the finite distribution and integer distribution
>> sections of the math library but didn't see anything.
>>
>
> It's not in `math/distributions' because that module doesn't have integer
> distributions yet. (Well, it does, in the style of R, where the "integers"
> are actually flonums. They're fast, but probably not exact enough, or may
> not return random numbers big enough, for what you want them for.)
>
> The random big integer functions are sort of hidden in `math/base', as
> `random-natural', `random-integer' and `random-bits'. The last is the
> fastest; use it if you want a natural number less than some power of 2.
>
> The distributions for those are uniform, so to get your wishes, you'll
> have to be a bit tricky. Here's one way: add them. If you add some number p
> of uniformly distributed numbers, as p increases, the distribution of the
> sum approaches a bell shape, which sounds like what you want. For example,
> adding two results in a triangular distribution:
>
> #lang racket
>
> (require plot)
>
> (define is (build-list 20000 (λ (_) (random 100))))
> (define js (build-list 20000 (λ (_) (random 100))))
> (plot (density (map + is js)))
>
>
> Here's a more general function that chooses large natural numbers < n by
> adding some number of uniformly random naturals:
>
> (require math)
>
> (define random-natural-parts 3)
>
> (define (my-random-natural n)
>   (define p (min n random-natural-parts))
>   (define m (quotient n p))
>   (define m-last (- n (* m (- p 1))))
>   (+ (apply + (build-list (- p 1) (λ (_) (random-natural (+ m 1)))))
>      (random-natural m-last)))
>
> (define xs
>   (build-list 20000 (λ (_) (my-random-natural
>                             151223462345987123498759238861**3))))
> (plot (density xs))
> (printf "max xs = ~v~n" (apply max xs))
> (printf "min xs = ~v~n" (apply min xs))
>
>
> It should work for all n >= 1. Use `random-natural-parts' to control the
> shape: 1 for uniform, 2 for triangular, 3 for piecewise quadratic, etc.,
> etc. The higher this number is, the less probable the numbers near 0 and n
> will be.
>
> --
>
> To probabilistically avoid some kinds of numbers, you could use rejection
> sampling. If you wanted all powers of two to be 1/4 the probability they
> normally would be, you could accept them with probability 1/4 like this:
>
> (define (my-random-natural* n)
>   (define x (my-random-natural n))
>   (cond [(or (not (power-of-two? x)) ((random) . < . 1/4))  x]
>         [else  (my-random-natural* n)]))
>
> ;; Plot a histogram
> (define-values (ys cs)
>   (let*-values
>       ([(ys)     (build-list 20000 (λ (_) (my-random-natural* 40)))]
>        [(ys cs)  (count-samples ys)])
>     (sort-samples < ys cs)))
>
> (plot (discrete-histogram (map list ys cs)))
>
>
> Here, "rejection" means trying again, since you always want an answer.
>
> To avoid numbers *near* a power of two, make the acceptance probability a
> function of the distance to the nearest power of two. We'll need a "nearest
> power of two" function first:
>
> (define (floor-log2 x)
>   (- (integer-length x) 1))
>
> (define (ceiling-log2 x)
>   (integer-length (- x 1)))
>
> (define (nearest-pow2 x)
>   (cond [(zero? x)  1]
>         [else
>          (define x0 (expt 2 (floor-log2 x)))
>          (define x1 (expt 2 (ceiling-log2 x)))
>          (if ((abs (- x x0)) . < . (abs (- x x1))) x0 x1)]))
>
>
> Then make and use a function that returns acceptance probabilities:
>
> (define pow2-prob 1/4)
> (define pow2-spread 4)
>
> (define (accept-prob x)
>   (define pow2 (nearest-pow2 x))
>   (define d (min 1 (/ (abs (- x pow2)) pow2-spread)))
>   ;; A convex combination of 1 and pow2-prob, with fraction d
>   (+ d (* (- 1 d) pow2-prob)))
>
> (define (my-random-natural* n)
>   (define x (my-random-natural n))
>   (cond [((random) . < . (accept-prob x))  x]
>         [else  (my-random-natural* n)]))
>
>
> Here, `pow2-prob' is the probability a power of two is accepted, and
> `pow2-spread' is the minimum distance an x can be from a power of two to
> always be accepted.
>
> > (map accept-prob '(16 17 18 19 20 21))
> '(1/4 7/16 5/8 13/16 1 1)
>
> The `accept-prob' function can be anything, as long as it returns
> probabilities, and doesn't return small probabilities so often that
> `my-random-natural*' spends a long time looping.
>
> Neil ⊥
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/users/archive/attachments/20130408/699510ec/attachment-0001.html>

Posted on the users mailing list.