[racket] random big nat, please
This would be an awesome post for the Racket blog, plus the customary
links to the post on Twitter, G+, HN, etc.
The topic is interesting in general, as well as being about Racket in
particular. I think that combination can be an effective way for
people to discover they're interested in Racket.
On Mon, Apr 8, 2013 at 3:47 PM, Neil Toronto <neil.toronto at gmail.com> wrote:
> Moving this to the list because it seems general-interest-y.
>
> On 04/02/2013 11:08 AM, Robby Findler wrote:
>>
>> Hi Neil; any advice on how to pick, at random, an natural number less
>> than
>> 6047071558442633408797999878063656490199914790037659740579688354375333108700179625697869824,
>> or other large integers?
>>
>> I wouldn't mind uniformly random (but if I get to make wishes, I'd like
>> to avoid small numbers, large numbers, and numbers near n/2).
>>
>> I tried looking in the finite distribution and integer distribution
>> sections of the math library but didn't see anything.
>
>
> It's not in `math/distributions' because that module doesn't have integer
> distributions yet. (Well, it does, in the style of R, where the "integers"
> are actually flonums. They're fast, but probably not exact enough, or may
> not return random numbers big enough, for what you want them for.)
>
> The random big integer functions are sort of hidden in `math/base', as
> `random-natural', `random-integer' and `random-bits'. The last is the
> fastest; use it if you want a natural number less than some power of 2.
>
> The distributions for those are uniform, so to get your wishes, you'll have
> to be a bit tricky. Here's one way: add them. If you add some number p of
> uniformly distributed numbers, as p increases, the distribution of the sum
> approaches a bell shape, which sounds like what you want. For example,
> adding two results in a triangular distribution:
>
> #lang racket
>
> (require plot)
>
> (define is (build-list 20000 (λ (_) (random 100))))
> (define js (build-list 20000 (λ (_) (random 100))))
> (plot (density (map + is js)))
>
>
> Here's a more general function that chooses large natural numbers < n by
> adding some number of uniformly random naturals:
>
> (require math)
>
> (define random-natural-parts 3)
>
> (define (my-random-natural n)
> (define p (min n random-natural-parts))
> (define m (quotient n p))
> (define m-last (- n (* m (- p 1))))
> (+ (apply + (build-list (- p 1) (λ (_) (random-natural (+ m 1)))))
> (random-natural m-last)))
>
> (define xs
> (build-list 20000 (λ (_) (my-random-natural
> 1512234623459871234987592388613))))
> (plot (density xs))
> (printf "max xs = ~v~n" (apply max xs))
> (printf "min xs = ~v~n" (apply min xs))
>
>
> It should work for all n >= 1. Use `random-natural-parts' to control the
> shape: 1 for uniform, 2 for triangular, 3 for piecewise quadratic, etc.,
> etc. The higher this number is, the less probable the numbers near 0 and n
> will be.
>
> --
>
> To probabilistically avoid some kinds of numbers, you could use rejection
> sampling. If you wanted all powers of two to be 1/4 the probability they
> normally would be, you could accept them with probability 1/4 like this:
>
> (define (my-random-natural* n)
> (define x (my-random-natural n))
> (cond [(or (not (power-of-two? x)) ((random) . < . 1/4)) x]
> [else (my-random-natural* n)]))
>
> ;; Plot a histogram
> (define-values (ys cs)
> (let*-values
> ([(ys) (build-list 20000 (λ (_) (my-random-natural* 40)))]
> [(ys cs) (count-samples ys)])
> (sort-samples < ys cs)))
>
> (plot (discrete-histogram (map list ys cs)))
>
>
> Here, "rejection" means trying again, since you always want an answer.
>
> To avoid numbers *near* a power of two, make the acceptance probability a
> function of the distance to the nearest power of two. We'll need a "nearest
> power of two" function first:
>
> (define (floor-log2 x)
> (- (integer-length x) 1))
>
> (define (ceiling-log2 x)
> (integer-length (- x 1)))
>
> (define (nearest-pow2 x)
> (cond [(zero? x) 1]
> [else
> (define x0 (expt 2 (floor-log2 x)))
> (define x1 (expt 2 (ceiling-log2 x)))
> (if ((abs (- x x0)) . < . (abs (- x x1))) x0 x1)]))
>
>
> Then make and use a function that returns acceptance probabilities:
>
> (define pow2-prob 1/4)
> (define pow2-spread 4)
>
> (define (accept-prob x)
> (define pow2 (nearest-pow2 x))
> (define d (min 1 (/ (abs (- x pow2)) pow2-spread)))
> ;; A convex combination of 1 and pow2-prob, with fraction d
> (+ d (* (- 1 d) pow2-prob)))
>
> (define (my-random-natural* n)
> (define x (my-random-natural n))
> (cond [((random) . < . (accept-prob x)) x]
> [else (my-random-natural* n)]))
>
>
> Here, `pow2-prob' is the probability a power of two is accepted, and
> `pow2-spread' is the minimum distance an x can be from a power of two to
> always be accepted.
>
>> (map accept-prob '(16 17 18 19 20 21))
> '(1/4 7/16 5/8 13/16 1 1)
>
> The `accept-prob' function can be anything, as long as it returns
> probabilities, and doesn't return small probabilities so often that
> `my-random-natural*' spends a long time looping.
>
> Neil ⊥
>
> ____________________
> Racket Users list:
> http://lists.racket-lang.org/users