[racket] Randomized bisimulation tests and Metropolis-Hastings random walk

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Fri Jan 4 17:06:33 EST 2013

Very cool! I've run into this problem a few times.

And here's an example where it happens right now ....

http://drdr.racket-lang.org/26021/collects/tests/future/random-future.rkt

Robby


On Fri, Jan 4, 2013 at 3:45 PM, Neil Toronto <neil.toronto at gmail.com> wrote:

> I get excited about applying statistics to programming. Here's something
> exciting I found today.
>
> I'm working on a Typed Racket implementation of Chris Okasaki's purely
> functional random-access lists, which are O(1) for `cons', `first' and
> `rest', and basically O(log(n)) for random access. I wanted solid
> randomized tests, and I figured the best way would be bisimulation: do
> exactly the same, random thing to a (Listof Integer) and an (RAList
> Integer), then ensure the lists are the same. Iterate N times, each time
> using the altered lists for the next bisimulation step.
>
> There's a problem with this, though: the test runtimes are all over the
> place for any fixed N, and are especially wild with large N. Sometimes it
> first does a bunch of `cons' operations in a row. Because each bisimulation
> step is O(n) in the list length (to check equality), this makes the
> remaining steps very slow. Sometimes it follows up with a bunch of `rest'
> operations, which makes the remaining steps very fast.
>
> More precisely, random bisimulation does a "simple random walk" over test
> list lengths, where `cons' is a step from n to n+1 and `rest' is a step
> from n to n-1. Unfortunately, the n's stepped on in a simple random walk
> have no fixed probability distribution, and thus no fixed average or
> variance. That means each bisimulation step has no fixed average runtime or
> fixed variation in runtime. Wildness ensues.
>
> One possible solution is to generate fresh test lists from a fixed
> distribution, for each bisimulation step. This is a bad solution, though,
> because I want to see whether RAList instances behave correctly *over time*
> as they're operated on.
>
> The right solution is to use a Metropolis-Hastings random walk instead of
> a simple random walk, to sample list lengths from a fixed distribution.
> Then, the list lengths will have a fixed average, meaning that each
> bisimulation step will have a fixed average runtime, and my overall average
> test runtime will be O(N) instead of wild and crazy.
>
> First, I choose a distribution. The geometric family is good because list
> lengths are nonnegative. Geometric with p = 0.05 means list lengths should
> be (1-p)/p = 19 on average, but are empty with probability 0.05 and very
> large with very low probability.
>
> So I add these two lines:
>
>   (require math/distributions)
>
>   (define length-dist (geometric-dist 0.05))
>
> In a Metropolis-Hastings random walk, you step from n to n' with
> probability min(1,P(n')/P(n)); this is called "accepting" the step.
> Otherwise, you "reject" the step by staying in the same spot. Adding this
> new test takes only two defines and a `cond':
>
>   (define r (random 5))  ; Randomly choose an operation
>   (cond
>     [(= r 0)
>      ;; New! Only cons with probability P(n+1)/P(n)
>      (define n (length lst))
>      (define accept (/ (pdf length-dist (+ n 1))
>                        (pdf length-dist n)))
>      (cond [((random) . < . accept)
>             .... Accept: cons and ensure continued equality ....]
>            [else
>             .... Reject: return the same lists ....])]
>     [(= r 1)
>      .... Accept: `rest' each list and ensure continued equality ....]
>     .... Other random operations ....)
>
> I left off the acceptance test for `rest' because P(n-1) > P(n) ==>
> P(n-1)/P(n) > 1, so the test would always pass. (This is because the
> geometric distribution's pdf is monotone. If I had chosen a Poisson
> distribution, which has a bump in the middle, I'd have to do the acceptance
> test for both `cons' and `rest'.)
>
> I instrument the test loop to record list lengths, verify that their mean
> is about 19, and do some density plots vs. the geometric distribution. It
> all looks good: the randomized bisimulation test does indeed operate on
> lists whose length is distributed Geometric(0.05), so the overall average
> test runtime is O(N), and it still tests them as they're operated on over
> time.
>
> Neil ⊥
> ____________________
>  Racket Users list:
>  http://lists.racket-lang.org/**users <http://lists.racket-lang.org/users>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/users/archive/attachments/20130104/eb8f2ad1/attachment.html>

Posted on the users mailing list.