[plt-scheme] Scheme implementation of Fisher-Yates shuffle

From: Prabhakar Ragde (plragde at uwaterloo.ca)
Date: Mon Aug 10 09:16:38 EDT 2009

Phil Bewig writes:

 > Joe Marshall provides this method of shuffling
 > a list by partitioning it into two pieces deterministically, 
shuffling them
 > recursively, then merging them randomly:

 > (define (shuffle xs)
 >   (if (or (null? xs) (null? (cdr xs))) xs
 >       (let split ((xs xs) (odd '()) (even '()))
 >         (if (pair? xs)
 >             (split (cdr xs) (cons (car xs) even) odd)
 >             (let merge ((odd (shuffle odd)) (even (shuffle even)))
 >               (cond ((null? odd) even)
 >                     ((null? even) odd)
 >                     ((zero? (random 2)) (cons (car odd) (merge (cdr 
odd) even)))
 >                     (else (cons (car even) (merge odd (cdr even))))))))))

This code will not give a distribution where all permutations of the
list are equally likely. The easiest way to see this is to consider
the input '(0 1 2). 1 will be in the odd list by itself, and it has
probability 1/2 of being first in the result, instead of the required
1/3.

 >  Al Petrofsky proposes this somewhat faster code that first 
partitions the
 > list randomly, then randomly merges them:

 > (define (shuffle xs)
 >   (let shuffle ((xs xs) (acc '()))
 >     (if (null? xs) acc
 >         (if (null? (cdr xs)) (cons (car xs) acc)
 >             (let split ((xs xs) (x1 '()) (x2 '()))
 >               (if (null? xs)
 >                   (if (null? x1)
 >                       (split x2 '() '())
 >                       (shuffle x1 (shuffle x2 acc)))
 >                   (if (zero? (random 2))
 >                       (split (cdr xs) (cons (car xs) x1) x2)
 >                       (split (cdr xs) x1 (cons (car xs) x2)))))))))

This does give the right distribution, but the running time is not
O(n log n) in the worst case. A similar analysis to that of quicksort on a
random permutation will show that it is O(n log n) on average (or with
high probability) but O(n^2) in the worst case.

Can we ensure the right distribution and O(n log n) worst-case running
time? Yes, by fixing Joe's code so that in a random merge, the
probability of taking from one of the two lists is proportional to the
length of that list.

(define (shuffle xs)
   (if (or (null? xs) (null? (cdr xs))) xs
       (let split ((xs xs) (odd '()) (even '()) (len 0))
         (if (pair? xs)
             (split (cdr xs) (cons (car xs) even) odd (add1 len))
             (let merge ((odd (shuffle odd)) (oddl (length odd)) (even 
(shuffle even)) (len len))
               (cond ((= len oddl) odd)
                     ((zero? oddl) even)
                     ((< (random len) oddl) (cons (car odd) (merge (cdr 
odd) (sub1 oddl) even (sub1 len))))
                     (else (cons (car even) (merge odd oddl (cdr even) 
(sub1 len))))))))))

If desired, we can incorporate the accumulator idea from Al's code. --PR


Posted on the users mailing list.