[racket] Looking for feedback on code style
Will M. Farr wrote:
> Looks like Phil beat me to it, but here's some code that finds the
> n-th element of a list in O(N) time. The algorithm is similar to
> quicksort, but you don't sort both the sub-lists: partition the list
> into elements less than and greater or equal to a pivot. By counting
> the number of elements less than the pivot, you know whether the one
> you're looking for will be found in that set, or in the greater or
> equal set. Repeat the search in the appropriate set. The partition
> takes O(N) time. On average, it cuts the number of elements searched
> in half each time, so we have
>
> total time = N + N/2 + N/4 + ... = 2*N as N --> infty
Not quite. Consider the case of the median. If the pivot is the smallest
element, the recursion is on N-1 elements. If the pivot is the second
smallest, on N-2; and so on. If the pivot is the median, you can stop
(not all provided code does this, but it doesn't matter much for the
analysis). But the situation is symmetric on the other side: if the
pivot is the largest element, the recursion is on N-1 elements, and so
on down. So the expected number of elements the recursion is performed
on is about 3N/4.
And then what? Are we justified in saying T(N) = T(3N/4) + O(N)? Really,
we have T(N) = T(X) + O(N), where X is a random variable uniformly
distributed over [0..N-1]. With a suitable induction hypothesis, we can
dig ourselves out of this, but it's not simple, unfortunately. --PR