[plt-scheme] Unreasonably slow

From: Eli Barzilay (eli at barzilay.org)
Date: Wed Oct 29 14:31:21 EDT 2008

I've had a semi-brief look at the problem.  First of all, the main
reason I dislike srfi-41 is the extra syntactic layer that can get in
the way of using it properly -- you're basically writing code with
`stream-' prefixes where you want lazy lists, which can make the code
too obscure to follow.  [This is very much like writing defmacro-style
macros without quasiquotations.]  In the stream code that Alex wrote,
you can sort of see that there is a suspicious point where he uses

  (stream-append (stream prime) ...stuff..)

instead of

  (stream-cons prime ...stuff...)

Jens did notice two places where this happened, and as an experienced
(strict) Scheme hacker he immediately got rid of them.  But if you
translate the code to Lazy Scheme, which in this case is mostly done
by removing `stream-' prefixes, you get this (this is without any
other modifications) as body of Alex's loop:

      (let loop ([strm numbers] [primes null])
        (define prime (car strm))
        (append (list prime)
                (loop (filter-stream prime (cdr strm))
                      (append primes (list prime)))))

Removing the `append's as Jens did:

      (let loop ([strm numbers] [primes null])
        (define prime (car strm))
        (cons prime
              (loop (filter-stream prime (cdr strm))
                    (cons prime primes))))

But now it's easy to see that `primes' is never used -- so you can get
the same result with:

      (let loop ([strm numbers])
        (define prime (car strm))
        (cons prime (loop (filter-stream prime (cdr strm)))))

In any case, using Lazy Scheme was a little slower than using srfi-41
(you get more laziness overhead because everything is lazy, not just
the lists), and translating the resulting much smaller program back to
using srfi-41 was running at roughly the same speed.  (On my machine
this was around 125s for all of the srfi-41 programs, and 160s for
lazy scheme programs.)

I think that the main problem here is that mzscheme is not really
tailored to optimize such code well enough -- and that's unlikely to
improve much because it is not a lazy language.  It might be possible
to find some optimizations at the lazy scheme level, but I don't know
if this will ever get to a point where it can compete with strict
code.  (Ignoring cases where you use laziness to avoid expensive
computations implicitly.)

On Oct 29, Matthias Felleisen wrote:
> Problem solved. Alex is using Python's itertools library, which
> brought quasi-functional programming to Python -- in the form of a C
> library. If you look at his code (forwarded w/o permission), you'll
> see that the Python contribution is a while-true loop, the rest are
> two or three calls to C functions. I suspect that an equivalent
> formulation in PLT would perform at the same level.

So then I tried an iterator-based approach.  I used explicit state to
avoid even the cost of capturing continuations.  To get this I needed
three simple "library" functions:

  (define (count n)
    (let ([n (sub1 n)])
      (lambda () (set! n (add1 n)) n)))
  (define ((filter pred iter))
    (let loop ([x (iter)])
      (if (pred x) x (loop (iter)))))
  (define (iter-ref iter n)
    (for ([i (in-range n)]) (iter))

and then I could translate Alex's Python code to this, uh,

  (define primes
    (let loop ([ps (count 2)])
      (lambda ()
        (let ([prime (ps)])
          (set! ps (filter (lambda (x) (not (zero? (modulo x prime))))
  (iter-ref primes 10000)

The result was computed in 5s, about 25 times faster than using
streams, which I think is roughly the same as the Python version,
which confirms Matthias's reply.

          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                  http://www.barzilay.org/                 Maze is Life!

Posted on the users mailing list.