[plt-scheme] Unreasonably slow

From: Jos Koot (jos.koot at telefonica.net)
Date: Wed Oct 29 15:23:53 EDT 2008

The problem is that srfi 41 requires streams to be a distinct data type.
When you allow streams just to be promises (as in scheme/promise), things 
run much faster. I have a non conformant implementation of srfi 41 that does 
not make the distinction and therefore is over two times faster.
On this list I already posted an idea how to allow a distinct data type for 
streams without loosing on performance, but the post did not (yet) come 
through. May be to big. The problem of the srfi 41 document and its standard 
implementation is that:
1: it wants stream-null to be a unique object (which is not possible)
2: it wants streams to be a distinct datatype (in particular distinct from 
the promises as automatically available in mzscheme and scheme/base)
As currently available in the PLT Scheme languages, the streams of srfi 41 
are a rather inefficient extra layer around a duplication of the 
implementation of srfi 45.
Jos

----- Original Message ----- 
From: "Eli Barzilay" <eli at barzilay.org>
To: "Matthias Felleisen" <matthias at ccs.neu.edu>
Cc: "Scheme PLT" <plt-scheme at list.cs.brown.edu>
Sent: Wednesday, October 29, 2008 7:31 PM
Subject: Re: [plt-scheme] Unreasonably slow


> I've had a semi-brief look at the problem.  First of all, the main
> reason I dislike srfi-41 is the extra syntactic layer that can get in
> the way of using it properly -- you're basically writing code with
> `stream-' prefixes where you want lazy lists, which can make the code
> too obscure to follow.  [This is very much like writing defmacro-style
> macros without quasiquotations.]  In the stream code that Alex wrote,
> you can sort of see that there is a suspicious point where he uses
>
>  (stream-append (stream prime) ...stuff..)
>
> instead of
>
>  (stream-cons prime ...stuff...)
>
> Jens did notice two places where this happened, and as an experienced
> (strict) Scheme hacker he immediately got rid of them.  But if you
> translate the code to Lazy Scheme, which in this case is mostly done
> by removing `stream-' prefixes, you get this (this is without any
> other modifications) as body of Alex's loop:
>
>      (let loop ([strm numbers] [primes null])
>        (define prime (car strm))
>        (append (list prime)
>                (loop (filter-stream prime (cdr strm))
>                      (append primes (list prime)))))
>
> Removing the `append's as Jens did:
>
>      (let loop ([strm numbers] [primes null])
>        (define prime (car strm))
>        (cons prime
>              (loop (filter-stream prime (cdr strm))
>                    (cons prime primes))))
>
> But now it's easy to see that `primes' is never used -- so you can get
> the same result with:
>
>      (let loop ([strm numbers])
>        (define prime (car strm))
>        (cons prime (loop (filter-stream prime (cdr strm)))))
>
> In any case, using Lazy Scheme was a little slower than using srfi-41
> (you get more laziness overhead because everything is lazy, not just
> the lists), and translating the resulting much smaller program back to
> using srfi-41 was running at roughly the same speed.  (On my machine
> this was around 125s for all of the srfi-41 programs, and 160s for
> lazy scheme programs.)
>
> I think that the main problem here is that mzscheme is not really
> tailored to optimize such code well enough -- and that's unlikely to
> improve much because it is not a lazy language.  It might be possible
> to find some optimizations at the lazy scheme level, but I don't know
> if this will ever get to a point where it can compete with strict
> code.  (Ignoring cases where you use laziness to avoid expensive
> computations implicitly.)
>
>
> On Oct 29, Matthias Felleisen wrote:
>>
>> Problem solved. Alex is using Python's itertools library, which
>> brought quasi-functional programming to Python -- in the form of a C
>> library. If you look at his code (forwarded w/o permission), you'll
>> see that the Python contribution is a while-true loop, the rest are
>> two or three calls to C functions. I suspect that an equivalent
>> formulation in PLT would perform at the same level.
>
> So then I tried an iterator-based approach.  I used explicit state to
> avoid even the cost of capturing continuations.  To get this I needed
> three simple "library" functions:
>
>  (define (count n)
>    (let ([n (sub1 n)])
>      (lambda () (set! n (add1 n)) n)))
>
>  (define ((filter pred iter))
>    (let loop ([x (iter)])
>      (if (pred x) x (loop (iter)))))
>
>  (define (iter-ref iter n)
>    (for ([i (in-range n)]) (iter))
>    (iter))
>
> and then I could translate Alex's Python code to this, uh,
> "framework":
>
>  (define primes
>    (let loop ([ps (count 2)])
>      (lambda ()
>        (let ([prime (ps)])
>          (set! ps (filter (lambda (x) (not (zero? (modulo x prime))))
>                           ps))
>          prime))))
>  (iter-ref primes 10000)
>
> The result was computed in 5s, about 25 times faster than using
> streams, which I think is roughly the same as the Python version,
> which confirms Matthias's reply.
>
> -- 
>          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>                  http://www.barzilay.org/                 Maze is Life!
> _________________________________________________
>  For list-related administrative tasks:
>  http://list.cs.brown.edu/mailman/listinfo/plt-scheme
> 



Posted on the users mailing list.