[racket] generator performance

From: Greg Hendershott (greghendershott at gmail.com)
Date: Tue Sep 18 12:52:22 EDT 2012

> I agree with others that the cost of capturing continuations is the
> culprit in this case. The run-time system has support for faster
> continuations that work in constrained settings (currently used to
> implement futures), and it might be possible to make those
> continuations kick in work for a typical generator, but I'm not sure.

IIUC continuations are also the foundation for Racket threads.

Which got me wondering. The following toy/naive implementation of a
generator using threads, seems about 4 or 5 times faster than the real
one.

However a simple producer function blows the doors off both.

#lang racket

(define N 100000)

(require racket/generator)
(define real-generator
  (generator
   ()
   (for ([i (in-range N)])
     (yield i))))

;; TO-DO: macro-ize.
(require racket/async-channel)
(define thread-generator
  (let* ([ch (make-channel)]
         [state 'fresh]
         [yield (lambda (x) (channel-put ch x))])
    (lambda ()
      (cond [(eq? state 'fresh)
             (set! state 'running)
             (thread
              (lambda ()
                ;; --> User code here
                (for ([i (in-range N)])
                  (yield i))
                ;; <-- User code here
                (set! state 'expired)
                (channel-put ch (void))))])
      (cond [(eq? state 'expired) (void)]
            [else (channel-get ch)]))))

(define simple-producer
  (let ([n 0])
  (lambda ()
    (cond [(< n N) (begin0 n (set! n (add1 n)))]
          [else (void)]))))

(time
 (length (for/list ([x (in-producer real-generator void?)])
           x)))
=> cpu time: 1775 real time: 1813 gc time: 236
=> 100000

(time
 (length (for/list ([x (in-producer thread-generator void?)])
           x)))
=> cpu time: 402 real time: 402 gc time: 6
=> 100000

(time
 (length (for/list ([x (in-producer simple-producer  void?)])
           x)))
=> cpu time: 7 real time: 8 gc time: 4
=> 100000

I'm sure this is nothing you didn't already realize. It was just
interesting for me to try.

> Meanwhile, it might be interesting to try implementing a `generator'
> form that rewrites its body to implement `yield's that are
> syntactically apparent (after local-expansion), leaving parameter and
> continuations to handle the general case.

That went waaaay over my head. :)

On Tue, Sep 18, 2012 at 8:28 AM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> I don't think the parameter overhead is relevant. In Jay's post, he
> performs 100k parameter accesses, and it takes 16 milliseconds; we can
> reasonably extrapolate 160 msec for 1M accesses. Patrick's generator
> example yields 1M times, and takes 18 seconds --- 100x the parameter
> overhead.
>
> I agree with others that the cost of capturing continuations is the
> culprit in this case. The run-time system has support for faster
> continuations that work in constrained settings (currently used to
> implement futures), and it might be possible to make those
> continuations kick in work for a typical generator, but I'm not sure.
>
> Meanwhile, it might be interesting to try implementing a `generator'
> form that rewrites its body to implement `yield's that are
> syntactically apparent (after local-expansion), leaving parameter and
> continuations to handle the general case.
>
> At Tue, 18 Sep 2012 07:07:51 -0400, Greg Hendershott wrote:
>> This stuff is still over my head; I'm at that "know just enough to be
>> dangerous" stage. With that caveat:
>>
>> Jay's blog post
>> http://jeapostrophe.github.com/blog/2012/07/25/cont-marks2/ mentions
>> that parameters can be >10X slower than dynamic-wind.
>>
>> In generator.rkt I notice an early version that didn't support
>> multiple values. The new version does, keeping a "yielder" proc in a
>> parameter.
>>
>> Could a single-valued generator yield be faster enough?  Or am I
>> misapplying my misunderstanding? :)
>>
>> On Sun, Sep 16, 2012 at 11:15 AM, Eli Barzilay <eli at barzilay.org> wrote:
>> > Two hours ago, Patrick Useldinger wrote:
>> >> Hello
>> >>
>> >> I have a Python program doing some intensive computing and would
>> >> like to port it to Racket for performance reasons.
>> >>
>> >> I use a generator in Python which has a very low overhead. While
>> >> writing some test programs, I seem to have an substantial overhead
>> >> on using generators in Racket:
>> >> [...]
>> >
>> > IIRC, Python optimizes generators in some ways when possible so when
>> > they're used with loops there's no big overhead.  The racket
>> > generators are implemented with continuations and the cost that comes
>> > with that is unfortunately pretty high.  If you need performance, your
>> > best bet is probably to translate things to `for...' loops if
>> > possible.
>> >
>> > --
>> >           ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>> >                     http://barzilay.org/                   Maze is Life!
>> > ____________________
>> >   Racket Users list:
>> >   http://lists.racket-lang.org/users
>> ____________________
>>   Racket Users list:
>>   http://lists.racket-lang.org/users

Posted on the users mailing list.