[racket-dev] racket/stream

From: Eli Barzilay (eli at barzilay.org)
Date: Fri Mar 25 09:50:03 EDT 2011

A week ago, Matthew Flatt wrote:
> At Thu, 17 Mar 2011 16:13:26 -0400, Eli Barzilay wrote:
> > Yesterday, Matthew Flatt wrote:
> > > I see your point about "generator", and so I agree that we
> > > should use a different term. I'm not too happy with the term
> > > "producer", but I don't yet have a better suggestion.
> 
> For now, I've left the generator and producer support as it was ---
> no particular emphasis or support for stateful sequences, and so no
> particular need for terminology.

OK.  The main point I saw in them is that state-based `in-range'
producers were by far the fastest.


> > > I don't think that 0-arity procedures should be the only
> > > implementation of producers. [...]
> > 
> > I like the lack of such identification -- it makes it easy to just
> > use something like (lambda () 1) as the producer equivalent of an
> > infinite stream of 1s.
> 
> I never meant to suggest that a thunk couldn't be a producer ---
> only that producers might have representations other than
> thunks. Again, it's moot for now.

(I just didn't see a point in that if it implies some new struct that
holds a port and a reader function, when doing it as a thunk avoids
that complication.)


> > > I'm not convinced that `racket/sequence' is useless. Here's a
> > > list of the current functions (with the current names):
> 
> I threw out `sequence-cons', `sequence-first', and `sequence-rest',
> and I kept the rest. The various functions are all unnecessary on
> some level, just like `map', `for-each', and `compose' are
> unnecessary on some level But the function forms and/or alternate
> names look handy for some contexts.

I think that this is redundatly complicating things especially when
dealing with functions that *produce* sequences.  Assuming the
interface is completely uniform, then for each `foo' we have `foo',
`stream-foo', `in-foo' -- and the extra one is `sequence-foo'.  The
reason I don't like it is that the `sequence-foo' function simply
converts non-stream inputs ("producers") to streams and then use the
`stream-foo' on that.  As a nice point that supports it, the first
thing I looked at was `sequence-tail' and I saw a documentation bug:

  In case extracting elements from `s' involves a side effect, they
  will not be extracted until the first element is extracted from the
  resulting sequence.

But the code looked like it does the side effects as needed since it's
coerced into a lazy stream.  (BTW, also the first "extracted" should
be "executed".)

So how about doing this:

* `empty-sequence', `sequence-map', `sequence-tail', `sequence-filter'
    Turned to `in-empty', `in-map', `in-tail', `in-filter', and moved
    with the others in `racket/private/for'

* `sequence-add-between'
    Should turn to `in-add-between', but that seems like an arbitrary
    function from `racket/list' that is used in the sequence interface.

* `sequence->list', `sequence-length', `sequence-ref',
  `sequence-andmap', `sequence-ormap', `sequence-for-each',
  `sequence-fold', `sequence-count'
    Left as is, in `racket/sequence'.  Possibly define them as macros
    to avoid performance surprises.

* `sequence-append'
    Removed, possibly adding `in-append' as an alias for
    `in-sequences'


> The new `racket/stream' library provides stream versions of the
> functions, plus `stream-cons' (based on SRFI-41), `stream-empty?',
> `stream-first', and `stream-rest'.

Is there any reason that you've used the srfi code as-is?  It uses its
own `lazy' implementation that is incompatible with the racket one.
IIRC, Phil insists on that so streams are a completely disjoint new
type.  But OTOH, we have a strong point to using plain `lazy'-based
streams: it will allow sending streams to lazy racket where they can
be used as lists, and it will allow constructing lists in lazy racket
and using them as streams.

In addition, the racket `lazy' is much more careful about dealing with
exceptions.  This does imply a higher cost, but that's a separate
problem.


> Meanwhile, `in-range' now produces a stream, the `for' variants support
> streams in a slightly more direct way, etc.

For simple things like ranges I wanted to make it possible to
represent streams without caching values etc.  Again, if it uses the
usual promises, then it can just use `delay/name' (which has a
minimal-ish cost over thunks, just the indirections when dispatching
on a struct property and accessing its field).

But actually for ranges I prefer using a state-based produce in some
form, since it's much faster.

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!


Posted on the dev mailing list.