[racket-dev] racket/stream

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Sun Mar 27 12:04:18 EDT 2011

At Fri, 25 Mar 2011 09:50:03 -0400, Eli Barzilay wrote:
> > > > I'm not convinced that `racket/sequence' is useless. Here's a
> > > > list of the current functions (with the current names):
> > 
> > I threw out `sequence-cons', `sequence-first', and `sequence-rest',
> > and I kept the rest. The various functions are all unnecessary on
> > some level, just like `map', `for-each', and `compose' are
> > unnecessary on some level But the function forms and/or alternate
> > names look handy for some contexts.
> 
> I think that this is redundatly complicating things especially when
> dealing with functions that *produce* sequences.  Assuming the
> interface is completely uniform, then for each `foo' we have `foo',
> `stream-foo', `in-foo' -- and the extra one is `sequence-foo'. 

There's a naming-convention clash between

 * `in-...' for sequences (sometimes streams) especially intended for
   direct use in `for' forms, and

 * operations on streams and sequences, which would normally start
   `stream-' or `sequence-'.

I see no problem with having both `in-sequences' and `sequence-append',
for example, if it makes functionality easier to find.

> The
> reason I don't like it is that the `sequence-foo' function simply
> converts non-stream inputs ("producers") to streams and then use the
> `stream-foo' on that.  As a nice point that supports it, the first
> thing I looked at was `sequence-tail' and I saw a documentation bug:
> 
>   In case extracting elements from `s' involves a side effect, they
>   will not be extracted until the first element is extracted from the
>   resulting sequence.
> 
> But the code looked like it does the side effects as needed since it's
> coerced into a lazy stream.  (BTW, also the first "extracted" should
> be "executed".)

Ok, I see that sequence operations that produce sequences shouldn't be
implemented via streams: For consistency, accessing an element from a
stream produced by `stream-map' should have the same sort of side
effects (if any) as accessing the original sequence. Converting to a
stream creates different behavior, since results are cached to ensure
stream-like behavior.

I've fixed the implementation of `sequence-tail', `sequence-map',
`sequence-filter', and `sequence-add-between'. The repair involved
adding `sequence-generate*', which starts a sequence in a way that is
consistent with using state but doesn't use state if the sequence
itself isn't stateful.

> So how about doing this:
> 
> * `empty-sequence', `sequence-map', `sequence-tail', `sequence-filter'
>     Turned to `in-empty', `in-map', `in-tail', `in-filter', and moved
>     with the others in `racket/private/for'

I can imagine adding some of those `in-' variants eventually, but I
think we still want the `sequence-' names anyway. I haven't added any
`in-' variants, so far.

> * `sequence-add-between'
>     Should turn to `in-add-between', but that seems like an arbitrary
>     function from `racket/list' that is used in the sequence interface.

Left in, on the grounds of history and that it seems plausibly useful.

> * `sequence->list', `sequence-length', `sequence-ref',
>   `sequence-andmap', `sequence-ormap', `sequence-for-each',
>   `sequence-fold', `sequence-count'
>     Left as is, in `racket/sequence'.  Possibly define them as macros
>     to avoid performance surprises.

Left as-is.

> * `sequence-append'
>     Removed, possibly adding `in-append' as an alias for
>     `in-sequences'

Left as-is, on the grounds that `sequence-' names are reasonable and
potentially helpful.

> > The new `racket/stream' library provides stream versions of the
> > functions, plus `stream-cons' (based on SRFI-41), `stream-empty?',
> > `stream-first', and `stream-rest'.
> 
> Is there any reason that you've used the srfi code as-is? 

I considered replacing it, but I left it alone because improving the
implementation of lazy streams wasn't my immediate goal. Please feel
free to improve it.

> > Meanwhile, `in-range' now produces a stream, the `for' variants support
> > streams in a slightly more direct way, etc.
> 
> For simple things like ranges I wanted to make it possible to
> represent streams without caching values etc.  Again, if it uses the
> usual promises, then it can just use `delay/name' (which has a
> minimal-ish cost over thunks, just the indirections when dispatching
> on a struct property and accessing its field).
> 
> But actually for ranges I prefer using a state-based produce in some
> form, since it's much faster.

I see how a state-based range can be faster if everything is
state-based, but I see no performance benefit for a range producer that
plugs into a more general stream/sequence protocol. Meanwhile, using
state has the obvious drawbacks (e.g., interacts badly with
continuations). Note that in revising `racket/sequence', I removed all
state (i.e., no state in any of the functions, except for any state
within a given sequence).



Posted on the dev mailing list.