[racket-dev] racket/stream
At Fri, 25 Mar 2011 09:50:03 -0400, Eli Barzilay wrote:
> > > > I'm not convinced that `racket/sequence' is useless. Here's a
> > > > list of the current functions (with the current names):
> >
> > I threw out `sequence-cons', `sequence-first', and `sequence-rest',
> > and I kept the rest. The various functions are all unnecessary on
> > some level, just like `map', `for-each', and `compose' are
> > unnecessary on some level But the function forms and/or alternate
> > names look handy for some contexts.
>
> I think that this is redundatly complicating things especially when
> dealing with functions that *produce* sequences. Assuming the
> interface is completely uniform, then for each `foo' we have `foo',
> `stream-foo', `in-foo' -- and the extra one is `sequence-foo'.
There's a naming-convention clash between
* `in-...' for sequences (sometimes streams) especially intended for
direct use in `for' forms, and
* operations on streams and sequences, which would normally start
`stream-' or `sequence-'.
I see no problem with having both `in-sequences' and `sequence-append',
for example, if it makes functionality easier to find.
> The
> reason I don't like it is that the `sequence-foo' function simply
> converts non-stream inputs ("producers") to streams and then use the
> `stream-foo' on that. As a nice point that supports it, the first
> thing I looked at was `sequence-tail' and I saw a documentation bug:
>
> In case extracting elements from `s' involves a side effect, they
> will not be extracted until the first element is extracted from the
> resulting sequence.
>
> But the code looked like it does the side effects as needed since it's
> coerced into a lazy stream. (BTW, also the first "extracted" should
> be "executed".)
Ok, I see that sequence operations that produce sequences shouldn't be
implemented via streams: For consistency, accessing an element from a
stream produced by `stream-map' should have the same sort of side
effects (if any) as accessing the original sequence. Converting to a
stream creates different behavior, since results are cached to ensure
stream-like behavior.
I've fixed the implementation of `sequence-tail', `sequence-map',
`sequence-filter', and `sequence-add-between'. The repair involved
adding `sequence-generate*', which starts a sequence in a way that is
consistent with using state but doesn't use state if the sequence
itself isn't stateful.
> So how about doing this:
>
> * `empty-sequence', `sequence-map', `sequence-tail', `sequence-filter'
> Turned to `in-empty', `in-map', `in-tail', `in-filter', and moved
> with the others in `racket/private/for'
I can imagine adding some of those `in-' variants eventually, but I
think we still want the `sequence-' names anyway. I haven't added any
`in-' variants, so far.
> * `sequence-add-between'
> Should turn to `in-add-between', but that seems like an arbitrary
> function from `racket/list' that is used in the sequence interface.
Left in, on the grounds of history and that it seems plausibly useful.
> * `sequence->list', `sequence-length', `sequence-ref',
> `sequence-andmap', `sequence-ormap', `sequence-for-each',
> `sequence-fold', `sequence-count'
> Left as is, in `racket/sequence'. Possibly define them as macros
> to avoid performance surprises.
Left as-is.
> * `sequence-append'
> Removed, possibly adding `in-append' as an alias for
> `in-sequences'
Left as-is, on the grounds that `sequence-' names are reasonable and
potentially helpful.
> > The new `racket/stream' library provides stream versions of the
> > functions, plus `stream-cons' (based on SRFI-41), `stream-empty?',
> > `stream-first', and `stream-rest'.
>
> Is there any reason that you've used the srfi code as-is?
I considered replacing it, but I left it alone because improving the
implementation of lazy streams wasn't my immediate goal. Please feel
free to improve it.
> > Meanwhile, `in-range' now produces a stream, the `for' variants support
> > streams in a slightly more direct way, etc.
>
> For simple things like ranges I wanted to make it possible to
> represent streams without caching values etc. Again, if it uses the
> usual promises, then it can just use `delay/name' (which has a
> minimal-ish cost over thunks, just the indirections when dispatching
> on a struct property and accessing its field).
>
> But actually for ranges I prefer using a state-based produce in some
> form, since it's much faster.
I see how a state-based range can be faster if everything is
state-based, but I see no performance benefit for a range producer that
plugs into a more general stream/sequence protocol. Meanwhile, using
state has the obvious drawbacks (e.g., interacts badly with
continuations). Note that in revising `racket/sequence', I removed all
state (i.e., no state in any of the functions, except for any state
within a given sequence).