[racket-dev] racket/stream

From: Eli Barzilay (eli at barzilay.org)
Date: Thu Mar 17 16:13:26 EDT 2011

Yesterday, Matthew Flatt wrote:
> At Wed, 16 Mar 2011 13:21:28 -0400, Eli Barzilay wrote:
> > >    Generators include the result of `generator' and input
> > >    ports. (The `generator?' predicate currently returns false
> > >    for input ports, but that could change.)
> > 
> > I'd rather see these generators be only thunks (explained below)
> > -- so all thunks can be used as generators.  Possibly, "generator"
> > becomes just a name for a thunk used to retreive values, usually
> > based on some state -- but since this is confusing, I'll use
> > "producer" here.
> I see your point about "generator", and so I agree that we should
> use a different term. I'm not too happy with the term "producer",
> but I don't yet have a better suggestion.

(I don't have a strong opinion about the name, I just prefer it to be
distinct from `generator'.)

> > Using producer thunks looks to me better than generators (in the
> > `racket/generator' sense) because some of them are not things that you
> > could iterate over straightforwardly -- for example,
> > 
> >   (generator () (yield (yield 1)))
> > 
> > will expect a value on the second call (or with an input value, after
> > your commit today, the first will require a value too).  It also looks
> > to me better than an input port, which naturally resolves the question
> > of how to read from the port -- you just turn it into a producer with
> > 
> >   (lambda () (some-read in))
> I don't think that 0-arity procedures should be the only
> implementation of producers. For example, it could be useful for the
> result of `in-producer' to identify itself as a stateful sequence.

I like the lack of such identification -- it makes it easy to just use
something like (lambda () 1) as the producer equivalent of an infinite
stream of 1s.

> > > Based on the above terminology, here's the programming part of the
> > > proposal:
> > > 
> > >  1. Rename `racket/stream' and its bindings back to "sequence".
> > > 
> > >  2. Introduce a `racket/stream' library that provides stream
> > >     versions (typically lazy) of the current operations, including
> > >     the expected lazy `stream-cons' form.
> > 
> > So what we talked about is similar, except that #2 replaces
> > `racket/stream' in place.  `racket/sequence' doesn't happen since
> > there's no point in that (and AFAICT, people really want the stream
> > operations, not the sequence ones).
> I'm not convinced that `racket/sequence' is useless. Here's a list of
> the current functions (with the current names):

OK, I'll go over this list, but change "stream" to "sequence" to make
it more concrete:

>  empty-sequence

I don't see any need for that -- you can use any of '(), "", `null',
etc instead.  (It would be needed for an implementation of a new type,
but one of the points of sequences is to overload the functionality
for different input types.)

>  sequence-append

That's identical in its functionality to `in-sequences' when used as a
value only.

>  sequence-map
>  sequence-filter

I'd like to see these as an `in-map' and `in-filter' instead.  (Which
means that they should be implemented more efficiently as macros too.)

>  sequence-for-each

I don't see any point in this over writing an explicit `for'.

>  sequence-fold

This might be fine, perhaps the syntax of `for/fold' is hard to
remember that it will help.

>  sequence->list
>  sequence-length
>  sequence-andmap
>  sequence-ormap
>  sequence-count

These are all trivial uses of for loops.

>  sequence-first
>  sequence-ref

I don't like these -- they encourage an illusion that it's easy to get
to some random item in the sequence.  What I suggest instead of these
-- `producer-first' and `producer-ref' (or whatever name is used) --
is much better in clarifying what happens.  For example, it's not
clear enough how

  (define foo (in-lines some-port))
  (sequence-first foo)

works in terms of side effects, but with a generator it's clearly
expected that `foo' changes.  The same goes for your suggestion for
special casing `sequence-append' when all of its inputs are streams.
I strongly believe that this is a decisions that should be made
explicitly, not be left for some implicit guess.

>  sequence-rest
>  sequence-tail
>  sequence-cons

These are even worse -- besides the implicit answer to what happens
with state, it's dangerous as an opration that seems reasonable to
use.  I don't think that it should be added in a similar way that we
don't have a `snoc' for lists or a `vector-rest' -- but this would be
even more dangerous in that it hides the costs more, since there are
diferent sequence types.

>  sequence-add-between

This is very redundant -- unless the goal is to make this into some
library that insists on generalizes all ordered things as something
that has a list-like interface.

          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

Posted on the dev mailing list.