[plt-dev] scheme/set

From: Carl Eastlund (cce at ccs.neu.edu)
Date: Tue Feb 23 18:22:39 EST 2010

On Tue, Feb 23, 2010 at 3:20 PM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
>
> I think Java uses the `make-hash-set' strategy a lot, and I'm pretty
> sure the following has happened to me on more than one occasion:
>
>  * I've wanted a Foo.
>
>  * I discover that Foo is an interface with specific implementations
>   XFoo, YFoo, and ZFoo.
>
>  * It turns out that any of X, Y, or Z would work for my simple
>   purposes, but I have to learn about them and pick one.
>
>  * I'd really rather just say `new Foo' and let someone else pick a
>   good default implementation for me.
>
> In contrast to Java, nothing prevents us from having `make-set' (the
> equivalent of `new Foo') while later making `set?' an interface-like
> predicate, should we eventually decide that the generality is
> worthwhile.
>
> Along similar lines, we could embed "equal" in the name `make-set', but
> `equal?' is such a good default for the equivalence relation that it's
> nicer in practice to just call it `make-set'.

In retrospect, I agree with this.  One way or another, we should have
a make-set, and there's no way to do that without having a default
implementation.  Objection retracted.

> Does anyone use dictionaries other than hash tables? So far,
> `scheme/dict' doesn't seem to be worth even the day or two I spent on
> it.

I think it's absolutely worth it.  First of all, as we've seen in this
thread, several people have other task-specific dictionaries.  Second
of all, it's a good building block for adding other general
dictionaries.  Sure, we don't have them now; scheme/dict would
probably get more use if we did.  But if we didn't have scheme/dict,
it wouldn't be as appealing to add new dictionary types.  There's a
chicken-and-egg problem here, and I'm glad you picked one side of the
equation and just went with it.

> The sequence generalization, in contrast, has definitely paid off. I
> think it's because we so often need to combine different kinds of
> sequences, such as iterating over a list and natural numbers in
> parallel; otherwise, we'd just use `map', `vector-map', etc. I don't
> see lots of applications that need to deal with two different
> representations of sets at the same time, though.

Huh.  I have lots of representations of sets and dictionaries all the
time.  A lot of the time I wind up using a sequence abstraction
instead, because we didn't have sets until now, or coming up with a
domain-specific, impoverished set interface, but I use sets all over
the place.  And I have to convert between lists and sets, or between
sets and dictionaries, or several other operations that would be
unnecessary if we had a general interface that, for instance, let me
treat lists as sets in the first place.

So I definitely think scheme/dict and scheme/set will both pay off
over time, just as scheme/sequence has.

>> Similarly for set-eq? and set-eqv?; we should have hash-set-eq? and
>> hash-set-eqv?, because these predicates don't mean anything for "sets"
>> in general.
>
> It seems like the notion of equality used by a set is relevant to
> operations on the set. What does it mean to union two sets that have
> different notions of equivalent elements?

The union operation certainly needs to document this; there are
several options that could be seen as sane: always create a union by
adding to the first argument, or pick an arbitrary argument and warn
the user that all must be acceptable, or enforce that all arguments
must have the same representation and raise a contract error
otherwise.

> Again: I'm not going to stand in the way of anyone who wants to make
> the `scheme/set' library more generalizable and extensible. But please
> don't take away `make-set'.

Now I agree.

--Carl


Posted on the dev mailing list.