[racket-dev] Generics and data structures

From: Asumu Takikawa (asumu at ccs.neu.edu)
Date: Wed May 9 18:13:55 EDT 2012

Hi all,

Racket currently provides several generic extensible data structure APIs
such as dictionaries, sequences, streams, and so on. Unfortunately, each
data structure currently has its own extension API, but no consistent
convention exists: some APIs use lists of methods, some use vectors,
etc.  Furthermore, these APIs are usually rather low-level (e.g.,
dependent on ordering in the method table).

Vincent and I have been working on a unified user-friendly extension
interface based on unstable/generics by Eli and Jay (which we have moved
to collects/generics). We have a prototype that implements our proposed
interface for prop:dict and prop:ordered-dict with complete backwards
compatibility with their respective extension APIs. That is, we have
modified racket/dict and data/ordered-dict to use generics.

The code is on github here:
  https://github.com/takikawa/racket/tree/generics

We would like to get feedback on what we have so far and if nobody has
any objections, we would like to push what we currently have to master.

Ideally, we would provide similar interfaces for the other generic APIs
in the tree, such as streams and sequences. However, the existing APIs
rely on different representations for method tables from the one used by
unstable/generics, specifically a vector of methods. This makes
backwards compatibility complicated and we're not sure how to proceed.

Some specific examples:
  * `prop:sequence`: Instead of a vector of methods, this struct
    property takes a procedure which takes a struct instance and
    produces a sequence (not a vector of methods).

  * `prop:equal+hash-code` uses a list of methods instead of a vector.

  * `prop:dict/contract` is given both a method vector and a vector
    of sub-contracts that are used to assemble the method contracts.
    The generics library currently uses a single vector.

To apply our proposal to these cases, we could change these struct
properties to use the generics library's preferred representation.
However, this would break backwards compatibility with code that
currently uses these struct properties, which makes this a non-starter.

Does anyone have any suggestions on how to proceed without breaking
backwards compatibility?

Streams present a slightly different issue. `prop:stream` already
represents its methods with a vector. However, it is used heavily in the
implementation of racket/base, so we cannot re-write it to use generics
(which depends on racket/base).

Assuming we can't restructure racket/private/for to break the circular
dependency, another solution would be to allow the generics library to
create a generic interface attached to an existing struct property[1],
given the struct property accessor.  That would require exposing the
struct property accessor from the internals of racket/base.  Would that
be an acceptable solution?

[1]: currently, the generics library needs to be the one defining
     the struct property.

Any thoughts or suggestions?

Cheers,
Asumu & Vincent

Posted on the dev mailing list.