[racket-dev] proposal: `data' collection

From: Ryan Culpepper (ryanc at ccs.neu.edu)
Date: Fri Jul 2 16:33:54 EDT 2010

On 07/02/2010 11:50 AM, Eli Barzilay wrote:
> On Jul  2, Matthias Felleisen wrote:
> 
> [...]
>> I think this really gets at the questions, 
>>
>>   what is the purpose of a collect? 
>>
>> Even if we ignore the distribution idea, it should concern us that
>> we don't have a concise answer for that. Even Java seems to have
>> one. Why can't we?
> 
> (+1, and +1.)
> 
> 
>> 3. I still do not understand what Eli calls a package. 
>>   -- Is it more than a module and less than a collect? 
>>   -- Is it a bunch of collects? 
>>   -- Is it something you want to distribute? 
> 
> How about this: a package is a bunch of code (= modules) *with* a
> clear (or well defined) purpose, that does not form cycles with any
> other package. --?
> 
> This could be trivially reduced to each module being defined as a
> package -- but the purpose is a key feature here.  It should forbid
> considering a "private" module as a package on its own, and a bunch of
> modules that are all implementing some given system (like "racklog" or
> "htdp") should all be considered a single package.
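To make the no-cycles condition in that definition concrete, here is a minimal sketch in Python (not Racket, and not an existing tool). The package names and dependency tables are hypothetical, chosen only for illustration; the function reports one dependency cycle among packages if there is one.

```python
def find_cycle(deps):
    """Return a list of packages forming a dependency cycle, or None.

    `deps` maps each package name to the packages it depends on.
    """
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    stack = []                     # packages on the current depth-first path

    def visit(pkg):
        color[pkg] = GRAY
        stack.append(pkg)
        for dep in deps.get(pkg, ()):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # `dep` is already on the current path, so we closed a cycle.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        color[pkg] = BLACK
        return None

    for pkg in deps:
        if color.get(pkg, WHITE) == WHITE:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None


# Hypothetical package graphs, for illustration only.
layered = {"htdp": ["racket"], "racklog": ["racket"], "racket": []}
tangled = {"racket": ["swindle"], "swindle": ["racket"]}

print(find_cycle(layered))   # no cycle among these packages
print(find_cycle(tangled))   # a cycle through racket and swindle
```

Under this reading, a set of modules only qualifies as a set of packages if the check finds no cycle; the "clear purpose" half of the definition is a human judgment that no such check captures.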

I like this definition, as far as it goes, but I want to point out one
thing.

Not all purposes are created equal, and there must be a place for
packages of small purposes. For example, I've written two small
libraries that I use whenever I write GUI code. It's not clear that they
should be merged into some other larger package, but it's unlikely that
each one deserves a top-level collection of its own. They currently live
in unstable because there's no better place for them. I hope that this
discussion and the one in August will determine what to do with this
sort of package.

--

Now that we know what a package is, how does it affect the naming and
organization of modules? (E.g., can we have multiple packages that live
within "data/"?)

> This is where it gets kind of fuzzy, so maybe it will help to think of
> it as a kind of a declaration: I'm telling my clients (authors of code
> that uses my "package") that I'll never use their code, and I'm
> telling my providers that they should never use my code.  This is
> obviously dynamic -- it might be that some of the functionality that
> I'm providing is useful enough that it should move up to be part of
> one of my providers (or made into its own new package that
> becomes a provider); and it might be that one of my consumers is
> writing code that would make my life easier.  In both of these cases,
> I think that the *proper* way to tackle the changes is to move code
> between packages (even if it keeps the same owner) -- *not* to create
> the connections and leave the code where it is.
> 
> And to try a concrete example: Swindle has that `echo' thing, that
> just might be so great (it's not) that we'll want it in the core.
> Doing this is easy: just add (require (only-in swindle/misc echo))
> into "racket/main.rkt" and you're done.  But this means that now the
> `swindle' collection is part of the `racket' collection in the sense
> that you cannot install the latter without the former, and that's a
> whole bunch of code that you didn't want in `racket'.  And if we're in
> the happy stage where we have a small distribution with additional
> packages that people choose from -- then we need to choose whether to
> silently make the `racket' collection bigger, or force people to get
> the swindle package because it's needed to resolve dependencies.
> Things become way better if you just take the `echo' code itself, and
> move it into `racket', so no inter-collection (actually inter-package)
> dependencies change.  Even if I'm still its main maintainer, you can
> fix a bug or extend it or change it -- and there's no problem because
> it is the `racket' package that you're maintaining too; whereas if the
> code stays where it is, then you're more likely to ask me to change
> it, which means that inside the swindle code itself I need to wear two
> hats depending on which lines I change.  Or say that we had version
> numbers for packages -- I could keep incrementing the swindle version
> whenever I wanted to, but if the `echo' code stays, it means that an
> increment to swindle affects the racket collection.
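
The installation consequence described here can be sketched as a transitive closure over package dependencies. This is a minimal Python illustration, not a model of the real collections: the dependency tables are hypothetical, and the assumption that `swindle' depends on `racket' is mine.

```python
def install_closure(package, deps):
    """Return the set of packages that must be installed to use `package`."""
    needed = set()
    pending = [package]
    while pending:
        pkg = pending.pop()
        if pkg in needed:
            continue
        needed.add(pkg)
        # Everything this package depends on must be installed as well.
        pending.extend(deps.get(pkg, ()))
    return needed


# Hypothetical dependency tables; the real collections are not modeled here.
before = {"racket": [], "swindle": ["racket"]}
# After adding (require (only-in swindle/misc echo)) to "racket/main.rkt".
after_require = {"racket": ["swindle"], "swindle": ["racket"]}
# After moving the `echo' code itself into `racket' instead.
after_move = {"racket": [], "swindle": ["racket"]}

print(sorted(install_closure("racket", before)))
print(sorted(install_closure("racket", after_require)))
print(sorted(install_closure("racket", after_move)))
```

With the added `require', installing `racket' pulls in `swindle' as well; moving the code leaves the closure of `racket' unchanged.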
> 
> [Hopefully, it's clear why moving the code is much better than keeping
> it where it is -- but there's obviously a cost involved in the move
> itself.  It will use a different language now, it will be documented
> differently (in the `echo' case, very), tests will need to move, and
> the code is likely to be overall revised and reevaluated, and very
> likely modified, possibly even in a way that I (as the swindle author)
> will not want.

There's another cost, that of backwards compatibility. Moving a module
from one package to another means that code that depends on it needs to
update its package dependencies. If moving a module to another package
also means changing the module's name, then the client code has to be
updated too.

> Since you desperately want it, and I'm the one who
> wrote it, this whole work will need to be done by one of us.  Since
> we're both busy with other things, it would be temptingly easy to just
> defer it for later -- just add that `require' in and be done with it.
> IMO, the above problems are real, which makes this easy-way-out
> solution an offence.  As things stand, nobody will see it since
> they're distributed together anyway -- but when we run into the above
> problems and when they get to the point that they *require* a solution
> (eg, swindle gets too broken and is dropped, its copyright changes,
> its author moves to Tibet and becomes a monk), *someone* will need to
> step up and solve them.  That someone will go over the code and move
> it, deal with the documentation, with the tests, fix bugs, and of
> course wash the windows and scrub all the pots that were left after
> the cooking for last night's party.  (And that gets to why I dislike
> unstable, even if someone else will do that laundry.)]

The issue I raised above is about one-third of the reason for unstable.
The other two are the no-compatibility disclaimer and the implicit
invitation to other developers to meddle, edit, and critique. If the
first problem is solved, I can see handling the other two in other ways.

Ryan


Posted on the dev mailing list.