[racket-dev] proposal for moving to packages: repository

From: Eli Barzilay (eli at barzilay.org)
Date: Tue May 21 14:20:33 EDT 2013

Yesterday, Matthew Flatt wrote:
> 
> Concretely, new repositories that are just a subset of the current
> repo would be off-by-one in directory structure compared to a normal
> package. Each package should correspond to a subtree starting from
> the "collects" level, not the parent of "collects". We could massage
> the two views into one, but I'd rather not.

That's really easy to deal with, and doesn't contradict what I
suggested, *but* given:

> To put it another way and overstate a little: I'm trying to get buy-in
> from dev to make the switch to packages wholesale. [...]

And even more, given:

5 hours ago, Matthew Flatt wrote:
> I think we won't get an ideal package split on the first N tries,
> and it will be easier to move files and directories around in one
> repository (using `git mv') instead of among multiple repositories.
> When we finally have mostly the right split, then we can use `git
> filter-branch'.

I think that there's a much simpler and more elegant way to do this,
which is also easier for all developers.  Roughly speaking, it's
flipping what I suggested yesterday and doing it the other way:

  * Keep the repository as-is, no structural changes at all.

  * Keep working on things as usual, including work on the package
    system and everything that is related.

  * As it gets to a workable state, keep a script that will *split*
    the monolithic repo into separate packages.  This script can start
    very simple, for example, a naive thing would be:

      cd "$MAINTREE"
      # -p: don't fail if the package directory already exists
      mkdir -p "$PACKAGES/drracket"
      mv collects/drracket collects/drscheme "$PACKAGES/drracket"

    Everything that deals with packages would start from a fresh main
    repo and an empty package directory, and will construct the
    packages from it.  So, for example, the build will still make each
    package independently, and distribution is still done by
    assembling packages.

  * The main point is related to what you said above: the package
    splittage is determined by the script, so if you find out that
    some file belongs in a different package, or that packages need to
    be combined, or split differently, or whatever -- this is all done
    by just changing the script.

    So you get two birds with a single stone: it's easy to experiment
    freely in the early stages, and it's easy to adjust things when
    the split converges to something that works fine.

  * When everything is working smoothly -- with the main effect being
    a resolution of dependencies, both of existing code and in terms
    of people being aware of them -- at this point it will be a good
    time to switch to separate repos, and since all developers have
    already gotten used to the packages, there is now just the repo
    change, and nothing else -- so it becomes a technical point like
    switching from svn to git, not piled up on the more substantial
    change.

    As a side-effect, the final directory-splitting script can be used
    with git's filter-branch to create the new repos.
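
To make the script idea a bit more concrete, here is a rough sketch of
a table-driven version, where changing the split means editing a list
rather than the code.  The `split_tree' function and the two-column
"package collection" table format are made up for illustration; only
the main-tree and package directories and the drracket/drscheme
example come from the text above.

```shell
# Hypothetical helper: split_tree MAINTREE PACKAGES
# Reads "package collection" pairs from stdin and moves each
# collects/<collection> directory into <PACKAGES>/<package>/.
split_tree() {
  maintree=$1
  packages=$2
  while read -r pkg coll; do
    # Skip blank lines and "#" comment lines in the table.
    case "$pkg" in ''|'#'*) continue ;; esac
    mkdir -p "$packages/$pkg" || return 1
    mv "$maintree/collects/$coll" "$packages/$pkg/" || return 1
  done
}
```

It could be run as `split_tree "$MAINTREE" "$PACKAGES" < split.map',
where split.map is an assumed file with lines such as "drracket
drscheme".  Moving a collection to a different package, or merging two
packages, is then a one-line change to that file.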

I think that this offers the best in terms of being flexible as needed
while work is in progress, and separating the changes that people need
to adjust to, which should make the whole process more comfortable.

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

Posted on the dev mailing list.