[racket-dev] proposal for moving to packages: repository
An hour and a half ago, Matthew Flatt wrote:
> I used to think that we'd take advantage of the package manager by
> gradually pulling parts out of the Racket git repo and making them
> packages.
(Generally, +1. I'll reply just on the repository point here.)
> This plan has two prominent implications:
>
> * The current git repo's directory structure will change. [...]
I very strongly object to this. While in theory git will follow
everything, this requires doing some more work which most people won't
know about, so a result of all of this is going to be loss of
historical information. So I think that it's much better to move
directly to several repositories (IIUC, one repository for each
suggested top-level directory).
The only goal of the intermediate state seems to be providing some
gradual change before switching to submodules -- and on one hand, I
think that the new layout will force people to learn how to deal with
it, and on the other, it'll make people spend work twice, once on the
layout change and again on the switch to submodules.
So assuming that a gradual change is the goal, I think that there are
better ways to do that. Here's a suggestion:
* The main repository is split into the different repositories.
Initially, this is done without any consideration for submodules,
with the idea of having "advanced gitters" come up with their own
solutions.
* However, don't remove the main repository, just keep it as an
aggregate of the content that is found in the split repositories.
If the structure is going to be the same in all of them (i.e., the
same directories and files are in all as they are now in the
single repository), then pulling changes from the new repos to the
main one is going to be trivial to the point of being automated.
* The new repos will not get mirrored on github. This is because
github repos come with a bunch of functionality that is best kept
in a single place -- like wiki pages and issues. (But see below.)
* So the only difference would be for people who commit work to the
main repo. This can be done in various ways, depending on the
developers who do these commits:
- Advanced developers would have all of the repos and will push
directly to them. This group of people is likely to start
small, and eventually have all of the core committers in it.
("Core" as in the people who push to the plt repo now.) As I
said above, this will likely involve some experimentation for
these people, which will later get translated into easy setups
that will allow more people to switch to it.
- "Outsiders" can continue to work as usual: fork the main plt
repo (mostly on github) and send pull requests. The pull
request will then be pushed by a core committer as it is done
now, where the core committer pushes to the actual relevant
repo, and that eventually propagates back to the main repo so
that the contributor sees that the work was merged. The merging
should usually be trivial, except in extremely rare cases where
the push touches on files from different new repos. In these
cases it should be possible to either split the commit into
different ones for the different repos, or ask the contributor
to split the commit into different ones for the different files.
- The only people left are core committers who will work with the
main repository. I can see a bunch of ways to deal with this.
First, the commit can be sent as a pull request to one of the
advanced gitters who will then do it for the actual repository.
This is easier than it sounds: git has a bunch of commands to do
this, and for all practical purposes, you'd just replace the
"git push" part of your workflow with "git send-email". I
*think* (but I'm not 100% sure) that this work can be automated
too, so it's fine if I (or some other excited soul) get these
emails and merges them.
There is an inconvenience point here: once you send a pull
request and it's merged, the actual commits that are merged (to
the main repo, which you're using if you're in this group) are
different objects. This is nothing new -- it's something that
people who do all contributions via pull requests deal with,
since we have a policy of rebasing rather than merging.
Usually, when you pull from the updated repo, git should notice
that your changes are already there. (At least I hope it does.)
Things will be less convenient for people who use git more
intensively: if you have lots of branches etc. But I think that
such people really should just move to the first group sooner...
* This stage can go on for a while, as the code & machinery involved
evolves to a point of being smooth enough. By smooth, I mean that
- it's easy enough to build the whole thing as you do now,
- nightly builds, drdr, etc, are all adapted to the multiple repos,
- most people feel comfortable with multiple repos, specifically,
people who will need to switch their work from the big repo with
only a small part that they're actually interested in, to having
a plain build of the whole thing with only the interesting part
coming from a checked-out repository. (These are currently core
committers, and they are likely the last to switch to multiple
repos.)
* Once the new repos work fine for most people, switch to having
them as the main place: start mirroring the repos on github (and
elsewhere), and remove the monolithic one.
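To make the "push touches files from different new repos" case above a bit
more concrete, here is a rough sketch of how a core committer's script could
decide whether a commit needs to be split. Everything in it is an assumption
made for illustration: the directory names, the repository names, and the
helper functions are hypothetical, and it only groups path strings (it does
not call git at all).

```python
from collections import defaultdict


def group_paths_by_repo(paths, repo_for_dir):
    """Group the paths touched by a commit by the split repo that owns them.

    `repo_for_dir` is a hypothetical mapping from a top-level directory of
    the monolithic repo to the name of the split repo that holds it.
    """
    groups = defaultdict(list)
    for path in paths:
        # The first path component decides which split repo owns the file.
        top = path.split("/", 1)[0]
        if top not in repo_for_dir:
            raise KeyError(f"no split repo known for top-level entry: {top}")
        groups[repo_for_dir[top]].append(path)
    return dict(groups)


def needs_split(paths, repo_for_dir):
    """True when a commit touches files owned by more than one split repo."""
    return len(group_paths_by_repo(paths, repo_for_dir)) > 1


if __name__ == "__main__":
    # Placeholder names only; the real layout is not specified here.
    mapping = {"dir-a": "repo-a", "dir-b": "repo-b"}
    touched = ["dir-a/x.rkt", "dir-b/y.rkt", "dir-a/z.rkt"]
    for repo, files in group_paths_by_repo(touched, mapping).items():
        print(repo, files)
    print("needs split:", needs_split(touched, mapping))
```

When needs_split is true, the groups are the per-repo pieces that the commit
would have to be divided into; in the common case there is a single group and
the commit can go to that repo as-is.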
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!