<div dir="ltr">Can you elaborate on your intermediate form? I don't understand how git submodules prohibit or restrict submodule evolution. The only difference I see with the submodule approach is that it requires an extra commit to update the submodule versions (and subsequently a pull followed by a submodule update in other clones), whereas the makefile approach only requires a 'make update' in the umbrella clones. Is there something else I'm missing?</div>
<div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Aug 13, 2013 at 2:22 PM, Tony Garnock-Jones <span dir="ltr">&lt;<a href="mailto:tonyg@ccs.neu.edu" target="_blank">tonyg@ccs.neu.edu</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi all,<br>
<br>
Matthias asked me to write a few words about an experience I had splitting a large repository of code up into smaller repositories and then building a mechanism to tie them together again.<br>
<br>
== A short story ==<br>
<br>
Once upon a time, RabbitMQ (<a href="http://www.rabbitmq.com" target="_blank">www.rabbitmq.com</a>) was held in a single, monolithic Mercurial repository, including the server, the Java client library, the .NET client library, the Erlang client library, the protocol codec compiler, the documentation, adapters for other related messaging protocols, and so on.<br>
<br>
We decided for various reasons to split the monolithic repository into separate repositories. The approach we ended up taking was to have a single repository, the "umbrella", which included a Makefile and a handful of scripts which checked out, updated, compiled etc. a number of other repositories from various places. You can still see the umbrella today here: <a href="http://hg.rabbitmq.com/rabbitmq-public-umbrella/file/default" target="_blank">http://hg.rabbitmq.com/rabbitmq-public-umbrella/file/default</a><br>
<br>
The workflow for someone working on RabbitMQ is now:<br>
<br>
1. Check out the umbrella, and `cd` into it.<br>
2. Run `make checkout`.<br>
3. Run `make`.<br>
4. Edit, compile, debug, commit and push in the subdirectories resulting from step 2.<br>
5. Occasionally run `make update` in the umbrella.<br>
<br>
(There's also some ugly makefile machinery to do cross-subrepository dependency tracking to let `make` in a subrepo recompile just the right things. Mostly.)<br>
<br>
Personally, I frequently use a script, `foreachrepo` (git variant attached) that lets me operate on all repositories found under the umbrella at once. For example,<br>
<br>
$ foreachrepo pwd<br>
<br>
would tell me where all the checkouts live, and<br>
<br>
$ foreachrepo git status<br>
<br>
would show me their status.<br>
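<br>
As a rough idea of what such a script does, here is a minimal sketch. It is not the attached script: it assumes every repository is an immediate subdirectory of the umbrella and contains a `.git` directory, and the `== name` header it prints is my own addition.<br>
<br>
```shell
#!/bin/sh
# Hypothetical sketch of a foreachrepo-style helper (not the attached script).
# Assumption: each repository is an immediate subdirectory of the current
# directory and contains a .git directory.
foreachrepo() {
    for d in */; do
        d=${d%/}
        # Skip anything that is not a git checkout.
        [ -d "$d/.git" ] || continue
        echo "== $d"
        # Run the command in a subshell so the cd does not leak out.
        ( cd "$d" && "$@" ) || echo "foreachrepo: command failed in $d" >&2
    done
}
```
<br>
Run from the umbrella directory, `foreachrepo git status` would then visit each checkout in turn and carry on even if the command fails in one of them.<br>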
<br>
When a configuration is found that works nicely and is to be released, a tag is made across all the currently-checked-out repositories:<br>
<br>
$ foreachrepo git tag my_release_2.3.4<br>
$ foreachrepo git push --tags<br>
<br>
The split into completely separate repositories, linked informally by action of a script, worked out well for RabbitMQ, and the RabbitMQ project seems to be living happily ever after.<br>
<br>
== Comment ==<br>
<br>
The problem addressed here is *configuration management*. RabbitMQ takes a very loose approach to configuration management, where individual modules evolve independently and are only connected to each other by happening to be in sibling directories within the umbrella. Tags are used to take a snapshot of a group of repositories at the same time.<br>
<br>
Another approach to configuration management uses an explicitly *versioned* manifest, where an umbrella repository names other repositories *and specific versions* of their contents to pull into scope. This approach is taken by systems like rebar, and is essentially how git submodules work.<br>
<br>
You could frame the contrast between the two by saying that the RabbitMQ approach is essentially *optimistic*: configurations are frozen only when needed, at the risk of occasional frankenconfigurations during development (when you `git pull` one subrepo but not one of its siblings). The `git submodule` approach is *pessimistic*: configurations stay frozen until explicitly moved forward into the next frozen configuration.<br>
<br>
An intermediate form could be imagined, where the Makefile checks out specific versions or branches but otherwise leaves them free to evolve in a way `git submodule` prohibits.<br>
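<br>
One way to picture this intermediate form, purely as a sketch: a plain-text manifest names each repository together with a branch, tag or commit to start from, and a helper clones at that point and afterwards leaves the checkout alone. Nothing like this exists in the RabbitMQ umbrella; the manifest format, the names `run` and `checkout_from_manifest`, and the `DRY_RUN` switch are all made up for illustration.<br>
<br>
```shell
#!/bin/sh
# Hypothetical sketch: neither this manifest format nor this helper is part
# of the RabbitMQ umbrella. Each manifest line is "DIRECTORY URL REF", where
# REF is a branch, tag or commit.

# With DRY_RUN=1, print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reads manifest lines on standard input.
checkout_from_manifest() {
    while read -r dir url ref; do
        # Skip blank lines and comment lines.
        case "$dir" in ''|'#'*) continue ;; esac
        if [ -d "$dir/.git" ]; then
            # Already checked out: leave it alone so local work can evolve.
            echo "# $dir already present, leaving it as it is"
        else
            run git clone "$url" "$dir"
            run git -C "$dir" checkout "$ref"
        fi
    done
}
```
<br>
Piping a manifest into `DRY_RUN=1 checkout_from_manifest` only prints the `git` commands it would run. The difference from `git submodule` is that the manifest fixes only the starting point: once a checkout exists, the helper never moves it, so each subrepo is free to drift until someone decides to update the manifest.<br>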
<br>
Vincent has recently run into issues of configuration management: he wishes to assemble a specific collection of packages at specific versions to run a particular application (namely, some benchmarks).<br>
<br>
Others on this list do similar things, assembling specific versions of libraries into complete applications.<br>
<br>
I think it's interesting that both releasing applications and releasing the Racket system itself have this problem of describing a collection of related packages.<br>
<br>
Cheers,<br>
Tony<br>
<br>_________________________<br>
Racket Developers list:<br>
<a href="http://lists.racket-lang.org/dev" target="_blank">http://lists.racket-lang.org/dev</a><br>
<br></blockquote></div><br></div>