[racket-dev] Splitting the `pkg` implementation

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Thu Jun 27 19:04:02 EDT 2013

At Thu, 27 Jun 2013 18:28:50 -0400, Sam Tobin-Hochstadt wrote:
> As part of making the "core" of Racket smaller, I'd like to propose
> separating out part of the package system implementation. In
> particular, I'd like to make the core portion of the package
> collection not use the network.

That's a different direction than I had imagined.

I imagine that (in the future) people will clone just the core repo,
build, and then use `raco pkg install' to install packages from other
servers --- probably pre-built packages by default, but optionally (and
more likely for a non-release version) from a source catalog server
that points to GitHub repositories.

More generally, the network seems to me the main way to get packages.
The way that our repository's makefile sets up links (and therefore
doesn't need the network) or otherwise locally constructs packages
seems, in contrast, like a relatively unusual way to install packages.


> This would allow us to remove the
> `net` and `json` collections, something like 7500 lines of code, from
> the core. [1]
> 
> I think, from looking at the code, that there are two major ways the
> network is used right now: (1) fetching packages named at remote
> locations, and (2) fetching the contents of the remote catalog.
> 
> For (1), I think this should be made extensible.  There should be an
> info.rkt field, named say `pkg-installer`, which specifies [2] a
> function like `package-source->name+type` and also a function like
> `stage-package/info` as well as a type. Then the relevant functions
> use `find-relevant-directories` to find all the extensions, and call
> them in order to see if they match. This would both allow parts of the
> current implementation to be taken out of the core (such as the
> GitHub-specific code) and allow people to extend the package manager
> with new ways of specifying packages, such as with git urls directly.
>
> For (2), the case is a little harder. All the places I can find that
> use the network in this case are inside `catalog-dispatch`, and use
> `read-from-server` to talk to the network. I propose that
> `catalog-dispatch` pass `read-from-server` as an extra argument to the
> server callback, and only call the server callback if the `net/url`
> library is available.  I'm not entirely happy with this test, though.
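
The availability test proposed for (2) could be sketched as below. This
is only an illustration: `library-available?` and `call-server-callback`
are hypothetical helpers, not names from the existing `pkg` code, and
the real `catalog-dispatch` may take different arguments.

```racket
#lang racket/base
;; Sketch only: `library-available?` and `call-server-callback` are
;; hypothetical helpers, not part of the existing `pkg` code.

;; #t if the module can be located and declared, #f if resolving it fails.
(define (library-available? mod-path)
  (with-handlers ([exn:fail? (lambda (e) #f)])
    (module-declared? mod-path #t)))

;; Hand the network reader to the server callback only when `net/url`
;; can be loaded; otherwise skip the callback and return #f.
(define (call-server-callback server-callback read-from-server)
  (and (library-available? 'net/url)
       (server-callback read-from-server)))
```

Note that a check written this way loads `net/url` as a side effect
whenever the library is present.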

I like this idea for extensibility, but I worry that breaking the
existing support down too finely will complicate the build process.

If GitHub support isn't in the core, for example, then you need to
install a package some other way to get GitHub support. That is, you
need some sort of source-package catalog server that doesn't point to
GitHub repositories --- at least for the packages needed to get GitHub
support, and so on.
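
For reference, the extension lookup proposed in (1) might look roughly
like the sketch below. The field name `pkg-installer` comes from the
proposal; the export name `source->name+type`, the convention that a
handler returns a name/type pair or #f, and `git-url-handler` are all
assumptions made for illustration.

```racket
#lang racket/base
;; Sketch only. `pkg-installer` is the field name floated in the proposal;
;; the export name `source->name+type` and the convention that a handler
;; returns (cons name type) or #f are assumptions made for illustration.
(require setup/getinfo)

;; Ask every collection whose info.rkt defines `pkg-installer` for the
;; module path it names, and pull one handler out of each such module.
(define (find-installer-extensions)
  (for*/list ([dir (in-list (find-relevant-directories '(pkg-installer)))]
              [info (in-value (get-info/full dir))]
              #:when info
              [mod-path (in-value (info 'pkg-installer (lambda () #f)))]
              #:when mod-path)
    (dynamic-require mod-path 'source->name+type)))

;; Call the handlers in order; the first one that matches wins.
(define (dispatch-source handlers source)
  (for/or ([handler (in-list handlers)])
    (handler source)))

;; Hypothetical handler for bare git URLs, standing in for an extension.
(define (git-url-handler source)
  (and (regexp-match? #rx"^git://" source)
       (cons source 'git)))
```

With this shape, `(dispatch-source (find-installer-extensions) src)`
would consult the installed extensions in order, which is where the
bootstrapping concern above comes in: the handlers have to be installed
before they can be found.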


> Finally, there's a lot of url manipulation that doesn't use the
> network.  I propose to move the portion of `net/url` that does this to
> a separate `net/url-struct` library, which would be re-provided by
> `net/url`.
> 
> Sam
> 
> [1] We could move (and I plan to) most of this either way -- the core
> doesn't depend on `net/imap`.
> [2] Probably by specifying a module path that would provide a few known names.
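
The `net/url-struct` split could be arranged along these lines. This is
a sketch under assumptions: submodules stand in for the two library
files, `simple-url` is a cut-down stand-in for the real `url`
structure, and `fetch` is a placeholder for the network-facing code.

```racket
#lang racket/base
;; Sketch only: the submodules stand in for separate library files, and
;; `simple-url` is a cut-down stand-in for the real `url` structure.

;; Plays the role of the proposed `net/url-struct`: data only, no network.
(module url-struct racket/base
  (provide (struct-out simple-url))
  (struct simple-url (scheme host path)))

;; Plays the role of `net/url`: re-provides the structure alongside the
;; network-facing functions, so existing clients see no difference.
(module url-full racket/base
  (require (submod ".." url-struct))
  (provide (all-from-out (submod ".." url-struct))
           fetch)
  ;; Placeholder for code that would actually open a connection.
  (define (fetch u)
    (error 'fetch "network access is not part of this sketch")))

;; A core client needs only the network-free half.
(require (submod "." url-struct))
(define u (simple-url "http" "example.com" '("pkgs")))
```

Clients that already require the full library keep working, while the
core can depend on the structure definitions alone.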


Posted on the dev mailing list.