[racket-dev] package-system update

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Sat Jul 13 14:56:42 EDT 2013

Here's a big-picture update of where we are in the new package system
and the conversion of the Racket distribution to use packages.

This message covers

 - how I see things working after the package system and
   reorganization are done, and a report on what pieces are still
   missing to reach that vision;

 - a look at how we got to our current design/reorganization choices
   and whether we're choosing the right place; and

 - speculation on why the package changes have been so difficult to
   implement.

All of that makes it a long message (sorry!), but I hope this message
is useful to bring us more in sync.


A Package-Based Racket
----------------------

Let's take a look at how you'll do various things in the new
package-based Racket world.

(There's no new information here, and parts marked with "[guess]" are
especially speculative.  Still, some details may be clearer than in
earlier accounts, now that much of it is implemented, and I think a
comprehensive review may be useful.)

** Downloading release installers from PLT

The "www.racket-lang.org" site's big blue button will provide the same
installers that it does now, at least by default. That is, the content
provided by the installer --- DrRacket, teaching languages, etc. ---
will be pretty much the same as now.

The blue button might also provide the option of "Minimal Racket"
installers, which give you something that's as small as we can make it
while still providing command-line `raco pkg'.

** Downloading installers from other distributors

There are all sorts of reasons that the "main distribution" from PLT
might not fit the needs of some group. Maybe the release cycle is too
long or at the wrong time. Maybe it includes much too much, much too
little, or almost the right amount but missing a crucial
package. Maybe the group wants something almost minimal, but still
with a graphical package manager. Maybe some group uses a platform for
which PLT does not provide an installer.

For many of those groups, using a "Minimal Racket" installer plus
selective package installations will do the trick. For others,
creating a special set of installers might be worthwhile, but there
are too many reasons and too many permutations for PLT to provide
installers that cover all of them.

Fortunately, anyone can build a set of installers and put them on a
web page, and we make it as easy as possible to build a set of
installers that start with a given set of packages. PLT could host a
web page or wiki that points to other distributors. PLT might even be
able to provide an automated service that generates a set of
installers for a basic set of platforms.

** Compiling a release from source

In addition to installers, a download site can provide a source-code
option (not specific to any platform, unlike the current source
packages), which would mainly be used for building Racket on
additional platforms.

This option is mostly a snapshot of the source-code repository for the
core, but it includes a pre-built "collects" tree (see "technical
detail", below) and a default configuration that points back to the
distributor's site for pre-built packages.

** Adding or upgrading supported packages

In much the same way that you can easily install a set of supported
packages on your current OS, you'll be able to easily install a set of
packages that are supported by your distributor. Those packages are
pre-built, so they install quickly, along with any included
documentation.

Depending on the distributor and installer, packages might be
downloaded and installed in "binary" form, which means that tests and
source code (for libraries and documentation) are omitted from the
package. PLT seems unlikely to provide such installers in the near
future.

The default package scope configured by a distribution tends to be
"user", which means that packages are installed in a user-specific
location.
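
As a rough sketch of how scope shows up on the command line (assuming
the `--scope' flag of `raco pkg install' and the `default-scope' key
of `raco pkg config'; the package name "potato" is only a
placeholder), the following script prints each command and executes
it only when DRY_RUN=0:

```shell
# Dry-run helper: print each command; execute it only when DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
run() {
  echo "+ $*"
  [ "$DRY_RUN" = 1 ] || "$@"
}

PKG=potato   # placeholder package name, for illustration only

# Show the default scope configured for this installation.
run raco pkg config default-scope

# Install into the user-specific location, whatever the default is.
run raco pkg install --scope user "$PKG"

# Or install for all users (needs write permission to the installation).
run raco pkg install --scope installation "$PKG"
```

The two install lines are alternatives; in practice you would pick
one.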

Package updates can be made available by distributors for whatever
reason and on whatever timetable they see fit.

If your distribution is from PLT, then the supported packages are
called "ring-0" packages. Ring-0 packages include contributions from
third parties (i.e., not just packages implemented by PLT) that are
vetted and regularly tested by PLT.

[Guess:] The "Racket" and "Minimal Racket" distributions might point
to different pre-built package catalogs. Possibly, the "Racket"
catalog never updates packages that were included in the installer (on
the grounds that the user may not have write permission to the
install), while the "Minimal Racket" catalog includes more frequent
updates for bug fixes (on the grounds that the user can update any
installed package).

A distributor doesn't necessarily have to provide its own package
catalog. It can instead supply an installer that works with packages
as served by some other distributor's catalog, such as PLT's
catalog. (See "technical detail" below.)

A user can also redirect `raco pkg' to a different catalog server,
instead of using the configuration that was supplied by the
installer. Binary, pre-built, and source variants of a package can be
"updated" to each other in any direction.
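
A minimal sketch of redirecting `raco pkg' to another catalog,
assuming the `--catalog' flag of `raco pkg install' and the `catalogs'
key of `raco pkg config'. The catalog URL and package name are
hypothetical, and commands are only printed unless DRY_RUN=0:

```shell
# Dry-run helper: print each command; execute it only when DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
run() {
  echo "+ $*"
  [ "$DRY_RUN" = 1 ] || "$@"
}

# Hypothetical catalog URL and package name, for illustration only.
CATALOG=https://catalog.example.org/
PKG=potato

# One-off: consult a specific catalog for a single install.
run raco pkg install --catalog "$CATALOG" "$PKG"

# Persistent: replace the catalog list that the installer configured.
run raco pkg config --set catalogs "$CATALOG"

# Inspect the catalog list now in effect.
run raco pkg config catalogs
```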

** Adding or upgrading other packages

An installer-provided configuration will normally point to a catalog
of packages that are not specifically supported by the distributor but
are still readily available --- probably mostly in source form and
directly pulled from a git repository. In particular,
"pkg.racket-lang.org" provides packages in source form.

** Reading documentation

A distribution site provides online documentation (including all
supported packages) alongside installers and packages.

Many installers and packages include documentation to be installed on
a user's machine, but there are some packages that provide libraries
without documentation. For example, "gui-lib" provides GUI libraries
without local documentation, while "gui" combines the "gui-lib"
libraries with local documentation.

Sometimes, documentation that is installed locally will still refer to
documentation that is not downloaded. Such links are directed back to
the distributor's site. That situation won't happen often for
pre-built packages, because links that go to other packages will tend
to go to packages that are dependencies. It will happen more for
binary packages, because the dependency can be build-time only.

** Creating new packages

A minimal package is a directory. So, let's suppose that you have some
modules in a directory that you want to turn into a package. Suppose
that your directory is called "potato", and it has a module file
"eat.rkt".

Turn your directory into a locally installed package with

   raco pkg install --link potato

Then, you can use "eat.rkt" with

   (require potato/eat)
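
The steps so far can be sketched end to end as follows. The contents
of "eat.rkt" (a function named `eat') are made up for illustration,
and the `raco' and `racket' commands are only printed unless
DRY_RUN=0; the files themselves are created in a scratch directory.

```shell
# Dry-run helper: print each command; execute it only when DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
run() {
  echo "+ $*"
  [ "$DRY_RUN" = 1 ] || "$@"
}

# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A minimal package is just a directory with modules in it.
mkdir potato
cat > potato/eat.rkt <<'EOF'
#lang racket/base
(provide eat)

;; Hypothetical function, for illustration only.
(define (eat n)
  (format "ate ~a potatoes" n))
EOF

# Register the directory as a locally installed (linked) package.
run raco pkg install --link potato

# Use the module through its package-relative path.
run racket -e '(require potato/eat)' -e '(eat 3)'
```

Because the package is linked rather than copied, later edits to
"potato/eat.rkt" should be visible without reinstalling.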

To give your package to someone else, you could zip up the "potato"
directory as "potato.zip", and the other person would install with

   raco pkg install potato.zip

Note that you can use any zip archiving tool, or you can use

   raco pkg create --from-install potato

to create the ".zip" file, which has the advantage that directories
like "compiled" and ".git" are omitted.

Even better, maybe your directory is already on GitHub at
"http://github.com/idaho/potato". Then, others can install your
package with

   raco pkg install github://github.com/idaho/potato/master

If you push changes to your GitHub repository, others can get them
with

   raco pkg update potato

If you're ready for the world to use your package, then go to
"pkg.racket-lang.org" and point the package name "potato" at your
GitHub repository. Then, not only will others know about your package,
they'll be able to install it with

   raco pkg install potato

Finally, if you'd like PLT to include your package as a pre-built
package with each snapshot and release, then go back to
"pkg.racket-lang.org" and request ring-0 status for the package.
Ring-0 status may require a few bureaucratic improvements to your
package, such as including an "info.rkt" file if you don't have one
already, because those details are needed to keep your package in
working order.
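
For a sense of scale, an "info.rkt" for the "potato" package might
look something like the sketch below. It assumes the `#lang info'
module language and the `collection', `deps', and `build-deps'
fields; the values are placeholders, and what ring-0 status would
actually require is not settled here.

```shell
# Scratch directory standing in for the package's real location.
pkgroot=$(mktemp -d)
mkdir -p "$pkgroot/potato"

# Write a small package-metadata file; all values are placeholders.
cat > "$pkgroot/potato/info.rkt" <<'EOF'
#lang info
;; Illustrative values only.
(define collection "potato")   ; the package acts as one collection
(define deps '("base"))        ; packages needed at run time
(define build-deps '())        ; packages needed only at build time
EOF

echo "wrote $pkgroot/potato/info.rkt"
```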

** Using the cutting edge

PLT provides one or more snapshot sites that work the same as the
release site, except that each snapshot's catalog expires after a few
days. When that catalog goes away, you can continue to use the
snapshot, but you'll have to get packages and updates via source.

** Using the bleeding edge

A user who wants to work with the minute-by-minute latest can start by
cloning the core Racket git repository and running `configure', `make',
and `make install' to get a Minimal Racket build. Then, start
installing packages with `raco pkg'.

The default package catalog in built-from-source Racket is
"pkg.racket-lang.org", which means that you get all packages in source
form from various git repositories, including for PLT-maintained
packages. The default package scope is "installation".

If you run `raco pkg update -a', then you likely get updates and
trigger many compiles. Eventually, an update will fail, because your
core Racket version is too old, and you'll need to `git pull',
`configure', `make', and `make install' --- if you haven't been doing
that, anyway. Since packages were added with installation-wide scope,
`make install' rebuilds your previously installed packages, too.

** Using the bleeding edge as a PLT developer

As a convenience to PLT developers, who tend to work on a particular
set of packages, there is an alternate way of working on the bleeding
edge (which anyone can use, if they prefer).

[Guess #1:] Instead of cloning the core Racket repo, clone a "main
distribution" repo that has the core Racket repo as a submodule, plus
git submodules for each of the packages that are dependencies of
"main-distribution". In other words, you get something that looks like
the current Racket repo, but that uses git submodules.

[Guess #2:] Instead of cloning the core Racket repo from GitHub, you
clone from the "main distribution" repository, just like now. In
addition to being mirrored to GitHub directly, individual parts of the
"main distribution" repo are mirrored as GitHub repositories, and
the mirrors are the ones that "pkg.racket-lang.org" references.

GitHub repositories that correspond to packages (submodules in guess
#1, mirrored subtrees in guess #2) are registered with
"pkg.racket-lang.org", which is how users on the bleeding edge might
normally get the packages.

** Becoming a distributor

If you want to create installers like PLT's, then it's simplest to
clone the git repo like a PLT developer, and then use `make
installers'.

Alternatively, you can use `make installers-from-catalog' to create a
set of installers based on packages pulled from a specified catalog.

Either way, if you want to piggy-back on some other installer's set of
pre-built packages, then there are configuration options and/or
makefile targets to do that. (This is more sketchy; see below.)

** Taking your own snapshot of Racket and packages

Sometimes, you don't need to build installers, but you'd still like a
snapshot of the current Racket core and packages. You might want to
edit the snapshot to upgrade some packages while keeping others the
same.

The `raco pkg catalog-copy' command is one of many tools to manipulate
catalog servers. For packages that are mapped to GitHub repositories,
merely copying a catalog doesn't archive the code, but it archives a
particular commit id. It's always possible to grab a copy of a package
repository and reference the copy from a catalog.
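
A minimal sketch of freezing a catalog locally, assuming that
`raco pkg catalog-copy' takes a source catalog followed by a
destination directory and that the `catalogs' configuration key
accepts a "file://" URL. The "https" scheme and the destination path
are assumptions, and commands are only printed unless DRY_RUN=0:

```shell
# Dry-run helper: print each command; execute it only when DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
run() {
  echo "+ $*"
  [ "$DRY_RUN" = 1 ] || "$@"
}

# Source catalog and a hypothetical local destination directory.
SRC_CATALOG=https://pkg.racket-lang.org/
DEST_DIR=$PWD/my-catalog

# Copy the catalog's package records (names, sources, checksums).
# For GitHub-backed packages this records commit ids, not the code.
run raco pkg catalog-copy "$SRC_CATALOG" "$DEST_DIR"

# Point `raco pkg' at the frozen copy instead of the live catalog.
run raco pkg config --set catalogs "file://$DEST_DIR"
```

Entries in the copied catalog can then be edited one at a time to
upgrade some packages while leaving others pinned.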


A Technical Detail
------------------

Starting from scratch twice with the same Racket sources does not lead
to compatible pre-built packages, unfortunately, because bytecode
files are not generated deterministically. Maybe we'll be able to fix
that, one day.

Meanwhile, pre-built packages depend on a particular build of the
libraries in "collects", as well as a particular build of any
dependencies. So, if a distributor wants to enable other distributors
that use the same catalog of pre-built packages, the distributor must
serve a "collects" tarball, too. Providing the "collects" will be
built into the snapshot support.



Posted on the dev mailing list.