[racket] Worried about the new package manager not storing each version of a package

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Mon Aug 26 09:27:11 EDT 2013

At Mon, 26 Aug 2013 07:57:05 +0100, Lawrence Woodman wrote:
> I have been really impressed with Racket after using it for a month, but am
> worried about the move away from a central repository for storing each
> version of a package.  I can see the advantage and simplicity of the new
> system, but worry that relying on package creators to manage their packages
> correctly could be creating a house of cards and see several problems 
> with this:
> 
>      i.  If a package owner releases a change that breaks the API [...]
> 
>      ii. If the owner of a package stops hosting it [...]
> 
>      iii.  When used with github, most people will point to their master
>           branch, which if being used collaboratively could be quite
>           unstable. [...]
> 
>      iv.  It is hard to identify bugs and fix bugs while supporting users
>           of a package if you can't identify which version they are using.
> [...]
> Has any thought been given to any of these problems and are there any plans
> to mitigate them?
> 
> One easy improvement, when using github, is to allow/ensure package owners
> point to a specific release/tag .zip file and not worry about the checksum
> as nothing is going to change until a new release/tag is specified.

I think our tools for explicitly managing the package
name-to-implementation mapping (i.e., a catalog) can address these
problems.

A post about those tools from the `dev` list in April was for a similar
context,

 http://lists.racket-lang.org/users/archive/2013-April/057580.html

but I've extracted the relevant part below, and I've edited it slightly
to drop a few asides that are not relevant here.

The explanation below does not address the problem of package hosts
disappearing. I can imagine that for some projects and some packages,
it would make sense to copy the package implementation and point a
catalog at the copy.

------------------------------

A _package name_ is something like "mischief", which you use for
installing and declaring dependencies. A _package implementation_ is
something that you download from, say,

 https://github.com/carl-eastlund/mischief/tarball/fe7119517a4dcd3f5c509735a7b5a5664151c14f

Note that a package implementation in this sense corresponds to a
specific revision of a pile of code, such as a particular commit in a
git repository. The package manager includes the concepts of a "package
source" and a "checksum", which together tell you how to get a package
implementation. (That implementation may have its own version number,
but such a version number is in principle orthogonal to the package
implementation's checksum.)

The mapping from a package name to a package implementation is provided
by a "catalog". PLT provides a catalog server at pkg.racket-lang.org,
but you can make your own catalog (as a server or on a local
filesystem), and so you can precisely control the mapping from package
names to packages.
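
To make the name-to-implementation mapping concrete, here is a minimal
conceptual sketch in Python. It is not how `raco pkg' is implemented; the
names `PackageImplementation', `Catalog', and `resolve' are invented for
illustration, the source string and the all-zero checksum are placeholders,
and the "first catalog that knows the name wins" lookup is an assumption of
the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PackageImplementation:
    """One specific revision of a package: where to fetch it, and its checksum."""
    source: str
    checksum: str


# A catalog maps a package name to a package implementation.
Catalog = Dict[str, PackageImplementation]


def resolve(catalogs: List[Catalog], name: str) -> Optional[PackageImplementation]:
    """Return the first match, consulting catalogs in order (assumed behavior)."""
    for catalog in catalogs:
        if name in catalog:
            return catalog[name]
    return None


# Illustrative data only: SOURCE and the all-zero checksum are placeholders.
SOURCE = "https://github.com/carl-eastlund/mischief"
local_snapshot: Catalog = {
    "mischief": PackageImplementation(
        SOURCE, "fe7119517a4dcd3f5c509735a7b5a5664151c14f"
    ),
}
upstream: Catalog = {
    "mischief": PackageImplementation(SOURCE, "0" * 40),
}

# With the local snapshot listed first, the name stays pinned to one revision
# even though the (hypothetical) upstream catalog has moved on.
pinned = resolve([local_snapshot, upstream], "mischief")
print(pinned.checksum)
```

The point of the sketch is only that whoever controls the catalog controls
which revision a package name means.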

Furthermore, we've added tools to `raco pkg' to make it easier to
manage catalogs. For example, if you want to take a snapshot of the
current pkg.racket-lang.org and use that from now on (so that the
mapping from package names to packages doesn't change), use these
commands:

 raco pkg catalog-copy  https://pkg.racket-lang.org /full/path/to/catalog/
 raco pkg config --set catalogs file:///full/path/to/catalog/

You can modify the files generated at "/full/path/to/catalog/" by hand
in a fairly obvious way. Or you can upload the directory to a
file-serving HTTP site and point installations to the uploaded
directory as the catalog. There's also an option to use an SQLite
database as the format for a catalog, which is a better option if you
want to modify the catalog programmatically via `pkg/db', but an SQLite
database is less easy to use from a file-serving HTTP site.

In particular, I can imagine having a project whose source code
includes a package catalog. To upgrade a particular package, I'd change
the catalog and `raco pkg update'. When I commit a particular revision
of the source code to a git repository, the package catalog is saved;
then, I can roll back the project (including its references to specific
package implementations) to any previous version with its associated
package implementation via a `git checkout' (or whatever) plus `raco
pkg update'. Working this way, the package catalog acts a lot like git
submodules.
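
As a rough illustration of that workflow, the sketch below compares two
revisions of a catalog and reports which package names would resolve to a
different implementation. It is a conceptual model in Python, not a reader
for the files that `raco pkg catalog-copy' writes; the function
`changed_packages', the package name "other-pkg", and every checksum except
the mischief one quoted above are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Simplified catalog for this sketch: package name -> checksum.
Checksums = Dict[str, str]


def changed_packages(
    old: Checksums, new: Checksums
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each package whose checksum differs to (old checksum, new checksum).

    None stands for "not present in that revision of the catalog".
    """
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


# Two revisions of a project's catalog, as they might look at two commits.
at_old_commit: Checksums = {
    "mischief": "fe7119517a4dcd3f5c509735a7b5a5664151c14f",
    "other-pkg": "1" * 40,  # hypothetical package and checksum
}
at_new_commit: Checksums = {
    "mischief": "2" * 40,  # hypothetical newer revision
    "other-pkg": "1" * 40,
}

for name, (before, after) in changed_packages(at_old_commit, at_new_commit).items():
    print(name, before, "->", after)
```

Checking out the older commit and updating would, in this model, move only
the packages reported here back to their earlier checksums.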


Posted on the users mailing list.