[racket-dev] experiment reorganizing the repo into packages

From: Eli Barzilay (eli at barzilay.org)
Date: Wed May 29 14:14:11 EDT 2013

Yesterday, Matthew Flatt wrote:
> Here's a first experiment at moving collections around into packages:
> 
>  https://github.com/mflatt/racket/tree/pkg
> [...]

Comments in no particular order, mostly about the actual file
selections.  Many are kind of small todo-style laundry items, some are
repeating obvious things -- I missed starting with the readme file...
In any case, it should now be obvious that this is going to be long.


* This is *impressive*!  [=> The rest of these items are not too
  strong (at least not ATM), since it looks good enough to be good
  anyway.]

* Does this split actually work wrt having no circular dependencies?
  (It's pretty amazing if it does -- it's what I've been worried about
  for a long time...)

* I usually like the fine split to packages like -docs and -lib.
  However, minor subpoints:

  - Shouldn't the "-docs" suffix be "-doc" to be in line with the
    directory name and also symmetric to "-lib"?  (BTW, looking at the
    Fedora installations that I have, there are both conventions, but
    "-doc" is much more common than "-docs", and "-libs" is a little
    more common than "-lib".)

  - I think that a -lib package is for cases where there is some use
    for it by itself, and in some cases there's no need for that.
    Things that I've seen:
    - "xrepl-lib" -- there's no library functionality in xrepl that is
      useful by itself, and if there is (or if something does become
      useful), then it should move out of it.
    - "at-exp-lib" -- the "lib" in the name seems confusing (because
      there is no "at-exp", and I don't think that it's right to have
      a "-lib"-only package...)
    But I was confused:

  - Also, I'm looking at the "draw" package (and now that I see it,
    "xrepl" too), and it looks like a hack that is compensating for
    an inability to specify some "doc of" relation between packages.
    (It's useful to have meta-packages, of course, it just looks like
    the wrong level to specify such relationships.)  The fact that I
    was confused about it seems to support this...

  - I'm also worried about the repo-implications of such splittage:
    would there actually be three repositories for "foo", "foo-doc",
    and "foo-lib"?  If so, then the price of not having in-package
    specification for these things seems much higher; I will certainly
    not want to see three such repos for xrepl...

  - There are also some "-lib" packages (like string-constants-lib)
    that have no other forms.  (Especially in the string-constant
    thing, having it be a -lib seems wrong to me.)

  - Similar issues are going to be relevant for other "package kinds"
    like "-typed".  (The following sentence written after I've seen
    much more:) So the more I see these things, the more I think that
    it's bad to rely on or encourage this kind of post-pended suffix
    as an indication of the package type -- I'd rather see these
    things more formal and hopefully all distributable from within a
    single per-real-package repository instead of an extra
    meta-package for each.

* I think that "docs-index" would be better named as "core-docs" or
  something similar ("core" in the sense of making up the main doc
  pages, not in the sense of a documentation of the core).

* Given that there are a *lot* of packages, it'll probably make things
  convenient to have some meta-repository with submodules after the
  split.  Just as a convenience thing to get the whole set (or some
  useful subset) without listing them.  But not as something that
  would be used for actual stuff.  (With submodules once you commit a
  subrepo you usually want to commit the toplevel which has a sha1
  pointer to each sub-repo -- the idea is that you can commit work
  separately from what you get for the toplevel directory, but I'm
  talking about something simple like some script that will always
  update the pointer, or ignore the pointers and always use the master
  of the sub-repos.)

* As a random idea, maybe it would be a good time to move things like
  "drscheme" or the "net/*-{unit,sig}.rkt" files into some -compat
  packages?  Or maybe that's better done later on by the package
  maintainers.  (Including things like "scheme", eventually.)

* I think that the contents of drracket/rackunit should move into some
  "rackunit-gui" package, or maybe "rackunit-tool" is better.  Same
  goes for drracket/scribble (=> "scribble-tool").

* Is "browser" really needed?  From a random grep, it looks like the
  only thing that is needed there is `browser/external', which is
  actually a wrapper around `net/send-url' so maybe leave just that
  there?

* Similarly, "help/bug-report*" are obviously better moved into
  drracket...  (Obviously to me, at least.)  And also "pkg/gui" (which
  is even worse, since it looks like a weird name from a bad past
  hierarchy...).  And a few others (slideshow, texpict).

* The drracket/macro-debugger directory looks like something that
  should be in its own package too.  Generally, I think that it's best
  to avoid such no-clear-owner packages other than the more core-ish
  code.

* Also, "repo-time-stamp" should go in the core.  (And continue to be
  removed in distributed form, including the source form.)  Same for
  "version".

* Looks like "tests/eopl" got dropped!  Ah -- it's actually left
  behind in the "racket" package with a whole bunch of other tests...
  (So this is just a rough split unrelated to code dependencies.)

* I think that it's better to explode "games" too, perhaps to one
  "games-core" package, one for all of the card games (and the
  library), and then per-game packages.

* Why is there "rackunit" in the gui(-lib) package?  Seems like this
  should go with the code that I mentioned above, in a rackunit tool
  thing.  (Or maybe in a package by itself, so there will be rackunit,
  rackunit-gui, and rackunit-tool.)

* Also, I think that "framework" is big enough to be in a package by
  itself, depending on "gui", rather than be included in it.

* The htdp package has some compatibility things in the lang
  collection, like pretty-big and r5rs.  These could go nicely in a
  "legacy" or "obsolete" package, with the r5rs things going into the
  r5rs package.

* The icons package should have generic icons in it, but there's a
  bunch of specific icons that should go into their respective
  packages.

* The "lib" in "macro-debugger-text-lib" seems redundant too.

* The math package should be split further -- but this looks like a
  good example of something with a coherent design and communicative
  developers who can do that themselves after the split.  (Ie, the
  lesson here is that it's fine to have a rough split, and let people
  do their own accounting later.)

* Shouldn't "mzscheme" have some name that is more indicative of some
  legacy executables?

* If there's a future-visualizer-typed, shouldn't there also be a
  plot-typed?  (But as in the math item above, I think that it's
  better for the f-v-t to be removed and left for James to decide
  later...)

* Instead of a single "plt-services", I think that there should be
  separate packages for the web, drdr, and pkg-index directories,
  "images" could move to iplt, props could stay and eventually
  disappear with the distribution-mess, and contrib can be a separate
  package too (which will eventually get installed in the right
  places).

* "preprocessor" also seems like a grab of a precious name when it's
  actually just a legacy thing -- but maybe it's best to leave it
  there to be extended with more tools that fit the "classical" notion
  of a preprocessor.  (And what's there is going to make it easy to
  add such tools, both as code and as a template.)

* For xrepl or readline, maybe there should be some form of installers
  that add them to the .racketrc given that that satisfies the
  readline thing about users explicitly enabling them?  (Ie, some kind
  of meta-like package that does the installation.)

* In the readline package, the package info file uses an explicit
  `quote' for some reason...  Also in some of the other toplevel
  files.

* redex could be split up later too, maybe even to the point of
  separate redex-based packages instead of an examples directory.
  (But again, up to its owners...)

* The schemeunit thing should probably go with rackunit.  (Or still be
  independent as a legacy thing, but this wasn't done elsewhere...)

* In scribble, there is rackunit/docs-complete.rkt, which is not the
  right place for that with a split to packages; it is better IMO to
  move it to someplace better.

* For the examples directories (ffi, sgl, slideshow, and possibly redex)
  it'd be nice to have some foo-example(s) thing.

* Why is "quick" in the slideshow docs?  I'd expect it to be in some
  standard doc package thing.

* The slideshow package has some stuff in "texpict" which is some
  really ancient compatibility -- maybe it's time to remove it?

* The info file for snip talks about a non-existent "snip-docs"
  package.

* For things with "std" docs it's nice that they moved into the
  scribblings directory.  Maybe this can eliminate the need for the
  "keep-dirs.rktd" thing?

* The "srfi-lib" should have a natural and useful split into the usual
  srfis and the r6 versions.

* In string-constants it actually makes perfect sense to have a
  generic -lib thing, and then have each language file in its own
  repository.

* Another note re the string constants for later is that maybe it
  should be extended to allow constants in separate files, so the
  translations will be in each package.  But it's probably more
  convenient to be translator-oriented and have one language per repo.
  Another option is one repo per package per language, but that's too
  extreme.  So I think that it's best to look into regular translation
  files -- and it would be really good to just switch to the same
  format and get the standard tools that people use for translators.

* I see that trace doesn't have a separate -doc package; that seems
  non-uniform and I'm not sure if it's intentional.

* typed-racket-more has things that should all go into their
  respective packages, or into "-typed" variants of these packages.

* The unstable-list-lib package has a pile of non-list stuff...

* ...but I really think that the unstable stuff should go away: things
  that are used in a particular place should move back into being
  private utilities, and most other things just left behind.  There
  are several things there that are the kind of extensions that could
  be made into their own packages anyway.

* And I think that the above holds for the unstable-parameter-group
  which is used in two packages, but both are basically "Neil's
  thing"...

* The "unstable" package has the docs for a bunch of unstable things
  left in it -- my guess is that it's too much of a mess to deal with,
  which would support its dissemination...

* It also has some bigger packages that really deserve their own thing
  like the latent-contracts thing: automata and temp-c.  But for all
  of these, there's no real reason to have them included as unstable
  packages: for these things, I think that the only reason they were
  in unstable is as a kind of non-core packages, so now they can be
  just that.

* The wxme-lib package has some things that seem like they might be
  packaged with other libraries (like some snips?), like the xml
  thing.

* A note about some of the text in the toplevel readme file: I really
  don't like keeping a centralized "libs" repository -- it's true that
  it's harder to build binary code, and distribution can be trickier,
  but as long as this thing is there, it means that the plt packages
  are still more special.  (Maybe there's a plan to eventually spread
  out these things too?)

* There are some administrative files that should be copied across
  multiple repositories and/or revised for each.  Specifically:
  ".gitignore" files, ".gitattributes", and ".mailmap".  The first two
  should be copied for each new repository, the last one can be
  trimmed for the actual contributors of each package, based on its
  list of committers.

* In the core racket package:

  - There is a bunch of doc/release-notes/* directories that should
    move out.

  - I wonder whether these things are really needed in the core
    package (otherwise I think that they should move out to separate
    packages or into existing ones):
    - acks
    - db (needed only when there are docs, or to generate them, right?
      would be also nice to make the sqlite part accessible as its own
      package)
    - errortrace
    - ffi/examples (move to an -examples package) and ffi/com* (into
      mzcom?); maybe also the objc and winapi things (move to gui or
      its own thing)?
    - json, xml (are they needed??  If so, is it possible to hack some
      limited versions to avoid dependencies on a fuller package?)
    - unstable (as above -- for these things, probably move them as
      utilities where they're used or make them into non-unstable)
    - Many net/* things (imap, sendmail, nntp, pop3, ...); IMO net/url
      too (and replaced by something much simpler that can deal with
      simple GETs, as in the get-libs thing)
    - openssl/test.pem (keeps tripping people -- maybe include the
      text verbatim in the docs and have people copy+paste it?)
    - rackunit
    - srfi/* (should just grab the missing functionality, "eventually"
      so maybe not now)
    - tests/* (looks like it's not done yet)

  - There's a weird .gitignore in racket/collects/pkg which looks like
    some erroneous leftover.
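
The circular-dependency question near the top of this list can be
checked mechanically once each package's dependencies are written
down.  What follows is only a minimal sketch in Python, assuming a
hand-written table of package names: the names and edges are made up
for illustration and are not read from the actual "pkg" branch.

```python
# Hypothetical dependency table: package name -> packages it depends on.
# These edges are made up for illustration; they are not read from the repo.
DEPS = {
    "xrepl": ["xrepl-lib", "xrepl-doc"],
    "xrepl-lib": ["base"],
    "xrepl-doc": ["xrepl-lib", "scribble-lib"],
    "scribble-lib": ["base"],
    "base": [],
}


def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None if acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {name: WHITE for name in deps}
    path = []

    def visit(name):
        state[name] = GREY
        path.append(name)
        for dep in deps.get(name, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: that closes a cycle.
                return path[path.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[name] = BLACK
        return None

    for name in deps:
        if state[name] == WHITE:
            found = visit(name)
            if found:
                return found
    return None
```

With the table as written, `find_cycle(DEPS)` returns `None`; adding
an edge from "base" back to "xrepl" makes it return the offending
chain of package names instead.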


> Meanwhile, part of this experiment is defining "-lib" packages that
> do not provide documentation, which means they can have fewer
> build-time dependencies than they would need for documentation. The
> "-docs" packages provide the corresponding documentation. (I'm not
> sure about the naming convention; I just had to pick something for
> now.)  Naturally, the "X" package pulls in both the "X-lib" and
> "X-doc" packages.

To clarify, after staring at this for some hours, I really prefer it
if there's a way to keep it all together and do everything within the
package system.  (At least having 3 repos for N packages sounds like a
very unfun thing to deal with.)  But I do want the ability to have
these partial-contents packages built for distribution.
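
To make the partial-contents point concrete: the value of a separate
"-lib" is that its transitive dependencies are a subset of what the
full package pulls in.  Here is a rough Python sketch, with a made-up
dependency table (none of these edges are taken from the actual
branch, and the "-doc" spelling is just the one suggested above):

```python
# Hypothetical table: package -> direct dependencies (illustrative only).
DEPS = {
    "draw": ["draw-lib", "draw-doc"],
    "draw-lib": ["base"],
    "draw-doc": ["draw-lib", "scribble-lib", "racket-doc"],
    "scribble-lib": ["base"],
    "racket-doc": ["scribble-lib"],
    "base": [],
}


def closure(pkg, deps):
    """Return the set of packages installed when `pkg` is requested."""
    seen = set()
    stack = [pkg]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(deps.get(name, []))
    return seen


lib_only = closure("draw-lib", DEPS)
full = closure("draw", DEPS)
extra = sorted(full - lib_only)  # what the docs drag in beyond the library
```

In this toy table the library alone needs only itself and "base",
while the full meta-package additionally brings in the doc package
and the documentation tool chain behind it.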

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

Posted on the dev mailing list.