[plt-scheme] Unix install

From: Lauri Alanko (la at iki.fi)
Date: Tue May 2 17:42:45 EDT 2006

On Tue, May 02, 2006 at 07:27:47AM -0400, Eli Barzilay wrote:
> On May  2, Lauri Alanko wrote:
> > I had always assumed that the file organization and the package
> > format were deliberate design decisions rather than acknowledged
> > deficiencies.
> 
> Like Matthew said, the file organization is friendlier to everything
> in a single directory which is the only layout on Windows, and it's
> working on Unix too.

Ah, right. It wasn't considered worthwhile to customize the file
hierarchy for various platforms individually, so a single one that works
somewhat reasonably on all platforms was selected.

> First of all, the current format is based on existing tools: base64
> and gzip.  It is possible to edit it as quack does.  Below that there
> is a convenient to edit format for the contents.  What's missing?

Just minor points: having the whole file be plain base64 text without
any headers means that it cannot be recognized as a package of any sort.
Hence I get:

$ file -i webserver.plt 
webserver.plt: text/plain; charset=us-ascii

Besides, base64 is at times a useful encoding for transporting data over
channels that are not 8-bit clean, but it makes no sense to specify it
as an integral part of the file format. The .plt format conflates three
distinct concepts: the actual organization of the data, compression, and
ASCII armoring. And instead of being named .plt.gz.b64, it's just
".plt", which further obscures the structure of the data.

And although the actual internal data format (the one that starts with
magic "PLT") is not too bad, it just seems like reinventing the wheel:
there are already a gazillion ways to pack multiple files into a single
one. Granted, being purely text-based is a bonus, or would be if the
content wasn't afterwards scrambled with gzip and base64.
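For what it's worth, peeling the two transport layers off again is a
couple of lines in any language with base64 and gzip in its standard
library. Here is a toy round-trip in Python, assuming the layering
described above (the payload is invented, not a real .plt archive):

```python
import base64
import gzip

# Toy stand-in for a .plt payload: inner text -> gzip -> base64.
inner = b"PLT\n;; archive contents would follow here\n"
armored = base64.b64encode(gzip.compress(inner))

# Peeling the layers back off: base64-decode first, then gunzip.
recovered = gzip.decompress(base64.b64decode(armored))
print(recovered.startswith(b"PLT"))  # True: the inner format's magic
```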

And the special support in quack just demonstrates my point: if zip/jar
or tar had been used instead, then there would have been no need for
special support code, since existing common tools (arc-mode and tar-mode
in this case) could have been used. For instance, deb packages are just
ar archives with some tarballs inside, and rpm packages are just
.cpio.gz with some sugar on top. The .xpi packages for firefox
extensions are just zip files that contain jar files (which are also zip
files). Common formats make life easier for everyone, and they shouldn't
be deviated from without good reason. I'm not sure I see one here.
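To illustrate what common formats buy you: had .plt been a zip file,
listing a package's contents would need no special support code at all,
just any language's standard library. A sketch (the file names are
invented for the example):

```python
import io
import zipfile

# Build a tiny in-memory zip standing in for a hypothetical zip-based .plt.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("collects/webserver/info.ss", ";; package metadata\n")
    z.writestr("collects/webserver/doc.txt", "documentation...\n")

# Listing the contents needs nothing package-specific.
with zipfile.ZipFile(buf) as z:
    for name in z.namelist():
        print(name)
```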

Finally, the only "official" tool for manipulating .plt files is
setup-plt, which doesn't offer very many choices: the only thing you can
do with a package is install it. Unlike with an "ordinary" package,
there is no way to just list its contents, or to extract it into a
temporary directory and see what's inside.

So, to summarize. We are given a file. When we look at it, we just see
an endless wall of base64 characters. When we run "file" on it, we get no useful
information. If we finally figure out that we need to base64-decode and
gunzip it, we are left with a novel (albeit text-based) format. If we
don't figure out the format, we have no common tools for examining the
file. The only thing we can do with it is run a special program that
will directly install it, and hope for the best.

Now, I can't help it, but to me this reeks of a condescending attitude
toward the user: "Don't trouble your little head with what's in there.
See? It's all just digits and letters, nothing of interest to you. Just
run setup-plt and let us install it for you. It is a neat software
package, that is all you need to know. We know what's best for you,
trust us."

I certainly realize that this is just imagination, but nevertheless it
is a strong gut feeling I get from needlessly obscure file formats.

> > But, for what it's worth, here's how I think the PLT software
> > _should_, in an ideal world, be installed on a "standard"
> > (FHS-compliant) Un*xish filesystem hierarchy. Here "..." stands
> > for the installation prefix, typically /usr or /usr/local.
> 
> This would be the last step in the current batch of changes.

Neato!

> If you have any comments please mail me off-list.

Since people on the list have expressed interest in this topic, I'll
still cc the list. I get the feeling that some major changes have been
underway in relative silence, and it might be a good idea to be a bit
more open about the development roadmap.

> > In any case, names like "help-desk",
> > "web-server" and "slideshow" are perhaps just a bit too generic for
> > placing in a directory that is shared with other software. Maybe
> > they should be prefixed with "plt-" or renamed altogether?
> 
> These are indeed the three problematic names.  I had the "plt-" prefix
> thing, and added the alternative path idea now.

Note that an alternative path should only be used for binaries that are
not intended to be run directly by an ordinary user. (For example, all
the various xscreensaver gimmicks are under /usr/lib/xscreensaver.) I'm
not really sure any of the PLT binaries fit that description. Even
help-desk works just fine as a standalone program. One might, for
example, use emacs and mzscheme for coding, and help-desk for browsing
documentation, without ever using drscheme.

> > Common headers: .../include/plt
> > 
> > Again, "common" means anything that is needed for embedding.
> 
> Why differentiate embedding from other headers?  They should all be in
> there.  (Most are used for both situations anyway.)

Sorry, I just used "embedding" as a generic name for all interoperation
with mzscheme from C code. So of course schemex.h is included.

> > However, since the usual way is to compile everything once and for all
> > during installation, then .../share is probably the right place for the
> > .zo files. Especially so if .zo files ever get installed alone,
> > _without_ the source.
> 
> We had some talking about this.  Some of the `compiled' directories
> have native subdirs (sorted by platform), and there are even a few (1
> currently) .zo files that are platform dependant.

Wow. How does this happen? Some FFI macros expand differently on
different platforms?

This relates to a point I raised a while ago: how should expansion-time
variance be handled? I think the original idea is that syntax expansion
(and hence compilation to .zo) is a wholly deterministic process and
guided only by the source code and the libraries it depends on. However,
there are various ways in which syntax expansion can quite reasonably be
altered: for example, choosing at compile time whether to enable
assertions or debug dumps. Once such parameters are allowed, there is no
single answer to what "the" .zo file corresponding to a source file is.
Expansion-time platform dependencies are just another such parameter.
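A toy sketch of the problem (an invented one-rule "expander", nothing to
do with mzscheme's actual macro system): once expansion can consult a
compile-time flag, one source file maps to several possible compiled
forms.

```python
# Invented one-rule "expander": (assert-e) expands away unless debugging.
def expand(source: str, debug: bool) -> str:
    return source.replace("(assert-e)", "(check-e)" if debug else "(void)")

src = "(begin (assert-e) (f x))"
release = expand(src, debug=False)
debugging = expand(src, debug=True)
print(release)    # (begin (void) (f x))
print(debugging)  # (begin (check-e) (f x))
print(release != debugging)  # True: one source, two ".zo" candidates
```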

I think Java got this one right with assertions. They are not resolved
at compile time, as in C, which would result in a mess where different
compile options produce observably different .class files. Nor are they
resolved at run time, which would incur a run-time penalty. Instead,
they are resolved at _load_ time, when the JVM loads the class file.
This seems like an optimal solution: the .class files are always the
same, and there's only a small once-per-program-execution overhead for
each class.

I think a similar approach would be useful for mzscheme, too: if
something should for performance reasons be resolved prior to run-time,
then it should be resolved when the .zo gets loaded, not when .ss gets
compiled to .zo. Alas, I don't think there's any infrastructure for this
kind of thing, and of course this wouldn't work with compilation to
native code...
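Here is a sketch of what load-time resolution could look like (a
hypothetical scheme in Python, no relation to mzscheme's actual loading
machinery): the compiled artifact is identical either way, and the
assertion hook is bound exactly once, when the module is loaded.

```python
# The "compiled artifact": identical regardless of assertion settings;
# it refers to an abstract assert hook instead of baking the decision in.
def compiled_body(assert_hook, x):
    assert_hook(x > 0)
    return x * 2

def load(artifact, assertions_enabled):
    # Resolve the hook exactly once, at load time.
    def check(cond):
        if not cond:
            raise AssertionError("assertion failed")
    def noop(cond):
        pass
    hook = check if assertions_enabled else noop
    return lambda x: artifact(hook, x)

fast = load(compiled_body, assertions_enabled=False)
print(fast(-1))  # -2: the check was resolved to a no-op at load time
```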

> The conclusion of the discussion was that it would be easy and
> acceptable to have all collects in .../lib/plt (Ari mentioned that
> this is already done by python, perl, and ruby).

It may be easy and acceptable, and it is also done by e.g. ocaml and
hugs. Still, it is technically wrong. FHS states quite clearly
<http://www.pathname.com/fhs/pub/fhs-2.3.html#PURPOSE22>:

	/usr/lib includes object files, libraries, and internal binaries
	that are not intended to be executed directly by users or shell
	scripts. [22]

	[22] Miscellaneous architecture-independent application-specific
	static files and subdirectories must be placed in /usr/share.

Still, I realize that conforming to this would require quite a bit of
effort, so it may make sense to defer it to the unforeseeable future.

> > It is also a bit inconvenient that all the doc.txt -files have the
> > same name: I often have a number of them open in emacs, and it's a
> > bit hard to know which is which when all you have are buffers named
> > doc.txt, doc.txt<2>, doc.txt<3> etc.

>   (require 'uniquify)
>   (setq uniquify-buffer-name-style 'post-forward)
>   (setq uniquify-after-kill-buffer-p t)

I should already have learned never to mention emacs with a problem,
since there is always some emacs-specific workaround. :)

But that's pretty neat. Thanks for the tip.

> Judging by the single stow-specific prefix you're using below, it
> looks like this stow thing uses a single directory to know which files
> it spreads out later -- so it uses a single directory to solve the
> unix-multiple-directories problem.)

Well, yes. It just manages symlinks. It effectively gives the best of
both worlds: the actual contents of each package are in separate
directories, which makes uninstallation easy, but all the contents are
accessible via symlinks from a common directory hierarchy.
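A minimal Python sketch of the symlink-farm idea (the directory names
are invented for the example; real stow also handles tree folding,
conflicts, and removal):

```python
import os
import tempfile

root = tempfile.mkdtemp()

# Each package keeps its files in its own tree under stow/.
pkg_bin = os.path.join(root, "stow", "plt-301", "bin")
os.makedirs(pkg_bin)
open(os.path.join(pkg_bin, "mzscheme"), "w").close()

# The shared hierarchy is populated with symlinks into the package tree.
shared_bin = os.path.join(root, "bin")
os.makedirs(shared_bin)
for name in os.listdir(pkg_bin):
    os.symlink(os.path.join(pkg_bin, name), os.path.join(shared_bin, name))

link = os.path.join(shared_bin, "mzscheme")
print(os.path.islink(link))  # True: uninstalling is just removing links
```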

So, in a sense, I'm not opposed to application-specific subdirectories,
as long as the directory hierarchies within them are compatible with the
standard bin-lib-share-man-doc-include organization.

> Of course nobody would agree with you if they come from the
> single-directory world of Windows or OSX, or if they're used to the
> convenience of user-local installations by putting the plt directory
> somewhere in your home dir and linking executables in your ~/bin/.

I used to do this too. Then I wanted symlinks for the man pages, too.
And for the libraries and includes, when doing development with a
locally-installed library. Then I found stow, which does all this
automatically.

>    ---->   /usr/notes/...    --> /usr/share/doc/plt/...
>      ARI: /usr/share/doc is generally used for distribution-specific
>           documentation, so applications don't usually install things there. I
>           think it would be better to just put the documentation & notes in
>           /usr/share/plt/{doc,notes} .

This is a bit confusing. On debian, /usr/share/doc/package contains
whatever (non-man, non-info) documentation is available for the package,
distribution-specific and otherwise. Often the package maintainers have
to explicitly move the docs there, since they don't get installed
anywhere in the upstream sources.

But to my understanding we aren't talking about debian-specifics here,
but rather about what happens during "make install" in a pristine
upstream source tree. Maintainers of distribution-specific packages will
want to modify this behavior somewhat, though of course we'd like the
upstream to play nice and require minimum effort to be customized for
distributions.

The default scenario for "make install" in autotools-based software is
to install under /usr/local, and I certainly think it makes sense for
PLT docs to be installed under /usr/local/doc/plt.


Lauri


Posted on the users mailing list.