[plt-dev] Re: problem with optimistic compilation

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Thu Aug 13 08:38:08 EDT 2009

At Wed, 12 Aug 2009 18:49:34 -0500, Robby Findler wrote:
> On Wed, Aug 12, 2009 at 4:21 PM, Sam TH<samth at ccs.neu.edu> wrote:
> > What if mzscheme-compiled files already exist?  Does it use those?
> 
> Yes. With preference given to the drscheme ones.
> 
> > If
> > it doesn't, won't it recompile the entire collects tree the first time
> > you try to do anything?
> 
> No, the "don't save any compiled files for anything in the collects
> tree" caveat is still in place, as before.
> 
> > Also, this seems like it's papering over the problem.  We should be
> > able to come up with a solution that works for both DrScheme and
> > mzscheme.
> 
> I'll leave this one to Matthew to try to explain. I think you're
> probably wrong, but I'm not sure precisely why. It is a big hairy
> mess.

And I'm not sure precisely why, either. This is one of those areas where I
suffer from the same affliction as many sysadmins: to users, there are
many changes that would be obviously better, but I've been burnt by so
many obvious changes that I'm reluctant to try more. Lots of competing
demands have been balanced through a slow evolution, and then lots of
other strategies and techniques evolve to fit that design; I'm leery of
re-living it all for a different design point.

Here's an attempt to list relevant issues and competing demands on the
general issue of source and compiled files:

 * Timestamps allow relatively efficient tracking of dependencies, but
   timestamps are also somewhat fragile.

 * There's a significant run-time cost to checking timestamps and/or
   following a search path to locate the "best" version of a file.

 * Search paths and other rules that can "fail" silently create
   mystery. (Why does my program take so long to load? Oh, I need to
   recompile file X.... But it is compiled, and the timestamp is later!
   Oh, it's apparently the wrong version.)

 * Some directories are writable and some are not. Some directories are
   writable but aren't really intended to be modified.

 * Sometimes `mzscheme' is used in development mode and sometimes in
   execution mode. (Recall how we eventually learned that defaulting
   to development mode and requiring "-q" for execution mode was a bad
   idea.)

 * Sometimes files don't exist for required modules, even if they're
   named through collection paths. (I have in mind the modules that are
   in a "stand-alone" executable.)

 * Programs might be loaded concurrently on the same filesystem, and
   synchronizing them is a pain, at best. (Are we at least past the
   days of NFS, where you don't get the normal filesystem atomicity
   guarantees?)

 * Different versions might be used.

 * Different compilation options might be used, and they may allow
   different development and/or deployment possibilities. (For example,
   `enter!' can re-load changed modules only when they are compiled
   with `compile-enforce-module-constants' set to #f, but that same
   setting has a negative effect on performance.)

 * Sometimes different compilation options are used in the same
   program, and the user needs some control over which modules use
   which options. (Currently, our tools choose to do X or Y based on
   whether a compiled file exists for a module.)

 * Using multiple namespaces or other parameter-based configurations
   can easily collide.

 * Sometimes you want to refer to compiled files outside of PLT Scheme.
   (For example, I often write makefiles that use `mzc' and then
   trigger other actions based on the timestamp of the compiled file
   --- and that would be more difficult if the compiled-file path were
   version-specific, though maybe I should not be doing that in
   makefiles.)

 * When version-specific files are generated and users upgrade
   frequently, the filesystem can become littered with useless files
   from old versions. (This bugs me about Planet, but it's all in one
   place, so I can clean up easily enough.)

 * Although there are many cases where changing a module forces
   recompilation of importing modules, there are also many cases where
   the new module can be used from source without recompiling
   everything that depends on it.

 * Bootstrapping is tricky. (I sometimes get into trouble by using
   `mzc' after I change a file in "collects/scheme" without recompiling
   everything. Because of the way that compilation hooks into the
   module-loading process, and because `mzc' itself uses the changed
   files, running `mzc' multiple times doesn't converge in a nice way.)

 * Some people want to distribute bytecode files without source.
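
To make the first item concrete, here is a minimal sketch of a
timestamp comparison between a source file and its bytecode file. The
names `zo-path' and `zo-stale?' are hypothetical helpers, not part of
any PLT library, and the sketch assumes the default "compiled"
subdirectory and that `src' names a file. A real compilation manager
also has to consider versions and the dependencies of each module,
which this ignores.

```scheme
#lang scheme/base

;; Hypothetical helper: where the bytecode for `src' would live,
;; assuming the default "compiled" subdirectory next to the source.
;; For "foo.ss" this yields "compiled/foo_ss.zo".
(define (zo-path src)
  (let-values ([(base name dir?) (split-path src)])
    (build-path (if (path? base) base (current-directory))
                "compiled"
                (path-add-suffix name #".zo"))))

;; Hypothetical helper: #t if the bytecode file is missing or is
;; older than the source, judging by modification timestamps alone.
(define (zo-stale? src)
  (let ([zo (zo-path src)])
    (or (not (file-exists? zo))
        (< (file-or-directory-modify-seconds zo)
           (file-or-directory-modify-seconds src)))))
```

Even this small check shows the fragility: a bytecode file with a
later timestamp counts as fresh whether or not it was produced by the
right version or with the right compilation options.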

I'm sure the list is incomplete, but that's all I can think of. In any
case, if someone believes that the current approach to bytecode files
is fundamentally wrong, the above list may be useful. My sense is that
a better approach is out there, but that it's only slightly better and
not worth the effort to get from here to there; then again, that sense
is based more on vague impressions from (limited) experience, rather
than any solid technical argument.


On the specific issue of how the changes in DrScheme relate to
MzScheme, though, I agree that the DrScheme capability should be
available in MzScheme in some sort of development mode. As Robby noted,
you can get `mzc'-like automatic compilation of files by adding

 (require compiler/cm)
 (current-load/use-compiled
  (make-compilation-manager-load/use-compiled-handler))

to your ".mzschemerc". I think it would make sense to have DrScheme's
extra tools (i.e., to compile only sources not in the main "collects")
similarly available in library form.



Posted on the dev mailing list.