[plt-scheme] collection not found error at startup

From: Eli Barzilay (eli at barzilay.org)
Date: Tue Jul 1 12:51:45 EDT 2008

On Jul  1, Ari Pollak wrote:
> On Mon, Jun 30, 2008 at 2:40 AM, Eli Barzilay <eli at barzilay.org> wrote:
> > It gives you an accurate division for what's in the mzscheme
> > distribution and what's in the plt distribution.  If you don't
> > have the resources that can make a proper division (like investing
> > the time to take our script and adapt it; or write your own
> > dependency checking script), then it's better to just drop the
> > mzscheme package than getting into such problems.  (Like I said, I
> > suspect that people who would be mzscheme-only users are probably
> > people who are very comfortable getting and installing it from
> > source.)
> 
> I'm confident that most Debian users are comfortable installing
> something from source, but they'd rather have something that's clean
> and easily integrates into their system.

At the price of it being bogus and unsupported?  The CGC version has
some known problems: it can easily lead to large chunks of memory
leaking.  I can write simple code that should run indefinitely (or for
a long time) in fixed space, but will crash when it runs out of memory
with the CGC.  (See [*] below for a little more about how this can
happen.)


> And you're right, it would be easier to just create one plt-scheme
> package and ignore the people who don't want to install a bunch of X
> libraries on a server.

What I'm saying is that if you don't do a proper job in separating
what goes into the plt and the mz distribution, then you're better off
either switching to our two source bundles for plt and mz, which is an
easy way to get the proper subset, or just avoiding the separate subset
package altogether.  Both of these solutions are better than what you
have now, because they save the most important resource we have (all of
us), which is our time.  This whole thread started when someone had
problems that were caused by a poor choice of subdirectories: one of the
main changes in v4 is that much more functionality has moved from C
code to Scheme libraries -- in the collects/scheme directory, so not
including that directory (and having collective time spent looking for
a cause) is a major indication that something in these decisions is
not going right.

In the meantime, my personal conclusion from all this is "the debian
package does things wrong", which means that I'm more likely to reply
in the future with "you're using the debian package, please try using
our installers or source tarballs", which means that you get to deal
with more problems.

And BTW, I've seen a good number of people (including a debian case
just last night) installing the mzscheme package because they don't
need a gui ide -- they completely miss out on a bunch of things, like
documentation, or a library tree that is known to have complete
dependencies.


> >> But since mips & mipsel aren't even building CGC properly anymore,
> >
> > If this is based on the 371 sources, then please try again.
> 
> No, that was based on 4.0.1:
> http://buildd.debian.org/fetch.cgi?pkg=drscheme;ver=2%3A4.0.1-1;arch=mips;stamp=1214351098

This is a good example of what I'm talking about.  What I get from
this is that there's a segfault on an architecture I don't have access
to, and the information that I have is:

* It happens in a build process that I'm not familiar with (eg, looks
  like `configure' runs several times),

* When the configure script is heavily customized with options (it's a
  .5kb command line), including using cgc default, which can
  definitely affect the installation process, because installation is
  something that can be too stressful for the cgc collector,

* With a bunch of patches, including `00_debian-nonstandard-install'
  that changes the way mzscheme is searching for libraries when
  running for the install,

* Under all of these circumstances, there is a segfault problem that
  looks like this:

    [...]
    setup-plt: making: srfi/25
    setup-plt:  in srfi/25
    setup-plt: making: srfi/26
    setup-plt: making: srfi/27
    setup-plt:  in srfi/27
    make[2]: *** [install-cgc] Segmentation fault
    make[2]: Leaving directory `/build/buildd/drscheme-4.0.1/build'
    make[1]: *** [install] Error 2
    make[1]: Leaving directory `/build/buildd/drscheme-4.0.1/build'
    make: *** [common-install-impl] Error 2
    dpkg-buildpackage: failure: /usr/bin/fakeroot debian/rules
    binary-arch gave error exit status 2

Now, to debug this, I'd obviously try to isolate the problem by
removing all patches, dropping all non-standard customizations,
gaining access to a machine where I can recreate the crash, and going
on from there.  The above points are working against this debugging
process in almost every possible way.  I have no useful information,
and I see enough changes that the problem could arise from the
modified sources/configure-options/etc.

----

[*] Here's a brief description of how CGC is bad:

It is a conservative collector -- no changes to C code are necessary,
and the GC just scans everything, assuming that anything that looks
like a pointer is one.  Usually, this is not a problem, since numbers
that are mistaken for pointers are rare, and they are usually
short-lived so the memory will eventually be GCed.  An
example where this fails is when you have an infinite stream that
you're cdring down -- on one end you're forcing promises that generate
more cons cells and on the other end you're cdring down these cells.
Now, what *might* happen is that you end up with a number that looks
like a pointer to one of these cells -- so the cell will not be
collected, including what it points to -- which means that from now on
none of this will be GCed, and you'll get your memory filled up until
you crash.  This "might" is statistical -- it happens with a very low
probability; but in a situation like cdring down an infinite lazy
stream, this goes all the way up to a very high probability of it
happening.  The compilation process can also be fragile in the same
way: it has a lot of stuff in memory, and it keeps loading and
unloading code.
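
The stream scenario above can be sketched with a toy model.  This is
only an illustration under made-up assumptions, not how the CGC is
actually implemented: the heap is a Python dict, the addresses, the
16-byte cell size, and the names `collect' and `walk_stream' are all
hypothetical, and a single stray integer stands in for "a number that
looks like a pointer".

```python
def collect(heap, root_words):
    """Toy conservative mark-and-sweep.

    `heap` maps a cell address to the address stored in its cdr
    (0 means the cdr has not been forced yet).  Any root word that
    happens to equal a heap address is treated as a pointer.
    """
    marked = set()
    stack = [w for w in root_words if w in heap]
    while stack:
        addr = stack.pop()
        if addr in marked:
            continue
        marked.add(addr)
        cdr = heap[addr]
        if cdr in heap:          # follow the cdr if it points into the heap
            stack.append(cdr)
    for addr in list(heap):      # sweep: free every unmarked cell
        if addr not in marked:
            del heap[addr]


def walk_stream(steps, stray_word=None):
    """Cdr down a lazy stream for `steps` cells; return live cell count.

    `stray_word` is a hypothetical integer sitting among the roots
    (think: a plain number on the C stack), not a real pointer.
    """
    heap = {}
    next_addr = 0x1000           # made-up starting address
    current = next_addr
    heap[current] = 0
    next_addr += 16              # made-up cell size
    for _ in range(steps):
        # "Force the promise": allocate the next cell and link it in.
        heap[current] = next_addr
        heap[next_addr] = 0
        # "cdr down": the program now only holds the newest cell.
        current = next_addr
        next_addr += 16
        roots = [current]
        if stray_word is not None:
            roots.append(stray_word)
        collect(heap, roots)
    return len(heap)


if __name__ == "__main__":
    print(walk_stream(1000))                      # only real roots
    print(walk_stream(1000, stray_word=0x1000))   # stray number hits first cell
```

With only real roots, the walk stays at one live cell no matter how
far it goes.  With a stray word that happens to equal the address of
the first cell, that cell and everything it points to stay alive, so
the heap grows by one cell per step.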

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                  http://www.barzilay.org/                 Maze is Life!


Posted on the users mailing list.