[racket-dev] Intro to 3m?

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Sun Mar 11 10:56:46 EDT 2012

Hi Nick,

Here are a few notes and thoughts:

 * On the build process: The makefiles build `racketcgc' first, which
   uses the Boehm collector by default (or the slower but more portable
   SGC collector by request) and compiles the Racket C code as-is.
   Then, the makefiles run `xform', which is a Racket program, to
   convert most of the C source of Racket to better cooperate with the
   GC. Finally, the makefiles compile the converted C source to arrive
   at `racket3m', which is renamed to `racket' on install.

 * On how the source is organized: The 3m GC is in the "src/racket/gc2"
   directory and mostly in the "newgc.c" file, while the Boehm GC is in
   "src/racket/gc", and SGC is in "src/racket/sgc". (We really should
   rename some of those directories to make the organization
   clearer.)

   The main Racket C source in "src/racket/src" is mostly independent
   of the GC, although preprocessor directives such as `#ifdef
   MZ_PRECISE_GC' are sometimes used for 3m-specific code.

 * On GC logging: Logging a message from the GC can be tricky, because
   logging triggers allocation. The GC logging that's currently in
   place works by waiting until after the GC is complete (and the
   system is back in normal mode) before logging a message. I'm not
   sure how well it would work to try to log a message on every
   allocation --- and somehow treat the logger's allocation specially.

 * Try `dump-memory-stats': In case you haven't discovered it already,
   `dump-memory-stats' tells you more about how current memory use
   breaks down into different kinds of values.

 * Backtraces may help: If you use `configure --enable-backtrace' when
   building, then `dump-memory-stats' can show you paths to specific
   types of objects in the heap. Backtraces are useful for tracking
   down leaks, especially if you have some idea of the kinds of records
   that are being leaked and you want to know why the GC thinks the
   records should be retained. Backtraces are less useful for tracking
   down the source of allocations, though.

   Enabling backtraces also enables some other support in
   `dump-memory-stats', including support for walking through all
   memory. I'm not sure how well that still works, though, since it
   hasn't been used in a while.

 * Instead of putting `(current-memory-use)' everywhere, you might
   consider creating a separate thread that periodically calls
   `(current-memory-use)' and uses `continuation-marks' on the main
   thread with `continuation-mark-set->context' to extract the thread's
   context. That's how the time-oriented profiler works, for example.
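
A minimal sketch of the sampling-thread idea in the last note, in case
it helps. The names `start-memory-sampler', `report', and `interval'
are made up for illustration and are not an existing Racket API; the
only library calls assumed are `current-memory-use',
`continuation-marks' (applied to a thread), and
`continuation-mark-set->context'.

```racket
#lang racket/base

;; Hypothetical helper (not part of Racket): every `interval' seconds,
;; sample total memory use and the current context of thread `target',
;; and pass both to the `report' procedure.
(define (start-memory-sampler target report #:interval [interval 0.05])
  ;; Returns the sampler thread; stop it with `kill-thread'.
  (thread
   (lambda ()
     (let loop ()
       (sleep interval)
       (let* ([bytes (current-memory-use)]
              ;; Marks of the target thread's current continuation
              [marks (continuation-marks target)]
              ;; List of (cons name-or-#f srcloc-or-#f) pairs
              [context (continuation-mark-set->context marks)])
         (report bytes context))
       (loop)))))

;; Example workload that allocates, so there is something to observe.
(define (allocate-a-lot n)
  (for/fold ([acc '()]) ([i (in-range n)])
    (cons (make-vector 10 i) acc)))

;; Usage: sample the main thread while the workload runs. The report
;; procedure runs in the sampler thread as ordinary Racket code, so it
;; can log without any special treatment by the GC.
(define sampler
  (start-memory-sampler
   (current-thread)
   (lambda (bytes context)
     (log-debug (format "~a bytes in use; context: ~a" bytes context)))))

(void (allocate-a-lot 200000))
(kill-thread sampler)
```

Because the samples are taken at fixed intervals, this only shows
where the main thread tends to be while memory use grows; it does not
attribute individual allocations to code. A short workload may finish
before the first sample is taken.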

Hope that helps,
Matthew

At Fri, 9 Mar 2012 11:48:28 -0800, Nick Sivo wrote:
> Hello,
> 
> I'm optimizing some code and have used gc-info logging
> to correlate substantial application pauses with GC occurrences.  This makes
> sense.  The obvious solution is to allocate less memory, but tracking down
> where it's coming from isn't easy.  In the short term I plan to inject
> (current-memory-use) based logging all over the code I'm optimizing, but
> would like to develop and contribute something more generic and re-usable
> if there's interest in it.
> 
> My ideal goal is to add logging that reveals which code is allocating
> memory (how often, how much), and also, for each GC pass, the origin(s) and
> final resting place of what was collected.
> 
> So now I'm trying to learn more about Racket's 3m garbage collector so I
> can instrument it.  I've scoured the guide and reference, but found nothing
> that gets into the details of the runtime/collector interaction and in
> which code files that lives.  Before I start grepping through the source
> and trying to piece it all together on my own, I figured I'd ask.  Some
> pointers in the right direction, even vague hints of where to start, would
> be awesome :)  Even a description of the build process - where xform comes
> into play - would be infinitely helpful.
> 
> Please also feel free to tell me I'm crazy and point out something
> obviously simpler I'm missing.  I just want faster code with fewer, shorter
> GC pauses - I have no vested interest in how it happens, but would love to
> contribute a useful tool if that makes sense.
> 
> Regards,
> Nick
> _________________________
>   Racket Developers list:
>   http://lists.racket-lang.org/dev

Posted on the dev mailing list.