[plt-scheme] minimizing garbage collection time

From: Adam Wick (awick at cs.utah.edu)
Date: Sun May 30 18:11:54 EDT 2004

At Sun, 30 May 2004 14:17:51 -0400, Doug Orleans wrote:
> cpu time: 19300 real time: 19386 gc time: 18270
> cpu time: 58860 real time: 60695 gc time: 53500
> 
> If I'm reading this right, this means that if I could turn off the
> garbage collector, these would only take 1 and 5 seconds,
> respectively.  Is that right? 

Not quite: that is what you'd get with an infinitely fast garbage
collector. If you never ran the collector at all, none of your dead
memory would be reclaimed, and a program that never reclaims memory
hits swap space pretty fast (at which point it will grind until the
OS kills it).
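
For reference, the arithmetic behind the "1 and 5 seconds" estimate can
be sketched as below. This is only an illustration: the numbers are
copied from the two quoted runs, and the names runs, non-gc-ms and
gc-share are made up for the example.

```scheme
;; Timings in milliseconds, copied from the two quoted runs.
;; Each entry is (cpu-time gc-time); the names here are hypothetical.
(define runs '((19300 18270) (58860 53500)))

;; CPU time left over once GC time is subtracted.
(define (non-gc-ms run) (- (car run) (cadr run)))

;; Share of CPU time spent in the collector, as an exact fraction.
(define (gc-share run) (/ (cadr run) (car run)))

(map non-gc-ms runs)   ; => (1030 5360), i.e. about 1 and 5 seconds
(map (lambda (r) (exact->inexact (gc-share r))) runs)
                       ; roughly 0.947 and 0.909
```

So the non-GC work is about 1.0 and 5.4 seconds, and the collector
accounts for roughly 91-95% of the CPU time in both runs.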

As I recall, if you invoke drscheme with the null collector -- a
collector that never actually reclaims any space -- drscheme 
doesn't make it through startup before the OS kills it. But it's
been a long time since I tried that little experiment, so I may
be remembering incorrectly.

> Is it abnormal for garbage collection to take 90-95% of the time?

Quite. Could you send me the program? It'd be interesting (to me,
at least) to see what's going on.

The only reason I can see for that happening is if you're going into
swap pretty heavily and your timings include the time spent swapping.

> What triggers the garbage collector?  Is it whenever the allocation
> pool is empty? 

I know this is the case with the 3m collectors. I'm pretty sure it's
also true of the Boehm collector.

> I have a lot of weak hash tables, but maybe I have some non-weak
> tables that should also be weak-- would holding onto too many objects
> that should otherwise be garbage cause the collector to do a lot of
> extra traversals that it shouldn't need to do?

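Yes. Anything still reachable through a non-weak table has to be
traversed (and kept) by the collector, even if your program never
looks at it again.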
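
A minimal sketch of the difference, assuming the MzScheme hash table
API as I remember it (make-hash-table with an optional 'weak flag,
hash-table-put!, hash-table-map). The names strong, weak, fill! and
entries, and the table size, are invented for the example, and the
exact count left in the weak table depends on the collector in use.

```scheme
;; Two tables keyed by freshly allocated lists that nothing else
;; refers to. All names and the size 100000 are illustrative only.
(define strong (make-hash-table))        ; holds its keys strongly
(define weak   (make-hash-table 'weak))  ; holds its keys weakly

;; Insert n entries whose keys are fresh one-element lists.
(define (fill! table n)
  (let loop ((i 0))
    (when (< i n)
      (hash-table-put! table (list i) i)
      (loop (+ i 1)))))

;; Count entries by walking the table.
(define (entries table)
  (length (hash-table-map table (lambda (k v) v))))

(fill! strong 100000)
(fill! weak 100000)
(collect-garbage)

(entries strong)  ; 100000: every key is still reachable via the table
(entries weak)    ; typically far fewer: the keys were otherwise garbage
```

Everything left in the strong table is live data as far as the
collector is concerned, so it is retained and traversed for as long as
the table itself is reachable.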

> Any other tips for reducing GC time (or even just measuring it better)
> would be appreciated.

Generally, the less you allocate, the less you collect and the lower
your GC time. However, the additional time you spend rewriting all
your code to allocate less (and then debugging the result) may not
be worth whatever speedup you get. Personally, I usually only find it
worth the effort in egregious cases involving *very* long lists.
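
As a rough illustration of the kind of rewrite meant here (not taken
from anyone's actual program), the sketch below computes the same sum
two ways: once by building an intermediate list, once with an
accumulator. The names count-up, big, sum-squares/list and
sum-squares/loop are made up, and the list length is arbitrary; make
it larger if the timings are too small to compare. It relies on
MzScheme's time form to print the cpu, real and gc figures.

```scheme
;; Build the list (1 2 ... n) with a loop; n is an arbitrary size.
(define (count-up n)
  (let loop ((i n) (acc '()))
    (if (= i 0)
        acc
        (loop (- i 1) (cons i acc)))))

(define big (count-up 100000))

;; Allocates an intermediate list as long as the input.
(define (sum-squares/list lst)
  (apply + (map (lambda (x) (* x x)) lst)))

;; Same result with an accumulator and no intermediate list.
(define (sum-squares/loop lst)
  (let loop ((rest lst) (acc 0))
    (if (null? rest)
        acc
        (loop (cdr rest) (+ acc (* (car rest) (car rest)))))))

;; `time` prints cpu, real, and gc milliseconds for each expression.
(time (sum-squares/list big))
(time (sum-squares/loop big))
```

Comparing the gc time reported for the two calls gives a quick check
on whether a rewrite like this is buying anything before you commit
to changing more code.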

Note that in MzScheme, the call stack is an allocated object. 


-Adam


Posted on the users mailing list.