Fwd: [plt-scheme] In Search of Equitable Temporal Assignment of GC time in Benchmarking

From: William D Clinger (will at ccs.neu.edu)
Date: Fri Jun 29 06:12:22 EDT 2007

Kyle Smith wrote:
> With the adoption of generational GC's, the non-trivial problem of  
> distributing GC time in benchmarking results is less of a factor.   
> However, there remain Scheme implementations that have not yet  
> implemented modern garbage collection algorithms, and that present  
> a non-trivial problem for benchmarking comparative function  
> execution time.

Benchmarking of garbage collectors is very difficult
because there are so many things that should be held
constant, but the collectors you want to benchmark
are so different that they cannot be controlled in
ways that would maintain those constants.

> The problem becomes quite significant when as the  
> percentage of reachable data in the heap increases, while the  
> available memory decreases.

We call this the load factor (reachable data divided
by total memory in use) or inverse load factor (total
memory in use divided by reachable data).  Just about
any collector will be reasonably efficient (in an
amortized sense) at low load factors, so you're mainly
interested in the high load factors where benchmarking
is most difficult.

> I make these statements based on my own experience with attempts at  
> benchmarking two different implementations that suffer this  
> problem.  In order to produce credible results, I need to account  
> for the GC time in an equitable fashion, that would ideally  
> distribute the GC in a temporal manner over real time of  
> execution.  After an ACM search, I didn't find anything that spoke  
> to this problem specifically,  Most of the literature,  
> understandably, is focused on coming up with better algorithms.

There are two distinct things you might be talking about
here.  The simpler thing is that you might be interested
in average amortized performance, in which case the usual
approach is to repeat the benchmark hundreds of times in
the same execution and take the average.

On the other hand, you might be looking for some measure
of the collector's ability to distribute gc time over
execution well enough to avoid impacting the mutator's
responsiveness.  A search at Google Scholar on "garbage"
and "minimum mutator utilization" will get you started
on that topic.

Will


Posted on the users mailing list.