[plt-scheme] asking mzscheme and DrScheme to GC less to get speedups at the expense of memory usage?

From: Lee Spector (lspector at hampshire.edu)
Date: Wed Dec 30 15:12:29 EST 2009

Hmm -- I guess maybe I'm barking up the wrong tree because a short test using (time ...) produces:

cpu time: 87094 real time: 89161 gc time: 5547

which I don't really understand, but it doesn't look like it indicates a lot of time spent in gc. 
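
For reading those three numbers: they are all in milliseconds, and the gc time is counted as part of the cpu time, so the share of CPU time spent collecting is gc time divided by cpu time. A minimal sketch of that arithmetic follows; gc-fraction and reported-gc-fraction are hypothetical names introduced only for illustration, and the sketch assumes time-apply is available in the mzscheme version in use.

```scheme
#lang scheme

;; gc-fraction is a hypothetical helper, not part of mzscheme.
;; time-apply returns four values: the list of results, CPU msec,
;; real msec, and GC msec (the GC msec are included in the CPU msec).
(define (gc-fraction thunk)
  (let-values ([(results cpu real gc) (time-apply thunk '())])
    (if (zero? cpu)
        0.0
        (exact->inexact (/ gc cpu)))))

;; The same ratio for the figures reported above.
(define reported-gc-fraction
  (exact->inexact (/ 5547 87094)))

reported-gc-fraction ; a little under 0.064, i.e. about 6% of CPU time
```

On those figures, even eliminating collection entirely would save only about 6% of the run time.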

BTW I tried the wasted-memory-to-trick-GC trick in DrScheme and it had no noticeable effect on the rate of recycle-icon flashing.

I also tried mzscheme -W debug and got a profusion of output that generally looked something like this:

GC [minor] at 45471408 bytes; 15773168 collected in 1 msec
GC [minor] at 45475780 bytes; 15770572 collected in 1 msec
GC [minor] at 45482604 bytes; 15772428 collected in 2 msec
GC [major] at 45487612 bytes; 15929868 collected in 151 msec
GC [minor] at 45285964 bytes; 15724460 collected in 1 msec
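
To total the time these messages account for, the lines can be summed by kind. This is only a sketch: it assumes the log lines have exactly the shape shown above, and sample-log, parse-gc-line and total-msec are hypothetical names, not anything provided by mzscheme.

```scheme
#lang scheme

;; Sample lines copied from the output above.
(define sample-log
  '("GC [minor] at 45471408 bytes; 15773168 collected in 1 msec"
    "GC [minor] at 45475780 bytes; 15770572 collected in 1 msec"
    "GC [minor] at 45482604 bytes; 15772428 collected in 2 msec"
    "GC [major] at 45487612 bytes; 15929868 collected in 151 msec"
    "GC [minor] at 45285964 bytes; 15724460 collected in 1 msec"))

;; Returns a pair (kind . msec) for a matching line, or #f otherwise.
(define (parse-gc-line line)
  (let ([m (regexp-match
            #rx"GC \\[([a-z]+)\\] at [0-9]+ bytes; [0-9]+ collected in ([0-9]+) msec"
            line)])
    (and m (cons (cadr m) (string->number (caddr m))))))

;; Sums the msec of all lines whose kind ("minor" or "major") matches.
(define (total-msec kind lines)
  (for/fold ([sum 0]) ([line lines])
    (let ([p (parse-gc-line line)])
      (if (and p (equal? (car p) kind))
          (+ sum (cdr p))
          sum))))

(total-msec "minor" sample-log) ; 1 + 1 + 2 + 1 = 5
(total-msec "major" sample-log) ; 151
```

In this sample the single major collection costs far more than the four minor ones combined, so the frequency of major collections matters more than the raw number of GC messages.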

Unless I'm missing something, I guess I'm not going to get a substantial speedup by reconfiguring GC. If anyone sees something I'm missing, I'd love to hear about it.

And in any event, thanks Robby and Matthew.

 -Lee


On Dec 30, 2009, at 2:52 PM, Robby Findler wrote:

> This is perhaps obvious, but if you wrap (time ...) around the entry
> point, you'll get back times that include gc time and that might also
> tell us if it is garbage collecting a lot. Or, if you are running from
> mzscheme, you can run like this:
> 
> mzscheme -W debug
> 
> and you'll see when garbage collections happen.
> 
> Robby
> 
> On Wed, Dec 30, 2009 at 12:06 PM, Lee Spector <lspector at hampshire.edu> wrote:
>> 
>> Under DrScheme the green recycle icon flashes about 3 or 4 times per second.
>> 
>> If I understand correctly I should add something like this to my code:
>> 
>> (define wasted-memory-to-trick-GC
>>  (make-string 100000000 #\x))
>> 
>> I've just tried that for a new Linux/mzscheme run, but it will take some time to tell if it's having an effect. I don't see anything dramatic, however. I'll try in DrScheme, where I may notice different behavior of the recycle icon, when my current run completes (which may be a day or so).
>> 
>> Thanks,
>> 
>>  -Lee
>> 
>> On Dec 30, 2009, at 12:36 PM, Matthew Flatt wrote:
>> 
>>> There's currently no such control, but
>>> 
>>> * Do you know how much time is being spent by the GC? When you run in
>>>   DrScheme, does it spend a lot of time with the green recycle icon
>>>   on or flashing very quickly?
>>> 
>>> * You can partially simulate a request to use more memory by
>>>   allocating a big byte string and holding onto it. A GC is triggered
>>>   based on current memory use versus the memory use after the most
>>>   recent garbage collection. So, if you hold onto a bunch of data,
>>>   more data will be used before another GC. (Be sure to allocate a big
>>>   byte string or character string, and not a big vector, because you
>>>   don't want the GC to have to traverse the big object.)
>>> 
>>> 
>>> At Wed, 30 Dec 2009 12:20:18 -0500, Lee Spector wrote:
>>>> 
>>>> I have some memory-hungry and compute-intensive programs that I'm running in
>>>> DrScheme under Mac OS X and in mzscheme under Linux. Under DrScheme I set the
>>>> memory limit to something fairly high, while under Linux, if I understand
>>>> correctly, there is no limit.
>>>> 
>>>> In both cases, however, less memory is actually being used than I would
>>>> expect. I suppose this means that a lot of the memory turns quickly to garbage
>>>> and is being quickly collected, which is nice in some respects, but in the
>>>> current case I want maximum execution speed and would be happy for the thing
>>>> to eat several GB more memory if that would help things to run faster.
>>>> 
>>>> The reason I suspect this might be possible is that I used to experience
>>>> similar things in various Lisps, and I could get substantial speedups by
>>>> telling them to GC less. For example, in CMUCL the default GC parameters would
>>>> always cause my programs (which are often genetic programming systems with
>>>> large populations) to thrash in GC and run very slowly overall, but if I
>>>> launched CMUCL with a command-line argument that told it not to GC until a
>>>> particular (high) threshold of allocation was reached then it would run much
>>>> faster. This makes a big difference for runs that can take hours or days.
>>>> 
>>>> Is there a way to do something similar with mzscheme and/or DrScheme? Or would
>>>> it not help for some reason? I'm currently conducting a run that I would
>>>> expect to eat lots of RAM and it's using a measly 1.9% of my system memory.
>>>> I'd be happy for it to use 99% if that would improve the runtime.
>>>> 
>>>> I've browsed the reference and found a number of sections related to GC but
>>>> nothing addressing this issue specifically.
>>>> 
>>>> Thanks for any help you can provide,
>>>> 
>>>> -Lee
>>>> 
>>>> --
>>>> Lee Spector, Professor of Computer Science
>>>> School of Cognitive Science, Hampshire College
>>>> 893 West Street, Amherst, MA 01002-3359
>>>> lspector at hampshire.edu, http://hampshire.edu/lspector/
>>>> Phone: 413-559-5352, Fax: 413-559-5438
>>>> 
>>>> Check out Genetic Programming and Evolvable Machines:
>>>> http://www.springer.com/10710 - http://gpemjournal.blogspot.com/
>>>> 
>>>> _________________________________________________
>>>>  For list-related administrative tasks:
>>>>  http://list.cs.brown.edu/mailman/listinfo/plt-scheme
>> 


Posted on the users mailing list.