[racket] Racket Virtual Machine runs out of memory

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Fri Jul 20 15:07:30 EDT 2012

The second list you sent out is close to 500 MB, if I understand this correctly: about 1/6th of your memory, not counting other things you're running. That's up from 1/20th for the first list, which is large but not memory-killing large. 

As Eli says, when you start dealing with such lists (1/6th of memory) and you process them, you should definitely reconsider your data representation. As every responder has suggested, and I am adding myself now that I understand the full extent of your problem, pack the booleans into ints and work on them. You will benefit a LOT. It is possible that Magic must first allocate a large list, but as soon as you have it, "kill" the list with a better data representation: 

 (define my-image (reformulate-booleans-as-ints (Magic-call-out)))
The nesting ensures that the gc sees the intermediate list as garbage. You could call collect-garbage after this definition to get the memory back immediately. 
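
A minimal sketch of what such a conversion could look like, assuming the call-out returns a list of rows, each a list of 0/1 values (the thread does not show the actual result shape). The helpers pack-row, row-ref and fake-magic-call-out are hypothetical names used only for illustration, and reformulate-booleans-as-ints is given one possible body here:

```racket
#lang racket/base

;; Hypothetical helper: pack a list of 0/1 pixels into one exact integer,
;; with the first pixel stored in the least significant bit.
(define (pack-row bits)
  (for/fold ([acc 0])
            ([b (in-list bits)]
             [i (in-naturals)])
    (if (zero? b)
        acc
        (bitwise-ior acc (arithmetic-shift 1 i)))))

;; Hypothetical helper: read pixel i back out of a packed row.
(define (row-ref packed i)
  (if (bitwise-bit-set? packed i) 1 0))

;; Stand-in for the real call-out (assumption: a list of rows of 0/1).
(define (fake-magic-call-out)
  (list (list 1 0 1 1)
        (list 0 0 0 1)))

;; One possible body for the conversion named above: one integer per row.
(define (reformulate-booleans-as-ints rows)
  (map pack-row rows))

;; The big list is only ever an argument, never bound to a name,
;; so it is unreachable once the definition has been evaluated.
(define my-image (reformulate-booleans-as-ints (fake-magic-call-out)))
(collect-garbage)
```

With this layout, (row-ref (list-ref my-image 0) 2) reads the third pixel of the first row. Building each row with repeated bitwise-ior is simple rather than fast; it should be acceptable for rows of a few thousand pixels.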

On Jul 20, 2012, at 2:21 PM, Harry Spier wrote:

> Thanks Eli,
> I probably have to go the route you and others suggest.  But I think I
> still have a problem.  Even the single operation of MagickExportPixels
> to export the pixel data of this page to manipulate fills at least 4/7
> of the memory before failure.  And there is no guarantee that pages
> won't have more pixel data.  They probably will.  And as Matthias says
> the list doesn't seem "that large".
> Also I have 4 or 5 transformation stages of the pixel data.  Will
> putting in Garbage Collection commands to get rid of transformations
> of the data I've already used help?
> What sets the amount of memory the Racket virtual machine uses?  Is it
> a Racket parameter?  Is it a function of the amount of RAM in the
> machine?  Is this more a Windows problem, and will switching to Linux
> help, etc.?
> Harry Spier
> On Fri, Jul 20, 2012 at 1:09 PM, Eli Barzilay <eli at barzilay.org> wrote:
>> 10 minutes ago, Harry Spier wrote:
>>> #lang racket
>>> (define l (time (build-list (* 7091 5023) (λ (x) 1))))
>>> (system "PAUSE")
>>> ABORTS with Racket Virtual Machine run out of memory
>> IME, the exact size where things fail is not important -- if you're
>> getting anywhere close to it, then you should revise the code to use
>> less memory.
>> There was the option that was raised for using integers, which might
>> be inconvenient -- even with one (huge) integer for each row.
>> Instead, I think that it would be convenient to use one big byte
>> string for the whole array, and write some accessor functions to
>> address the contents as a matrix.  The exact format of the byte string
>> can be one byte per 1/0 pixel or even more compactly, one bit per
>> pixel.  The choice should depend on whatever libraries you're using
>> with the same data, to minimize translation work.  (Dealing with bits
>> will make the access code a bit more complicated.)
>> Currently, you're using one cons cell for each number, which is
>> probably somewhere around 3 pointers -- which is about 12 bytes per
>> bit.  So just one byte for each number would be a 12x factor; with
>> one bit per 1/0, you're at a ~100x saving, which would be significant
>> enough to reduce other processing times.
>> --
>>          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>>                    http://barzilay.org/                   Maze is Life!
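
Eli's byte-string suggestion could be sketched roughly as follows, using the one-bit-per-pixel variant. The struct bit-matrix and its accessors are hypothetical names, and the row-major layout with the lowest bit first in each byte is an assumption made for illustration; the right layout depends on the libraries that share the data.

```racket
#lang racket/base

;; Hypothetical bit matrix: one bit per pixel, row-major, in one byte string.
(struct bit-matrix (width height data))

;; Allocate a zero-filled matrix; the byte count is rounded up.
(define (make-bit-matrix width height)
  (bit-matrix width height
              (make-bytes (quotient (+ (* width height) 7) 8) 0)))

;; Linear bit index of the pixel at (row, col).
(define (bit-index bm row col)
  (+ (* row (bit-matrix-width bm)) col))

;; Read the pixel at (row, col) as 0 or 1.
(define (bit-matrix-ref bm row col)
  (define idx (bit-index bm row col))
  (define byte (bytes-ref (bit-matrix-data bm) (quotient idx 8)))
  (bitwise-and 1 (arithmetic-shift byte (- (remainder idx 8)))))

;; Set the pixel at (row, col) to v (0 clears the bit, anything else sets it).
(define (bit-matrix-set! bm row col v)
  (define idx (bit-index bm row col))
  (define pos (quotient idx 8))
  (define mask (arithmetic-shift 1 (remainder idx 8)))
  (define old (bytes-ref (bit-matrix-data bm) pos))
  (bytes-set! (bit-matrix-data bm) pos
              (if (zero? v)
                  (bitwise-and old (bitwise-xor mask 255))
                  (bitwise-ior old mask))))
```

For the 7091 x 5023 page from the example, this layout needs 4452262 bytes (about 4.5 MB), against roughly 427 MB for 35618093 list elements at the estimated 12 bytes each.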

Posted on the users mailing list.