[plt-scheme] reading a whole file
On Nov 4, Stephen De Gabrielle wrote:
> I'm working with the Enron email collection, uncompressed it is 2.54
> Gb(across 500k files) , so it should be possible to play with the
> whole thing in RAM.
Just in case you plan to actually do that: at these sizes, multiplier
factors become things that you should be aware of:
* In general, the GC requires more memory than you actually use. As
a rule of thumb, I think you should plan on it holding twice the
RAM that you actually need. (Even though it can be smaller with
generations.)
* MzScheme holds strings in UCS-4 format, so each character is 4
bytes.
In other words, you might need around 20 GB of RAM just to read it all
in (roughly 2.54 GB of mostly one-byte-per-character text, times 4 for
UCS-4, times 2 for the GC).
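
To make the arithmetic explicit, here is a rough back-of-the-envelope
sketch in Python. The function name and its defaults are only
illustrative, and it assumes the files are mostly one byte per
character on disk and that the GC needs about twice the live data, so
treat the result as an order-of-magnitude guess, not a measurement.

```python
def estimate_ram_gb(on_disk_gb, bytes_per_char=4, gc_factor=2):
    """Rough RAM estimate for holding a text corpus as in-memory strings.

    Assumptions (not measurements):
      - the on-disk text is about one byte per character,
      - strings are stored as UCS-4, i.e. 4 bytes per character,
      - the GC holds about gc_factor times the live data.
    """
    in_memory_gb = on_disk_gb * bytes_per_char   # size of the strings alone
    return in_memory_gb * gc_factor              # plus headroom for the GC


# The 2.54 GB collection from the question:
print(round(estimate_ram_gb(2.54), 2))
```

Reading the files as raw bytes rather than strings would correspond to
bytes_per_char=1 in this sketch, which brings the estimate down to
about 5 GB under the same GC assumption.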
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://www.barzilay.org/ Maze is Life!