[plt-scheme] reading a whole file
You're right. Even if I partition my data (say, into 2 GB chunks), I'd probably
not be that much faster than disk (based on Robby's data).
I think I'd better start reading the ports library docs (or stick to document
sets <100 MB).
s.
On Tue, Nov 4, 2008 at 7:28 PM, Eli Barzilay <eli at barzilay.org> wrote:
> On Nov 4, Stephen De Gabrielle wrote:
> > I'm working with the Enron email collection; uncompressed, it is 2.54
> > GB (across 500k files), so it should be possible to play with the
> > whole thing in RAM.
>
> Just in case you plan to actually do that: at these sizes, multiplier
> factors become things that you should be aware of:
>
> * In general, the GC requires more memory than you actually use. I
> think that, generally speaking, you should plan on it holding twice
> the RAM that you actually need. (Even though it can be smaller with
> generations.)
>
> * MzScheme holds strings in UCS-4 format, so each character is 4
> bytes.
>
> In other words, you might need around 20 GB of RAM just to read it all
> in.
>
> --
> ((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
> http://www.barzilay.org/ Maze is Life!
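
The "around 20 GB" figure in the quoted message follows from multiplying the two factors it lists. Here is a minimal sketch of that arithmetic, written in Python purely for illustration. The function name `estimate_ram_bytes` and its parameters are hypothetical. The sketch assumes the corpus is mostly one byte per character on disk and takes the 4-bytes-per-character and 2x GC factors from the message, not from any measurement.

```python
# Back-of-envelope estimate of the RAM needed to hold a text corpus as strings.
# Assumptions (taken from the quoted message, not measured):
#   - strings are stored as UCS-4, i.e. 4 bytes per character
#   - the GC may hold roughly twice the memory actually in use
# Extra assumption for this sketch: one character per byte on disk (mostly ASCII).

GB = 10**9  # decimal gigabyte, used only for readable output


def estimate_ram_bytes(corpus_bytes, bytes_per_char=4, gc_factor=2):
    """Return a rough RAM estimate, in bytes, for loading the whole corpus."""
    chars = corpus_bytes  # one character per byte on disk (assumption)
    return chars * bytes_per_char * gc_factor


corpus = 2_540_000_000  # 2.54 GB, the uncompressed size mentioned in the thread
estimate = estimate_ram_bytes(corpus)
print(f"{estimate / GB:.1f} GB")  # 2.54 * 4 * 2 = 20.32
```

The same function suggests why small document sets are comfortable: under these assumptions a 100 MB set would need on the order of 800 MB.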