[plt-scheme] reading a whole file

From: Ethan Herdrick (info at reatlas.com)
Date: Tue Nov 4 15:36:16 EST 2008

Isn't it silly that we all have our own version of this?  Mine has
some very basic error handling but probably doesn't do the right thing
about Unicode.  The inverse, something like string->file, is also
indispensable.  These functions and their like are some of the first
things you need for hacking useful things up.  Shouldn't they be in an
SRFI?  Better yet, built in?
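
For concreteness, here is a minimal sketch of the kind of pair I mean.
It is not my actual version, just an illustration: the names
file->string* and string->file* are made up for this example, and it
assumes a PLT Scheme recent enough to have the scheme/port library.
It also assumes the file is UTF-8, which is what ports decode by
default when you read characters.

```scheme
#lang scheme/base
(require scheme/port)   ; provides port->string

;; Hypothetical helper: read a whole file into a string.
;; Characters are decoded as UTF-8 (the default for port reads).
;; Returns #f instead of raising if the file cannot be opened or read.
(define (file->string* path)
  (with-handlers ([exn:fail:filesystem? (lambda (e) #f)])
    (call-with-input-file path port->string)))

;; Hypothetical inverse: write a string to a file,
;; replacing any existing content.
(define (string->file* str path)
  (call-with-output-file path
    (lambda (out) (write-string str out))
    #:exists 'truncate))
```

Returning #f on failure is only one choice; letting the exception
propagate may be better when the caller needs to know why the read
failed.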


On Tue, Nov 4, 2008 at 12:03 PM, Stephen De Gabrielle
<spdegabrielle at gmail.com> wrote:
> You're right. Even if I partition my data (say 2 GB chunks) I'm probably not
> that much faster than disk. (based on Robby's data)
> I think I better start reading the ports library docs. (or stick to document
> sets <100 MB)
>
> s.
>
>
> On Tue, Nov 4, 2008 at 7:28 PM, Eli Barzilay <eli at barzilay.org> wrote:
>>
>> On Nov  4, Stephen De Gabrielle wrote:
>> > I'm working with the Enron email collection. Uncompressed, it is 2.54
>> > GB (across 500k files), so it should be possible to play with the
>> > whole thing in RAM.
>>
>> Just in case you plan to actually do that: at these sizes multiplier
>> factors become things that you should be aware of:
>>
>> * In general, the GC requires more memory than you actually use.  I
>>  think that generally speaking you should plan on it holding twice
>>  the ram that you actually need.  (Even though it can be smaller with
>>  generations.)
>>
>> * MzScheme holds strings in UCS-4 format, so each character is 4
>>  bytes.
>>
>> In other words, you might need around 20 GB of RAM just to read it all
>> in (2.54 GB x 4 bytes per character x 2 for the GC).
>>
>> --
>>          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>>                  http://www.barzilay.org/                 Maze is Life!
>> _________________________________________________
>>  For list-related administrative tasks:
>>  http://list.cs.brown.edu/mailman/listinfo/plt-scheme


Posted on the users mailing list.