[plt-scheme] reading a whole file

From: Robby Findler (robby at cs.uchicago.edu)
Date: Tue Nov 4 15:40:43 EST 2008

Eli made the same point elsewhere today, so I think you can expect
them to be in the next release.

Robby

On Tue, Nov 4, 2008 at 2:36 PM, Ethan Herdrick <info at reatlas.com> wrote:
> Isn't it silly that we all have our own version of this?  Mine has
> some very basic error handling but probably doesn't do the right thing
> about Unicode.  The inverse, something like string->file, is also
> indispensable.  These functions and their like are some of the first
> things you need for hacking useful things up.  Shouldn't they be in an
> SRFI?  Better yet, built in?
>
>
> On Tue, Nov 4, 2008 at 12:03 PM, Stephen De Gabrielle
> <spdegabrielle at gmail.com> wrote:
>> You're right. Even if I partition my data (say, into 2 GB chunks) I'm
>> probably not that much faster than disk (based on Robby's data).
>> I think I'd better start reading the ports library docs (or stick to
>> document sets <100 MB).
>>
>> s.
>>
>>
>> On Tue, Nov 4, 2008 at 7:28 PM, Eli Barzilay <eli at barzilay.org> wrote:
>>>
>>> On Nov  4, Stephen De Gabrielle wrote:
>>> > I'm working with the Enron email collection; uncompressed it is 2.54
>>> > GB (across 500k files), so it should be possible to play with the
>>> > whole thing in RAM.
>>>
>>> Just in case you plan to actually do that: at these sizes, multiplier
>>> factors become things that you should be aware of:
>>>
>>> * In general, the GC requires more memory than you actually use.  I
>>>  think that generally speaking you should plan on it holding twice
>>>  the RAM that you actually need.  (Even though it can be smaller with
>>>  generations.)
>>>
>>> * MzScheme holds strings in UCS-4 format, so each character is 4
>>>  bytes.
>>>
>>> In other words, you might need around 20 GB of RAM just to read it all
>>> in.
>>>
>>> --
>>>          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>>>                  http://www.barzilay.org/                 Maze is Life!
>>> _________________________________________________
>>>  For list-related administrative tasks:
>>>  http://list.cs.brown.edu/mailman/listinfo/plt-scheme
>>
>>

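For readers landing on this thread, here is a minimal sketch of the kind of
helper pair Ethan describes. It is not anyone's actual code from the thread:
the names my-file->string and my-string->file are made up for illustration,
it assumes the file is UTF-8 text (the default decoding for character reads
on a file port), and it does no error handling beyond what the port
functions already raise.

```scheme
#lang scheme/base

;; Hypothetical helper: read the whole file at `path` into one string.
;; Reads in 4096-character chunks so it does not depend on file-size.
(define (my-file->string path)
  (call-with-input-file path
    (lambda (in)
      (let ([out (open-output-string)])
        (let loop ()
          (let ([chunk (read-string 4096 in)])
            (unless (eof-object? chunk)
              (write-string chunk out)
              (loop))))
        (get-output-string out)))))

;; Hypothetical inverse: write `str` to `path`, replacing any old contents.
(define (my-string->file str path)
  (call-with-output-file path
    (lambda (out) (write-string str out))
    #:exists 'truncate))
```

Something like (my-string->file "hello" "/tmp/x.txt") followed by
(my-file->string "/tmp/x.txt") should round-trip the text, under the
assumptions above.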

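The "around 20 GB" figure in Eli's message can be reconstructed from the two
multipliers he lists. The sketch below is only that arithmetic, with one
added assumption that is not stated in the thread: that the files are mostly
one byte per character on disk (ASCII-like email text), so the character
count roughly equals the byte count. The function name is made up.

```scheme
#lang scheme/base

;; Figures taken from the thread.
(define corpus-gb 2.54)     ; uncompressed size on disk, in GB
(define bytes-per-char 4)   ; strings are held as UCS-4
(define gc-factor 2)        ; plan on the GC holding about twice what you use

;; Hypothetical helper: rough RAM needed to hold `on-disk-gb` of text as
;; strings, ASSUMING about one byte per character on disk.
(define (estimated-ram-gb on-disk-gb)
  (* on-disk-gb bytes-per-char gc-factor))

;; Roughly 20 GB for the 2.54 GB collection.
(estimated-ram-gb corpus-gb)
```

Reading the data as byte strings instead of strings would drop the factor of
4, which is part of why the size estimate depends so much on representation.
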
Posted on the users mailing list.