[plt-scheme] How to handle large data

From: Eli Barzilay (eli at barzilay.org)
Date: Thu Dec 18 04:29:25 EST 2008

On Dec 18, Noel Welsh wrote:
> On Thu, Dec 18, 2008 at 8:34 AM, Kenichi Asai <asai at is.ocha.ac.jp> wrote:
> > I can open this file, but when I try to execute it, DrScheme says "out
> > of memory" and when I switch off the memory limit, it runs forever.
> >
> > Is there any good way to load this file?
> 
> In MzScheme (NOT DrScheme) it loads without problem:
> 
> Welcome to MzScheme v4.1.1.3 [3m], Copyright (c) 2004-2008 PLT Scheme Inc.
> > (load "postal.ss")
> > (length postal-list)
> 121793
> 
> DrScheme tracks locations and does other stuff that improves
> debugging but decreases performance.  I believe you can turn some of
> this stuff off (and turn on the JIT).  Perhaps that will help.

DrScheme's overhead will likely be much smaller if you turn the whole
list into one big piece of quoted data:

  (define-struct postal (code address))
  (define postal-list-data
    '((1706090 "address in Japanese")
      (0708071 "address in Japanese")
      ...))

  (define (postal-data->postal entry)
    (make-postal (first entry) (second entry)))
  (define postal-list
    (map postal-data->postal postal-list-data))

You can eliminate the overhead completely by putting the data in a
file:

  ((1706090 "address in Japanese")
   (0708071 "address in Japanese")
   ...)

and then load it with:

  (define postal-list
    (map postal-data->postal
         (call-with-input-file "postal-data" read)))

But this requires functions (call-with-input-file and read) that are
probably not available in the teaching languages.  The obvious solution
is to make it into a teachpack, which in this case will be a simple
module:

  #lang scheme
  (provide read-file)
  (define (read-file file)
    (call-with-input-file file read))
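
To see how the pieces fit together, here is a rough end-to-end sketch,
not a definitive version.  It assumes the full `scheme' language rather
than a teaching language, and the two sample entries and the temporary
file are made up for illustration; the real data file would hold the
whole list.

```scheme
#lang scheme
;; Sketch only: the sample entries and the temporary file below are
;; hypothetical stand-ins for the real postal data file.
(define-struct postal (code address))

;; Convert one (code address) entry into a postal struct.
(define (postal-data->postal entry)
  (make-postal (first entry) (second entry)))

;; Same helper as in the teachpack: `read' returns the first datum in
;; the file, which here is the entire list of entries.
(define (read-file file)
  (call-with-input-file file read))

;; Write a tiny data file so that the example is self-contained.
(define sample-file (make-temporary-file))
(call-with-output-file sample-file #:exists 'truncate
  (lambda (out)
    (write '((1706090 "address A") (1000001 "address B")) out)))

;; No quoted data appears in the program text; it is read at run time.
(define postal-list
  (map postal-data->postal (read-file sample-file)))

(length postal-list)               ; number of entries read
(postal-code (first postal-list))  ; code of the first entry
```

One thing to keep in mind with either approach: the reader treats a
code written with a leading zero, such as 0708071, as the number
708071.  If the leading zero matters, writing the codes as strings is
one option.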

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                  http://www.barzilay.org/                 Maze is Life!


Posted on the users mailing list.