[plt-scheme] How to handle large data

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Thu Dec 18 12:11:56 EST 2008

Kenichi, good to see you're teaching your students stress-testing. Eli
gave you exactly the right answer, so let's spell it out.

While I personally keep data in this form for scripts (checks,
stocks, grades, ...) and edit it in Emacs when needed, I do not
recommend teaching it to freshmen as the primary mechanism.
Just as I hate it when they come out of a freshman course thinking
"data is text" (think Perl/Python), I would not want students to come
out of a Scheme-using course thinking you create data with (define
data ...), as convenient as this is for scripting.

Instead, I recommend you show them an S-expression formatted data
file, like Eli did, and use a two-line teachpack. Alternatively,
use a comma-separated style (sign up with plt-edu!) and Shriram's
teachpack (which just uses csv.plt from PLaneT) to read in
'spreadsheet style' data. A third alternative is to use XML and
read-xml, then explain that XML is newfangled S-expression
syntax because people can't cope with "(" and ")". This way students
learn something general that they can take away for other languages.
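
To make the first option concrete, here is a minimal sketch of what such
a teachpack could look like. It is not Eli's code. The name
read-postal-data and the file layout (one big S-expression holding
(code address) pairs, with no define and no make-postal) are assumptions
for illustration, and the #lang line assumes a current Racket; in the
PLT Scheme of this thread it would be #lang scheme.

```racket
#lang racket/base
;; Hypothetical teachpack sketch: hand students the data as a value
;; read from a file, instead of evaluating one giant (define ...) form.
(provide read-postal-data)

;; read-postal-data : path-string -> list
;; Assumes the file contains a single S-expression such as
;;   ((1706090 "address") (708071 "address") ...)
;; and returns it as an ordinary list of two-element lists.
(define (read-postal-data file)
  (with-input-from-file file read))
```

The file is only read, never evaluated, so it stays plain data. A student
program can then call (read-postal-data "postal-data.ss"), where the file
name is again made up, and build its own structures from the result.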

-- Matthias

On Dec 18, 2008, at 3:34 AM, Kenichi Asai wrote:

> Hello everyone,
>
> I want to load a large file into DrScheme interactive environment.  To
> be more specific, I want to use the following file containing all the
> postal data in Japan temporarily available at:
>
> http://pllab.is.ocha.ac.jp/~asai/postal.ss
>
> The file looks like:
>
> (define-struct postal (code address))
>
> (define postal-list (list
> (make-postal 1706090 "address in Japanese")
> (make-postal 0708071 "address in Japanese")
> ...
> ))
>
> but the list is very long, about 120000 lines.
>
> I can open this file, but when I try to execute it, DrScheme says "out
> of memory" and when I switch off the memory limit, it runs forever.
>
> Is there any good way to load this file?
>
> I am going to teach the basic concepts in programming to high school
> students the day after tomorrow, and want to use PLT Scheme GUI for it.
> I plan to apply the linear search to the above postal data.
>
> I tried (load "postal.ss"), but DrScheme says:
>
>   reference to undefined identifier: load
>
> Is this because I am using the language "beginning student"?
> (I want to stay at "beginning student" if possible.)
>
> Thank you in advance.
>
> Sincerely,
>
> -- 
> Kenichi Asai



Posted on the users mailing list.