[plt-scheme] Fast Byte Vector Traversal (200MB+ file)

From: Robert Winslow (robert.winslow at gmail.com)
Date: Tue Feb 10 06:46:11 EST 2009

I am loving PLT Scheme - its consistent and elegant specification is a
relief after trudging through the behemoth that is Common Lisp. But
speed is a problem, especially memory traversal. I am writing software
that loops many times over a file's contents in memory. The file will
not change over the lifetime of the program.

I have considered a few solutions to get C-like speed in this project,
while using as much PLT Scheme as possible:
1) Write a simple TCP/IP server to do the iterating over the array of
unsigned chars. MzScheme would send requests to this daemon program
and receive its results.
2) Do something with C inside of MzScheme, possibly using c-lambda.
How fast can this be?
3) Write a small C library, and use FFI. Will contracts slow it down?
4) Use u8vectors/cvectors/malloc/sequence generators in a way that I
haven't anticipated to get speedy results using only Scheme code.
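
To make option 3 concrete, here is a minimal sketch of what the C side
might look like. The file name bytescan.c and the functions count_byte
and sum_bytes are hypothetical, chosen only for illustration; the real
passes would collect whatever information each traversal needs.

```c
/* bytescan.c: hypothetical helper library; all names are illustrative. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count how many bytes in buf[0..len) are equal to target. */
size_t count_byte(const unsigned char *buf, size_t len, unsigned char target)
{
    size_t n = 0;
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == target)
            n++;
    }
    return n;
}

/* Sum every byte in buf[0..len). The 64-bit accumulator cannot
   overflow on a buffer of a few hundred megabytes. */
uint64_t sum_bytes(const unsigned char *buf, size_t len)
{
    uint64_t total = 0;
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}
```

Assuming a Unix-like toolchain, this could be built with something like
`cc -O2 -shared -fPIC -o libbytescan.so bytescan.c` and then called
through the FFI once per pass, so that the per-call overhead is paid
once per traversal rather than once per byte.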

I'd really rather just use a for/fold loop in Scheme to iterate over
the array, collecting different types of information with each pass.
In my tests, though, the speedup of using C over Scheme is at least
10x, if not 100x.

How might I best approach this problem?


Posted on the users mailing list.