[plt-scheme] Compression dictionary

From: Eli Barzilay (eli at barzilay.org)
Date: Mon Oct 5 12:24:17 EDT 2009

Does anyone know of a good and simple method for building a dictionary
for compression?


[

Explanation: The documentation index file was ~2MB initially, and now
it's up to 3MB.  In addition, some things I did for compression make
loading it slower (like nested arrays which are used like Sexprs), so
I'm revising the whole thing.

To get better compression, I started hand-crafting regexps, but you
can imagine the kind of mess that this leads to... so I'm looking for
a more proper compression method.  To that end, I even found a very
short Javascript LZW encoder and decoder.  The problem is that LZW
works better with a lot of text, and I can't afford having the whole
thing as one giant string (since decoding it would take even longer).

Instead, I'm looking for a way to find a good set of common substrings
from all index entries, then go over the data and replace these
strings with pointers into an array holding the dictionary.  I'll be
happy to hear about better methods too.

]
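
To make the idea above concrete, here is a rough sketch in Python of
one greedy way to do it.  This is only an illustration under my own
assumptions, not a known-good method: the names build_dictionary,
replace_word and decode, the length limits, and the "one unit per
pointer" cost estimate are all made up for the example.  Each entry
stays a separate small list of pieces, so a single entry can be decoded
without touching the rest.

```python
from collections import Counter


def replace_word(pieces, word, index):
    """Replace every occurrence of `word` in the literal pieces with `index`."""
    out = []
    for piece in pieces:
        if isinstance(piece, str):
            for k, part in enumerate(piece.split(word)):
                if k > 0:
                    out.append(index)  # pointer into the dictionary
                if part:
                    out.append(part)   # remaining literal text
        else:
            out.append(piece)          # already a pointer, keep as-is
    return out


def build_dictionary(entries, max_words=16, min_len=3, max_len=12):
    """Greedily pick substrings with the largest estimated saving.

    Assumption: a pointer costs about one character, and each chosen
    substring is stored once in the dictionary.
    """
    # Each entry becomes a list of pieces: str (literal) or int (pointer).
    encoded = [[entry] for entry in entries]
    dictionary = []
    for _ in range(max_words):
        counts = Counter()
        for pieces in encoded:
            for piece in pieces:
                if not isinstance(piece, str):
                    continue
                for n in range(min_len, min(max_len, len(piece)) + 1):
                    for i in range(len(piece) - n + 1):
                        counts[piece[i:i + n]] += 1
        best, best_gain = None, 0
        for word, count in counts.items():
            # Overlapping occurrences are counted too, so this is an estimate.
            gain = count * (len(word) - 1) - len(word)
            if gain > best_gain:
                best, best_gain = word, gain
        if best is None:
            break  # nothing left that is estimated to save space
        dictionary.append(best)
        encoded = [replace_word(p, best, len(dictionary) - 1) for p in encoded]
    return dictionary, encoded


def decode(pieces, dictionary):
    """Rebuild one entry from its pieces."""
    return "".join(p if isinstance(p, str) else dictionary[p] for p in pieces)
```

Calling build_dictionary on a list of entry names returns the
dictionary array plus one list of pieces per entry, and decode turns a
single list of pieces back into its string.  Recounting all substrings
on every round is slow for a multi-megabyte index, so a real version
would probably want a smarter counting scheme.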

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!


Posted on the users mailing list.