[racket] Big dictionaries

From: Jos Koot (jos.koot at telefonica.net)
Date: Wed Nov 3 07:53:47 EDT 2010

For very big serializable dictionaries with small keys but memory consuming
values, one might think of a dictionary that holds its values in a direct
accessible file on disk. Within RAM, the dictionary can map the keys to
their values by means of intermediate file-pointers. Does such type of
dictionary already exists? If not and some people are interested, I am very
well willing and prepared to make such a dictionary.
 
I am thinking of the following:
 
Call the dictionary a "da" and its file a "da-file".
 
Opening a da-file opens it for both input and output and reads the mutable
top-da containing an index that maps keys onto file-positions.
 
Closing a da-file stores the mutated top-da, then stores the pointer to the
new top-da in a manner as atomic as possible.
 
A da-file may contain sub-das.
 
When a program/system crashes without closing a da-file, the logical content
of the file will be consistent. In this case new or updated data may be
physically present, but will not exists in logical sense. Opening the
da-file after a crash shows the same dictionary as when last opened. When a
program/system crashes while storing the pointer to the top-da, we may have
a problem. However, it will be possible to reconstruct the most recent
consistent da-file.
 
The dictionary will contain history info such as to allow going back to a
previous state of the da-file. This can be acheived by storing all new data
at the end of the file and storing the pointer to the current top-da in the
first block of the file. Each da will have a pointer to its predecessor.
 
A cleanup-procedure will be present such as to prepare a new da-file whose
physical content corresponds to the logical content according to its current
top-da.
 
Procedures:
 
open-da-file -> da
close-da-file -> da
make-da -> da (for new sub-das)
da-ref
da-set!
da-remove!
da-revert! -> da (go back in history for top or sub-da)
copy-da-file (cleaning up)
da-repair (repairs da-file which is inconsistent because of a failure while
storing the pointer to the top-da while closing the da-file)
 
May be closing das and da-files can be made optional by means of a guard,
but I do not yet fully understand how to do this.
 
Jos
 
 
 
 
 
 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/users/archive/attachments/20101103/82056fc3/attachment.html>

Posted on the users mailing list.