[racket] Need some help for my first real experiment with scheme

From: Danny Yoo (dyoo at cs.wpi.edu)
Date: Tue Apr 24 13:29:19 EDT 2012

On Mon, Apr 23, 2012 at 10:18 AM, Pedro <pedro100 at gmail.com> wrote:
> Ok, thank you all  for the input, however I'm still missing an important detail.
> So I build a suffix tree, but how exactly do I refer to the target documents?
> Should I tie a reference to each document in which the string occurs
> to each node? I can't think of other way to do it.

One way to do this is keep a reference to the node at the end of the string.

#lang racket
(require (planet dyoo/suffixtree))

;; String reference
(struct sref (str))

;; string->label*: string -> label
;; Creates a label out of a string, but with a reference
;; to the string at the end.
(define (string->label* s)
   (list->vector (append (string->list s) (list (sref s))))))

Once we have these special characters at the end, we have a way to
cull  them when we search through a tree.  The leaves of our tree will
contain those special sref values, and we can then use them to recover
the matched strings:

;; leaves: node -> (listof node)
;; Get the leaves of the tree rooted at node.
(define (leaves a-node)
  (let loop ([a-node a-node]
             [acc '()])
    (define children (node-children a-node))
    (cond [(eq? children '())
           (cons a-node acc)]
           (foldl loop acc children)])))

;; leaf-str: node -> string
;; Given a leaf node, get back the string stored in the
;; terminating sref element.
(define (leaf-str a-leaf)
  (define a-label (node-up-label a-leaf))
  (sref-str (label-ref a-label (sub1 (label-length a-label)))))

When we walk the tree, if we find a matching node, we then can walk
the children to the leaves.

;; find-matches: tree string -> (listof string)
(define (find-matches a-tree str)
  (define (on-success node up-label-offset)
    (map leaf-str (leaves node)))

  (define (on-fail node up-label-offset input-label-offset)

  (tree-walk a-tree
             (string->label str)

For example:

(define a-tree (make-tree))
(tree-add! a-tree (string->label* "peter"))
(tree-add! a-tree (string->label* "piper"))
(tree-add! a-tree (string->label* "pepper"))

(find-matches a-tree "er")
(find-matches a-tree "per")

Note that find-matches isn't checking for duplicates, so if we search
for the single string "e", we'll see several matches because there are
several suffixes of the keywords that contain "e".

Posted on the users mailing list.