[plt-scheme] Runtime accounting for unbound identifiers

From: Ryan Culpepper (ryanc at ccs.neu.edu)
Date: Wed Feb 10 00:20:43 EST 2010

Synx wrote:
> I'm in a situation where I want to evaluate code such as '(foo (bar
> baz)) but foo, bar and baz may not be defined in the context of
> evaluation. I will have a procedure somewhat like (find-identifier name)
> that will (slowly) find an identifier by its name, whether foo, bar,
> baz, or any of a hundred other identifiers I might have stored away
> somewhere. But I'm not sure how to use find-identifier to make sure all
> the identifiers specified in an arbitrary s-expression like '(foo (bar
> baz)) are present and accounted for.
> I don't want to find and re-evaluate all pieces of code in my database
> every time I evaluate any s-expression. What I want is to get a list of
> identifiers that need to be found, search for them, and only error out
> if none of those identifiers are found anywhere. That's what I think I
> absolutely need, is some way of analyzing code that will let me specify
> which identifiers it depends on, that aren't satisfied by some global
> already defined namespace.
> It might also be nice to record where these runtime identifiers were and
> associate that with some compiled representation of aforementioned code,
> so that the next time it's evaluated my program won't have to search for
> the identifiers by name, recursively.

Unbound variables are handled by the #%top macro. Here's a tiny program 
that redirects all unbound variables to the lookup-symbol function:

   #lang scheme/base
   (provide #%top)

   ;; Shadows #%top from scheme/base
   (define-syntax-rule (#%top . x)
     (lookup-symbol 'x))

   ;; lookup-symbol : symbol -> any
   ;; Stub: only reports the lookup. A real version would return the value.
   (define (lookup-symbol sym)
     (printf "getting the value of ~s\n" sym))

   (+ x y)

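Running it prints the following, after which + raises an error because
lookup-symbol as written returns void rather than a number: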

   getting the value of x
   getting the value of y

To get the list of variables you could expand the code and look for 
applications of the lookup-symbol procedure, but traversing expanded 
code is unpleasant.
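
For what it's worth, a rough sketch of that first approach might look 
like the following. It walks the datum form of the expanded code (as 
produced by syntax->datum) and compares plain symbols rather than 
bindings, so it is only an approximation. find-lookups is a hypothetical 
helper name, and scheme/match is assumed to be available.

```racket
#lang scheme/base
(require scheme/match)

;; find-lookups : datum -> (listof symbol)
;; Collects every symbol s that occurs as (#%app lookup-symbol (quote s)).
;; Hypothetical helper: it compares symbols, not bindings, and it would
;; also match inside quoted data, so treat the result as approximate.
(define (find-lookups d)
  (match d
    [(list '#%app 'lookup-symbol (list 'quote (? symbol? s))) (list s)]
    [(cons a b) (append (find-lookups a) (find-lookups b))]
    [_ null]))

;; Example on a hand-written datum shaped like the expansion of (+ x y):
(find-lookups '(#%app + (#%app lookup-symbol 'x) (#%app lookup-symbol 'y)))
```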

Another way is to have your new #%top macro record variables as it 
expands them, and then attach the recorded list to the expanded code as 
a syntax property. Here's an updated version of the program:

   #lang scheme/base
   (require (for-syntax scheme/base))

   ;; collected-vars : (parameterof (listof symbol))
   (define-for-syntax collected-vars (make-parameter null))

   ;; (with-collector expr)
   ;; behaves like 'expr', but the unbound variables within 'expr'
   ;; are attached to the expansion using the 'variables property.
   (define-syntax (with-collector stx)
     (syntax-case stx ()
       [(_ e)
        (parameterize ((collected-vars null))
          (let ([ee (local-expand #'e (syntax-local-context) null)])
            (syntax-property ee 'variables (collected-vars))))]))

   ;; (#%top . identifier)
   ;; Record the variable name and defer to lookup-symbol.
   (define-syntax (#%top stx)
     (syntax-case stx ()
       [(_ . x)
        (begin (collected-vars
                (cons (syntax->datum #'x) (collected-vars)))
               #'(lookup-symbol 'x))]))

   ;; lookup-symbol : symbol -> any
   (define (lookup-symbol sym)
     (printf "getting the value of ~s\n" sym))

   ;; expand/vars : syntax -> (values syntax (listof symbol))
   ;; Expands the term, returning both the expanded syntax
   ;; and the list of unbound variables it refers to.
   (define (expand/vars stx)
     (let ([ee (expand #`(with-collector #,stx))])
       (values ee (syntax-property ee 'variables))))

If you just eval an expression like (+ x y), you get the same behavior 
as before. But you can also use expand/vars:

   > (define-values (expanded vars) (expand/vars #'(+ x y)))
   > expanded
   #<syntax:7:48 (#%app + (#%app lookup-symbol...>
   > vars
   (y x)
   > (eval expanded)
   getting the value of x
   getting the value of y
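
With the list of variables in hand, you can check that each one is 
available before calling eval. Here is a minimal sketch, assuming a hash 
table stands in for your database. The names known and missing-vars are 
hypothetical and not part of the program above.

```racket
#lang scheme/base

;; known : hash table mapping symbol -> value.
;; Hypothetical stand-in for whatever store find-identifier searches.
(define known (make-hasheq))
(hash-set! known 'x 1)

;; missing-vars : (listof symbol) -> (listof symbol)
;; Returns the variables that the store cannot supply.
(define (missing-vars vars)
  (filter (lambda (v) (not (hash-has-key? known v))) vars))

;; Example: with only x stored, y is reported as missing.
(missing-vars '(y x))
```

If missing-vars returns a non-empty list, you can report those names 
yourself instead of letting the lookup fail partway through evaluation.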

(You need to call expand/vars either in the interactions window or in a 
namespace that you've set up to have the right binding for #%top.)
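
One possible way to set up such a namespace is a namespace anchor, 
sketched below under that assumption. To stay short, it uses the simpler 
define-syntax-rule version of #%top from the first program and shows 
only eval of a quoted s-expression. The names table and eval/lookup are 
hypothetical.

```racket
#lang scheme/base

;; Hypothetical table standing in for the real (slow) lookup.
(define table (make-hasheq))
(hash-set! table 'x 1)
(hash-set! table 'y 2)

;; lookup-symbol : symbol -> any
(define (lookup-symbol sym)
  (hash-ref table sym))

;; Shadows #%top from scheme/base, as in the first program.
(define-syntax-rule (#%top . x)
  (lookup-symbol 'x))

;; The anchor gives run-time access to this module's bindings,
;; including the #%top defined above.
(define-namespace-anchor anchor)
(define ns (namespace-anchor->namespace anchor))

;; eval/lookup : s-expression -> any
;; Evaluates sexp so that unbound variables go through lookup-symbol.
(define (eval/lookup sexp)
  (parameterize ([current-namespace ns])
    (eval sexp)))

(eval/lookup '(+ x y))
```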

I hope that gives you some ideas to start with.


Posted on the users mailing list.