[racket-dev] Uninterned symbols in compiled code

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Fri Jul 6 14:59:34 EDT 2012

I'll push improvements to the cross-reference and explanation in the docs.

As I try to make an example illustrating problems, I see that Racket is
more resistant to problems created by `gensym' than I expected. In
principle, the following is a good example:

 a.rkt:
 ------
 #lang racket
 (define-syntax (define-and-provide stx)
   (syntax-case stx ()
     [(_ id)
      (with-syntax ([ga (gensym 'a)])
        #'(begin
            (define ga (random))
            (define-syntax-rule (id) ga)
            (provide id)))]))
 (define-and-provide a)

 b.rkt:
 ------
 #lang racket/base
 (require "a.rkt")
 (a)

After `raco make b.rkt', the compiled form of "b.rkt" will refer to a
variable whose name is a gensym, and it will be a different gensym than
the one in "a.rkt" when the latter's code is loaded. The compiled code,
however, tracks both the name and the position (relative to other
exports) of each variable reference, and the linker is satisfied as
long as the printed forms of the names match at the expected export
position. That leniency is a leftover from the days when
`generate-temporaries' was implemented with `gensym'.
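
To make the distinction between "same printed form" and "same symbol"
concrete, here is a minimal sketch at the level of plain symbols. It
does not exercise the linker at all; the names `s1', `s2' and the
string "a1" are made up for illustration.

```racket
#lang racket/base
;; Two uninterned symbols built from the same string print alike
;; but are distinct objects, which is the situation the linker
;; tolerates by comparing printed names at an export position.
(define s1 (string->uninterned-symbol "a1"))
(define s2 (string->uninterned-symbol "a1"))

;; Printed forms agree.
(define same-print?
  (string=? (symbol->string s1) (symbol->string s2)))

;; Identity does not: each call allocates a fresh symbol.
(define same-symbol? (eq? s1 s2))

;; Interned symbols with the same name are always `eq?'.
(define interned-same? (eq? (string->symbol "a1") 'a1))

(list same-print? same-symbol? interned-same?
      (symbol-interned? s1))
```

The same holds for symbols made by `gensym', except that their printed
names normally differ from one call to the next.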

If you throw away the bytecode for "a.rkt" and try to load "b.rkt",
then you'll most likely get an error: "a.rkt" is expanded again from
source, and the fresh gensym will most likely print differently from
the name recorded in the compiled "b.rkt". That error won't happen if
you use `generate-temporaries'. In other words, the real difference has
to do with determinism of expansion (as long as everything else in
expansion is deterministic).
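
For comparison, a sketch of how "a.rkt" might look with
`generate-temporaries' in place of `gensym'. Only the `with-syntax'
clause differs from the example above; this is an illustrative
variant, not code taken from the thread.

```racket
#lang racket
;; Variant of "a.rkt" using `generate-temporaries'.
;; The temporary is built from an interned symbol, so its name
;; depends only on how expansion proceeds, not on a gensym.
(define-syntax (define-and-provide stx)
  (syntax-case stx ()
    [(_ id)
     ;; `generate-temporaries' returns a list of identifiers,
     ;; hence the one-element pattern `(ga)'.
     (with-syntax ([(ga) (generate-temporaries '(a))])
       #'(begin
           (define ga (random))
           (define-syntax-rule (id) ga)
           (provide id)))]))
(define-and-provide a)
```

"b.rkt" stays as it is. The expectation, following the explanation
above, is that its compiled form keeps working even after the
bytecode for "a.rkt" is discarded and regenerated.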

At Fri, 6 Jul 2012 13:12:04 -0400, Carl Eastlund wrote:
> The documentation for generate-temporaries[1] ends with, "The
> generated identifiers are built with interned symbols (not gensyms),
> so the limitations described with current-compile do not apply."
> However, I cannot find any limitations described in the documentation
> for current-compile[2].  If I follow a link there to documentation on
> printing compiled code[3], I find the following paragraph:
> 
> "A compiled-form object may contain uninterned symbols (see Symbols)
> that were created by gensym or string->uninterned-symbol. When the
> compiled object is read via #~, each uninterned symbol in the original
> form is mapped to a new uninterned symbol, where multiple instances of
> a single symbol are consistently mapped to the same new symbol. The
> original and new symbols have the same printed representation.
> Unreadable symbols, which are typically generated indirectly during
> expansion and compilation, are saved and restored consistently through
> #~."
> 
> I'm unsure whether this corresponds to the limitations that
> generate-temporaries refers to.  This certainly looks to me like it
> does "the right thing" with uninterned symbols in compiled code, but
> I'm probably ignoring an important subtlety.  Can someone please
> clarify this issue, here and/or in the documentation?  An example of
> the kind of problem that arises when using gensym in compiled code
> would be wonderful.
> 
> Carl Eastlund
> 
> [1] 
> http://docs.racket-lang.org/reference/stxops.html#%28def._%28%28lib._racket/private/stxcase-scheme..rkt%29._generate-temporaries%29%29
> 
> [2] 
> http://docs.racket-lang.org/reference/eval.html#%28def._%28%28quote._~23~25kernel%29._current-compile%29%29
> 
> [3] 
> http://docs.racket-lang.org/reference/printing.html#%28part._print-compiled%29
> _________________________
>   Racket Developers list:
>   http://lists.racket-lang.org/dev

Posted on the dev mailing list.