[racket] #<undefined> and backward compatibility

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Fri Apr 18 09:54:35 EDT 2014

This message is about an experiment that would improve Racket but
introduce a backward incompatibility. We'd like more information about
how the change affects your code (see questions at the end).

Undefined
---------

If you've programmed in Racket enough, and especially if you've ever
converted a `let*` into a sequence of internal `define`s, then you've
encountered an error along the lines of

  car: contract violation
    expected: pair?
    given: #<undefined>

You probably said to yourself, "I didn't expect the #<undefined>
value."

No one expects the #<undefined> value!
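
As a minimal sketch of how the `let*` conversion goes wrong (the
function names are made up for illustration): in the `let*` version,
the `lst` on the right-hand side refers to the argument, but an
internal `define` binds `lst` for the whole body, so the right-hand
side sees the new, not-yet-initialized `lst`.

  ;; Works: the inner `lst` is initialized from the argument `lst`.
  (define (first-of-first lst)
    (let* ([lst (car lst)])
      (car lst)))

  ;; Mechanical conversion: `(car lst)` now refers to the internal
  ;; `lst`, which has no value yet when the expression is evaluated.
  (define (first-of-first/broken lst)
    (define lst (car lst))
    (car lst))

Calling `first-of-first/broken` is what has historically produced the
`car` contract violation shown above.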

Sometimes, it's much worse than a late contract failure. An unexpected
#<undefined> caused a race condition in `delay/thread` that, in turn,
crashed Typed Racket tests a while back. It took a long time to track
down the problem. More recently, Jay tracked down a bug where a hash
table was populated using references to not-yet-initialized variables
as keys, so that #<undefined> was used as the key in all cases, and
later hash-table lookups produced the wrong result. The consequence was
footer links that mysteriously went to the wrong web page.
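
A rough sketch of how that kind of hash-table bug plays out. This is a
simulation, not Jay's code: it uses the `undefined` value from the
`racket/undefined` library described later in this message as a
stand-in for the early variable references, and the names and paths
are invented.

  #lang racket/base
  (require racket/undefined)

  (define links (make-hash))

  ;; Stand-ins for two variables that were read before initialization:
  (define home-key undefined)
  (define about-key undefined)

  (hash-set! links home-key "/home.html")
  (hash-set! links about-key "/about.html") ; same key: overwrites

  (hash-ref links home-key) ; => "/about.html"

Every entry collapses onto the single #<undefined> key, so a lookup
returns whichever value was stored last.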

What To Do About It
-------------------

Since no one expects the #<undefined> value, it would be better if an
expression that produces #<undefined> raised an exception, instead.

In the development repo this week, we're experimenting with that change
--- where "we" is mostly Claire Alvis, who did the hard work of
adjusting the Racket compiler to determine where use-before-definition
checks are needed.

Here's the canonical example:

 Welcome to Racket v6.0.1.4.
 > (letrec ([x x]) x)
 x: undefined;
  cannot use before initialization

Similarly, it's an error to access a field of an object before the
field's definition:

 Welcome to Racket v6.0.1.4.
 > (new (class object% (define x x)))
 x: undefined;
  cannot use field before initialization
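
In practice, the early field reference tends to be indirect. The
following is a sketch with a made-up class, assuming a method that is
called during initialization and reads a field that is defined later:

  #lang racket/base
  (require racket/class)

  (define greeter%
    (class object%
      (super-new)
      (define/public (describe)
        (string-append "hello, " name))
      ;; `describe` runs here, before `name` has been initialized.
      (define banner (describe))
      (define name "world")))

  (new greeter%)

With the experimental change, instantiating `greeter%` should raise
the field error for `name`. Before the change, `name` would instead
have been read as #<undefined> and passed along to `string-append`.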

Note that it has always been an error to reference a module or
top-level variable before its definition. That is,

 #lang racket/base
 (define x x)

has always been an error, and the experimental change makes local
bindings more like module-level bindings.

Backward-Compatibility
----------------------

The change is an obvious improvement to the language, but if we decide
to stick with it, it's also a backward-incompatible change:

 * Some programs use `(letrec ([x x]) x)` specifically to get
   #<undefined>, either to use as an initial value or to detect
   undefined values. That code will fail, since `(letrec ([x x]) x)`
   changes to raise an exception.

   This problem usually happens with a language implementation or an
   especially complex syntactic form.

   For now, we've added a `racket/undefined` library to provide an
   `undefined` binding. The idea is to include it in a release both
   before and after the change, so that the transition path is a little
   smoother. That is, a library can use `(require racket/undefined)`
   instead of `(define undefined (letrec ([x x]) x))`.

 * Sometimes, a class references a field `x` before `x` is defined. The
   field might be protected by a guard like `(string? x)` or `(is-a? x
   color%)`. This usually happens when a method is called during object
   initialization and it references a field that will be initialized
   later, possibly after `super-new`.

   We ran into use-before-definition for fields several times in the
   implementation of `racket/gui` platform-specific back-ends and in
   classes that are specific to DrRacket's implementation. In most
   cases, guards seem to prevent buggy behavior, but probably not in at
   least one case. In many cases, it looks like the guard just happens
   to work, as opposed to being designed to handle #<undefined>.

   The pattern of overrides and callbacks in the GUI back-end and
   DrRacket classes is relatively complex, and so we're not sure how
   common this problem will be in other code. We were happy to fix
   these cases in our code.

 * In a couple of cases, a variable was referenced too early, but the
   code was either dead or insufficiently tested, so the problem wasn't
   detected before.

   We were happy to discover the error and fix the code, although the
   code "worked" before.

That's the chief two... er... three kinds of incompatibility. Also, at
least one of my slide decks would become outdated, along with section
9.3 of PLAI2e.
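
For the first kind of incompatibility, here is a minimal sketch of
code that gets its marker value from `racket/undefined` instead of
from `(letrec ([x x]) x)`. The `cache` variable and the `get-cache`
function are hypothetical; only the `undefined` binding comes from
the library.

  #lang racket/base
  (require racket/undefined)

  ;; `undefined` is an ordinary value here, used as a "not yet set"
  ;; marker.
  (define cache undefined)

  (define (get-cache)
    (when (eq? cache undefined)
      (set! cache (make-hash)))
    cache)

Written this way, the code does not depend on whether
`(letrec ([x x]) x)` returns #<undefined> or raises an exception.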

Should We Do This?
------------------

Based on our experiment so far, it looks like the benefits probably
outweigh the drawbacks, even though we're normally reluctant to
introduce backward incompatibilities. (We would not adopt this change
for the upcoming v6.0.1 release, but maybe in the one after that.)

To make a decision, we need more input:

 * Does the change affect your programs?

   You can try a development snapshot from either of the sites listed
   here:

       http://pre.racket-lang.org/

 * Is this kind of backward incompatibility ok?

   We'll base a decision on how the experiment turns out, but
   especially if the experiment goes well, a clear mandate from the
   Racket community would seal the deal.

Thanks!


Posted on the users mailing list.