[racket-dev] internal-definition parsing

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Wed Jul 7 12:23:19 EDT 2010

Short version:

I'm planning to change internal-definition expansion (anywhere that
says `body ...' in the Racket documentation) to allow expressions to
mingle with definitions. For example,

  (let ()
    (define (f) x)
    (displayln f)
    (define x 1)
    (list f x))

would be allowed; the `displayln' call would print `#<procedure:f>',
and the result would be a list containing the function `f' and the
number 1.

Some other changes related to internal-definitions have been suggested,
but I don't plan to implement them for now.

========================================

Long version:

Mixing Expressions and Definitions
----------------------------------

Long ago, internal-definition positions in MzScheme allowed multiple
sets of definitions separated by expressions. For example,

 (let ()
   (define x 1)
   (display x)
   (define y 2)
   (display y))

was allowed. In that old mode, the definition of `y' started a new
group of definitions, so the right-hand side of `(define x ...)' could
not refer to `y'. In other words, the above was equivalent to

 (letrec ([x 1])
   (display x)
   (letrec ([y 2])
     (display y)))

I think this behavior mimicked Chez Scheme at the time, but I may be
mistaken. For whatever reason (I don't remember), we changed
internal-definition positions to not allow additional definitions after
an expression. Maybe it was to more closely match R5RS.


Meanwhile, `unit', `module', `class' and other forms evolved to allow
expressions mixed with definitions. Probably as a result, many have
suggested that internal definitions similarly allow expressions mixed
with definitions --- without the old grouping. In that case, the `let'
above would be equivalent to

 (letrec-values ([(x) 1]
                 [() (begin (display x) (values))]
                 [(y) 2])
   (display y))

This change seems like a good idea. Now that I've finally gotten around
to trying it out, I think we should go with it immediately.
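
A rough sketch of that equivalence, assuming a Racket version that
includes this change. The wrapper names `mixed' and `expanded' are made
up for illustration, and `with-output-to-string' (from `racket/port')
is used only to capture what each body prints:

```racket
#lang racket/base
(require racket/port)

;; Definitions and expressions mixed in one internal-definition context.
(define (mixed)
  (with-output-to-string
    (lambda ()
      (let ()
        (define x 1)
        (display x)
        (define y 2)
        (display y)))))

;; The same body written out by hand as a single `letrec-values'.
;; The middle clause binds no variables, so it must return zero values.
(define (expanded)
  (with-output-to-string
    (lambda ()
      (letrec-values ([(x) 1]
                      [() (begin (display x) (values))]
                      [(y) 2])
        (display y)))))
```

Both `(mixed)' and `(expanded)' should produce the string "12", since
the second body is just the first one spelled out as `letrec-values'.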


Should an expression be required at the end? A `module', `unit', or
`class' body can consist of just definitions. Similarly, if an
internal-definition context ends with a definition, we could define the
result to be `(void)', which is what the `block' form does.

I think it's better to require an expression at the end, partly on the
grounds that the internal-definition block is supposed to return a
value (unlike the body of `module', etc.) and partly to avoid making
forms like

 (define (f x)
   x
 (define (g y)
   y))

legal when they are almost certainly mistakes.
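
To make the two options concrete, here is a small sketch. It assumes a
Racket version where the stricter rule (an expression is required at
the end) is in effect; the names `block-result', `ns', and `rejected?'
are illustrative only. `block' comes from `racket/block', and `eval' is
used so that the syntax error can be caught at run time instead of
preventing the enclosing module from compiling:

```racket
#lang racket/base
(require racket/block)

;; `block' accepts a body that ends with a definition; the result is void.
(define block-result
  (block (define x 1)))

;; A namespace for evaluating the quoted form at run time.
(define ns (make-base-namespace))

;; Returns #t if a `let' body that ends with a definition is rejected
;; with a syntax error, #f if it is accepted.
(define (rejected?)
  (with-handlers ([exn:fail:syntax? (lambda (e) #t)])
    (eval '(let () (define x 1)) ns)
    #f))
```

Under that assumption, `block-result' is `(void)' while `(rejected?)'
returns #t.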

This change could be implemented in new `lambda', etc. bindings, but I
think existing forms in `racket' should change. Furthermore, to keep
things simple, the existing `scheme' forms --- most of which are the
same binding as the `racket' forms --- should also change.

The change should not affect the `r5rs' language, and we should change
the `r6rs' language so that it doesn't inherit the more liberal
handling of internal-definition forms.

There are no issues with backward-compatibility for existing `scheme'
and `racket' modules, as far as I can tell. The change would just
accept more modules.


More Internal-Definition Contexts
---------------------------------

Internal definitions could be allowed in more places, such as

 (define f
   (define x 2)
   x)

In principle, where a single expression is expected, definitions could
be allowed to precede the expression.

It's a tempting generalization, but probably too confusing. I think
it's better to use some form that groups definitions with an
expression: `(let () ....)', `(block ....)', or something like that.


An Implicit Internal-Definitions Form
-------------------------------------

Many forms --- including `lambda', `let', and the function-shorthand
variant of `define' --- support internal definitions. The handling of
internal definitions is currently tied to the form.

An alternative would be to have those forms implicitly wrap a group of
internal-definition forms with `#%body', in much the same way that an
application form is converted to a use of `#%app'. Then, the treatment
of internal definitions could be independently configured through a
module import. For example, one module might use a `#%body' that
corresponds to the old handling of internal definitions, while another
module could import a `#%body' that corresponds to the new handling.
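
For comparison, the existing `#%app' mechanism can be sketched as
follows. This only illustrates how an implicit form picks up its
binding from the enclosing lexical context; `#%body' itself is
hypothetical and is not shown. The names `call-count' and `result' are
made up, and the local `#%app' simply counts applications before
deferring to `#%plain-app':

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Number of applications that went through the local `#%app'.
(define call-count 0)

(define result
  ;; Bind `#%app' only for the body of this `let-syntax'. Every
  ;; parenthesized application written in the body implicitly uses it.
  (let-syntax ([#%app (lambda (stx)
                        (syntax-case stx ()
                          [(_ f arg ...)
                           #'(begin
                               (set! call-count (add1 call-count))
                               (#%plain-app f arg ...))]))])
    (+ 1 (* 2 3))))
```

Only the two applications written inside the `let-syntax' body are
counted, so `result' is 7 and `call-count' ends up as 2. The
`(add1 call-count)' in the template takes its `#%app' from where the
template was written, not from where the macro is used.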

Setting aside the backward-compatibility issues of adding an implicit
form, I think it turns out not to work as well as `#%app'. The problem
is that there's not a good place to get the lexical context to use for
the implicit `#%body'. We have many macros analogous to

 (define-syntax-rule (squawk body ...)
   (begin
     (displayln "squawk!")
     (let () body ...)))

If you take the lexical context from the parentheses that surround the
`let' form, then the `#%body' is drawn from the context of the `squawk'
definition, while the intent was more likely to get it from the use of
`squawk'. Meanwhile, the `body ...' sequence itself doesn't necessarily
have a lexical context (since it doesn't have a parenthesis, roughly),
and the sequence is recreated by the macro implementation anyway.

This problem happens occasionally with `#%app'. When a use of an
identifier macro expands in an application position, for example, the
implementor of the identifier macro usually should copy the lexical
context of the old application form to the expansion result. Such cases
are more rare than examples like `squawk', though.

So, an implicit `#%body' form seems like a good idea in principle, but
it doesn't seem to work out well with our current syntax. I'm currently
inclined to not change Racket and to treat this as something to support
better the next time we define a core syntax, but I'm interested in
further discussion.



Posted on the dev mailing list.