[plt-dev] some Racket proposals & implementation

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Fri Apr 2 17:30:16 EDT 2010

Version 4.2.5.5 in the SVN trunk includes experimental features to
support the following proposed Racket features. You can try the
proposals with `#lang racket' in MzScheme.

Structure Constructor Names
---------------------------

Proposal: The default constructor name bound by `define-struct' in
Racket should be the same as the type name, instead of having a `make-'
prefix.

Example:

     > (define-struct a (x y))
     > a
     #<procedure:a>
     > (a 1 2)
     #<a>

To help support this potential feature, the `define-struct' form of
`scheme/base' now accepts a `#:constructor-name' argument to give the
constructor a name other than the one prefixed with `make-'. In
particular, the constructor name can be the same as the type name:

     > (define-struct a (x y))
     > make-a
     #<procedure:make-a>
     > (define-struct a (x y) #:constructor-name a)
     > a
     #<procedure:a>
     > (a 1 2)
     #<a>

A natural (and generally backward-compatible) change to `match'
would be to treat structure-type names as pattern constructors, so that

     (match (a 1 2)
       [(a x y) x])

would produce 1.
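
As a rough sketch of that `match' behavior, assuming a Racket in which
the constructor is bound to the type name (as with the `struct' form of
later releases). The helper name `first-field' is made up for this
illustration and is not part of the proposal:

```racket
#lang racket
;; Assumption: the constructor shares the type name, as proposed above.
(struct a (x y))

;; Hypothetical helper: the structure-type name `a' is used as a
;; pattern constructor inside `match'.
(define (first-field v)
  (match v
    [(a x _) x]    ; any instance of `a'; binds its first field to x
    [_ #f]))       ; anything else

(first-field (a 1 2))   ; => 1
```

The second clause is only there to show that non-`a' values fall
through instead of raising a match failure.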


Semi-quasiquote Printing
------------------------

Proposal: Use quasiquote printing as Racket's default printing mode,
but only for transparent values.

Functional programmers long ago figured out that it's better to print a
value in the same way as an expression that produces the value.
Printing with `quasiquote', meanwhile, mostly preserves the Lisp
tradition of printing values that represent expressions as the
expressions that they represent.

Some values, however, cannot be printed easily as expressions that
produce the same value. For example, in DrScheme with quasiquote
printing,

 (list 1 (let ([f (lambda (x) x)]) f))

prints as

 `(1 ,(lambda (a1) ...))

The printer cannot actually print a function, so it has to invent a
`lambda' expression that approximates the value. The problem is worse
with objects, classes, and other opaque types. Expressions with graphs
print as a `shared' expression.

Other implementations of functional languages punt on opaque values.
Here's an example in OCaml, which prints functions as just `<fun>':

 # Some 10;;
 - : int option = Some 10
 # sqrt;;  
 - : float -> float = <fun>
 # [sqrt;sqrt];;       
 - : (float -> float) list = [<fun>; <fun>]

This seems like the right compromise for Racket. For example,

 (list 1 (let ([f (lambda (x) x)]) f))

could print as

 `(1 #<procedure:f>)

(Note that there's no need for an unquote when printing a value as a
non-expression. Non-S-expression forms are "self-unquoting".)

Transparent (or partially transparent) structures can print with
constructors, while opaque structures can print as non-S-expressions:

   > (define-struct a (x y))
   > (list 1 (a 2 3))
   `(1 #<a>)
   > (define-struct a (x y) #:transparent)
   > (list 1 (a 2 3))
   `(1 ,(a 2 3))

Instances of prefab structure types, meanwhile, should stick to
quasiquoting:

   > (define-struct b (x y) #:prefab)
   > (list 1 (b (a 2 3) 'x))
   `(1 #s(b ,(a 2 3) x))

Graphs can still use the compact #n= notation:

   > (read (open-input-string "#0=(1 . #0#)"))
   `#0=(1 . #0#)

Unlike DrScheme's quasiquote printing, semi-quasiquote printing is
easily implemented by parameterizing our existing printer(s).

A new `print-as-quasiquote' parameter directs `print' and
`pretty-print' to use semi-quasiquote style. (The parameter does not
affect `write'.)

  Welcome to MzScheme v4.2.5.5 [3m], Copyright (c) 2004-2010 PLT Scheme Inc.
  > 'x
  x
  > (print-as-quasiquote #t)
  > 'x
  'x
  > (list 1 2 3)
  `(1 2 3)
  > sqrt
  #<procedure:sqrt>
  > (list 1 sqrt)
  `(1 #<procedure:sqrt>)
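
To make the `print' versus `write' distinction concrete, here is a
minimal sketch. It does not use the experimental `print-as-quasiquote'
parameter; it assumes a later Racket release, where (as far as I know)
the closest analogue is the `print-as-expression' parameter. The
`render' helper is hypothetical:

```racket
#lang racket/base
;; Hypothetical helper: run printer `f' (such as `print' or `write')
;; on `v' and return the output as a string.
(define (render f v)
  (define out (open-output-string))
  (f v out)
  (get-output-string out))

(render write 'x)                          ; => "x"
(render print 'x)                          ; => "'x"

;; Assumption: `print-as-expression' plays the role that
;; `print-as-quasiquote' plays in this post. Turning it off makes
;; `print' behave like `write'.
(parameterize ([print-as-expression #f])
  (render print 'x))                       ; => "x"
```

Using `parameterize' rather than setting the parameter directly keeps
the change local to one dynamic extent, which matters for the
mixed-language issue discussed in the next section.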

The `port-print-handler' and `prop:write' protocols have been changed
(in a mostly backward-compatible way) to make semi-quasiquote printing
extensible.
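
A sketch of what such an extension can look like, assuming the protocol
as it is documented in later Racket releases: a `prop:custom-write'
procedure that receives a mode argument. Whether the experimental
4.2.5.5 protocol has exactly this shape is an assumption, and the
`point' type and `render' helper are invented for the example:

```racket
#lang racket/base
;; Hypothetical structure type with its own printer.
(struct point (x y)
  #:property prop:custom-write
  (lambda (p port mode)
    ;; Assumed protocol: mode is #t for `write', #f for `display',
    ;; and 0 or 1 (the quoting depth) for `print'.
    (if (number? mode)
        (fprintf port "(point ~s ~s)" (point-x p) (point-y p))
        (fprintf port "#<point>"))))

;; Hypothetical helper: capture a printer's output as a string.
(define (render f v)
  (define out (open-output-string))
  (f v out)
  (get-output-string out))

(render print (point 1 2))   ; => "(point 1 2)"
(render write (point 1 2))   ; => "#<point>"
```

The point of the mode argument is that one structure type can look like
a constructor call under `print' while staying opaque under `write'.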


Language-Specific Run-Time Configuration
----------------------------------------

Proposal: The main language of a program should determine a run-time
configuration, including the style for printing values.

Assuming the changes above, we'd want

  #lang scheme
  (define-struct a (x y) #:transparent)
  (list (make-a 1 2))

to produce

  (#(struct:a 1 2))

while

  #lang racket
  (define-struct a (x y) #:transparent)
  (list (a 1 2))

should produce

  `(,(a 1 2))

Along the same lines, we'd want

  #lang scheme
  (define-struct a (x y) #:transparent)
  (+ 'x (list (make-a 1 2)))

to produce the error message

  +: expects type <number> as 1st argument, given: x; other arguments
  were: (#(struct:a 1 2))

while

  #lang racket
  (define-struct a (x y) #:transparent)
  (+ 'x (list (a 1 2)))

should produce the error message

  +: expects type <number> as 1st argument, given: 'x; other arguments
  were: `(,(a 1 2))

The different `define-struct's are easily supported through different
bindings imported by `scheme' and `racket'. Similarly, for printing
top-level results in a module, you might imagine that `scheme' and
`racket' use different printing functions. The different error formats,
however, are not so easily controlled through bindings.

Setting `print-as-quasiquote' to #t is enough to get the Racket-style
error format, but having `#lang racket' inject `(print-as-quasiquote
#t)' in the module top-level would not work well when modules from
different languages are mixed together. For example, if a program
imports both

  ;; s.ss:
  #lang scheme
  (define (s-bad v) (error 's-bad "~e" v))
  (provide s-bad)

and

  ;; r.rkt
  #lang racket
  (define (r-bad v) (error 'r-bad "~e" v))
  (provide r-bad)

the way an error message is printed by `s-bad' and `r-bad' shouldn't
depend on the order that the modules are instantiated.

To accommodate run-time configuration of the environment, such as
setting the way that values are printed, `mzscheme' now treats the main
module of a program specially. It extracts information about the
module's language --- specifically, whether the language declares a
run-time configuration action. If so, `mzscheme' runs the
language-configuration action before it instantiates the module.

As a result, when you put either version of the code above in "ex.ss",
then `mzscheme ex.ss' produces the right error message.

Here's how it works in more detail for the case of `#lang racket':

 * The `racket' module reader, as implemented in `racket/lang/reader',
   associates a 'module-language property with the `module' form that
   it produces from "ex.ss". The 'module-language property essentially
   points back to `racket/lang/reader'.

 * The macro expander and bytecode compiler preserve the
   'module-language information so that it's available through
   `module-compiled-language-info' (from the unevaluated bytecode)
   and/or `module->language-info' (from the evaluated module
   declaration).

 * When the `mzscheme' executable is given a module to run, it uses
   `module->language-info' to get the module's language information
   before `require'ing the module. The `module->language-info' loads
   "ex.ss" (from source or bytecode) and extracts language info from
   the declared module.

   The language info on the declaration of the module from "ex.ss"
   points back to the `get-info' export of `racket/lang/reader'. The
   `mzscheme' executable calls that function with the
   'configure-runtime key.

 * The `get-info' function of `racket/lang/reader' recognizes the
   'configure-runtime key and reports back the `configure' function
   provided by `racket/private/runtime'.

   [Why doesn't `get-info' just call `configure' directly? See below
    on creating executables.]

 * The `mzscheme' executable calls the `configure' function of
   `racket/private/runtime'. The `configure' function simply sets the
   `print-as-quasiquote' parameter to #t.

 * Having finished running the language's configuration action, the
   `mzscheme' executable `require's the "ex.ss" module to instantiate
   it. (Although `module->language-info' has already loaded the module,
   `module->language-info' doesn't instantiate the module.)

   Instantiating the module runs the expressions in its body,
   triggering the `+' error. The error message uses the right style for
   printing values because the `print-as-quasiquote' parameter was set
   to #t by `configure'.
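
The lookup steps above can be sketched roughly as follows. This assumes
the `module->language-info' protocol as documented in later Racket
releases, where the result is either #f or a vector of a module path,
an export name, and a data value. The helper name `language-info-ref'
is hypothetical, and the shape of the answer for 'configure-runtime may
differ from the experimental version described here:

```racket
#lang racket/base
;; Hypothetical helper: ask the language of the module at `mod-path'
;; about `key', returning `default' if the module declares no
;; language info.
(define (language-info-ref mod-path key default)
  ;; Load the declaration (without instantiating the module) and read
  ;; its language info.
  (define info (module->language-info mod-path #t))
  (if (vector? info)
      (let* ([make-get-info (dynamic-require (vector-ref info 0)
                                             (vector-ref info 1))]
             ;; Assumed protocol: the export is applied to the vector's
             ;; data value and yields a two-argument `get-info'.
             [get-info (make-get-info (vector-ref info 2))])
        (get-info key default))
      default))

;; Ask what run-time configuration, if any, the language requests.
(language-info-ref 'racket/base 'configure-runtime #f)
```

A launcher would act on the answer before `require'ing the module, so
that the configuration is in place when the module body runs.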

If you run `mzc --exe ex ex.ss', the generated executable prints the
right error message, too. That's because `mzc' extracts the main
module's language information in the same way as `mzscheme'. Based on
the result for 'configure-runtime for the module's language, `mzc'
embeds the `racket/private/runtime' module in the generated executable
(and that's why `get-info' doesn't call `configure' itself). The
generated executable includes a start-up action that calls `configure'
before running the main module.


DrScheme should similarly extract language information and call
`configure' before running the module. It may be that a single
side-effecting `configure' function isn't the right interface for
DrScheme, and so experiments with DrScheme may lead to a different
protocol for `mzscheme' and `mzc'.


When `mzscheme' is run in interactive mode, the initialization
module's language is used to initialize the run-time configuration. The
`racket', `racket/base' and `racket/init' modules are implemented in
Racket, so

   mzscheme -I racket/init

gives you a REPL like `racket' could give you (when it exists).



Posted on the dev mailing list.