[racket-dev] [racket] keyword args static checking and optimization

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Mon Aug 8 11:05:46 EDT 2011

[Moved to the dev list.]

At Sat, 06 Aug 2011 07:25:00 -0400, Neil Van Dyke wrote:
> Feature request... I'd *really* like to see compile-time checking of 
> keyword arguments whenever that is possible.
> If compiler knows what procedure will be called, and the procedure uses 
> keyword args in the usual way, then I'd like the compiler to report an 
> error when the call site, say, uses a keyword arg that the procedure 
> doesn't support.  Likewise with required keyword args that are missing.
> As a second feature request, would be nice if, when the compiler (or 
> JIT) can determine the procedure, if it could optimize the keyword args 
> the same as if they were positional args.  I don't know how much the 
> compiler/JIT is doing already, but the static error-checking that it 
> misses make me suspect the compiler is not optimizing this.

The compiler proper knows nothing about keyword functions and function
calls. They're implemented by macros and expanded away into plain
functions and applicable structures.

Instead, argument checking can be pushed into the macro expansion of
keyword arguments. The idea is that `(define id <lambda with keyword
arguments>)' can bind `id' as syntax that checks and optimizes
first-order uses of `id'.

In more detail,

  (define f (lambda (a [b 1] #:c c #:d [d 3]) ....))

expands to

  (define (core a have-b? b c have-d? d)
    (let* ([b (if have-b? b 1)]
           [d (if have-d? d 3)])
      ....))

  (define proc 
    (make-keyword-procedure (lambda .... (core ....))))

  (define-syntax (f stx)
    (if ... application looks ok? ....
        (core ....)   ; direct call; no keyword checking or packaging
        (begin
          ... issue warning ...
          (proc ....)))) ; existing protocol

so that

  (f 0 1 #:c 2)

expands to

  (core 0 #t 1 2 #f #f)
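
For readers who want to try the shape of this expansion, here is a
minimal hand-written sketch. It is not the implementation described in
this message: `unsupplied' is a hypothetical marker, `proc' is an
ordinary keyword `lambda' standing in for the `make-keyword-procedure'
wrapper, the body of `core' just returns its arguments as a list, and
the macro assumes that keyword arguments follow the by-position ones.

```racket
#lang racket
(require (for-syntax racket/base syntax/parse))

;; Direct entry point: every argument is positional, and each optional
;; argument is preceded by a flag saying whether it was supplied.
(define (core a have-b? b c have-d? d)
  (let* ([b (if have-b? b 1)]
         [d (if have-d? d 3)])
    (list a b c d)))

;; Hypothetical marker for "argument not supplied" (an assumption of
;; this sketch, not part of the design in the message).
(define unsupplied (gensym 'unsupplied))

;; General entry point, used for higher-order uses and for calls that
;; the macro does not recognize.
(define (proc a [b unsupplied] #:c c #:d [d unsupplied])
  (core a
        (not (eq? b unsupplied)) b
        c
        (not (eq? d unsupplied)) d))

(define-syntax (f stx)
  (syntax-parse stx
    ;; `f' used as a value: hand out the general procedure.
    [x:id #'proc]
    ;; First-order call whose shape checks out: call `core' directly.
    [(_ a:expr (~optional b:expr)
        (~alt (~once (~seq #:c c:expr))
              (~optional (~seq #:d d:expr))) ...)
     #'(core a
             (~? (~@ #t b) (~@ #f #f))
             c
             (~? (~@ #t d) (~@ #f #f)))]
    ;; Anything else: note the problem at expansion time and fall back
    ;; to the general protocol, which reports the error at run time.
    [(_ . args)
     (log-warning "f: call does not match the keyword signature")
     #'(proc . args)]))

(f 0 1 #:c 2)                           ; => '(0 1 2 3)
(keyword-apply f '(#:c) '(2) '(0))      ; goes through `proc'
```

Note that the direct call evaluates arguments in `core' order rather
than call-site order, so a faithful version would have to bind the
argument expressions in their original order first.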

The macro approach has some drawbacks:

 * It's not quite as general as a warning from the compiler's
   optimization pass, which can detect some higher-order uses through
   copy propagation and inlining. A first-order check covers most cases
   in practice, though.

 * Macros don't compose as nicely. Because of the way that macro
   expansion is ordered in a definition context, `define' can't force
   the expansion of its right-hand side to check whether it expands to
   `lambda'. Instead, `define' can only recognize immediate `lambda'
   forms. Again, that's probably good enough to be useful in practice.

 * The `class' and `unit' forms expect `define' to bind a variable and
   not syntax, because they rewrite definitions based on the connection
   between an identifier with `define' and an identifier written in a
   signature or a `public' clause.

   To avoid this problem, the `define' form can require some
   cooperation from definition contexts. A definition context that is
   implemented via `local-expand' declares its willingness to work with the
   non-variable expansion by giving its context representation the
   `prop:liberal-define-context' property. The internal-definition
   contexts that are built into `lambda', `let', etc., all set that
   property, while the `class' and `unit' forms do not.

 * Reflection creates the usual sort of trouble:

     (define f (lambda (#:x x) ....))
     (namespace-variable-value 'f #f #f)

   I don't mind weakening reflection at this level; it seems ok to say
   that `define' creates a syntax binding for keyword functions (in a
   liberal definition context).

 * Mutation creates deep trouble:

     (define f (lambda (#:x x) ....))
     (set! f (lambda (#:y y) y))
     (f #:y 12)

   One option is to disallow `set!' on an identifier that is bound to a
   keyword-accepting procedure. That seems awkward, and it seems like
   it would compose badly. I'm not as willing to sacrifice `set!' as I
   am to sacrifice reflection.

   Another possibility is to redirect the `set!' on `f' to the
   underlying `proc', and somehow make the optimized call to `core'
   happen only when `proc' is never mutated. Due to the order of macro
   expansion, whether `f' is mutated is not necessarily known when a
   call to `f' is expanded. The expansion of a call to `f' would have
   to embed the condition that `proc' is not mutated.

   We already have `#%variable-reference' to reflect information about
   variables into an expression; to make it work in all definition
   contexts, `#%variable-reference' must be generalized to work with
   local variables, but that's a relatively minor change. Then, a
   `variable-reference-constant?' procedure can report the constantness
   of a variable.

   With those pieces, and when redirecting mutations of `f' to `proc',
   a call to `f' could expand to

       (if (variable-reference-constant? (#%variable-reference proc))
           (core ....)
           (proc ....))

   In some cases, especially for local bindings, the compiler can
   statically determine whether a variable is constant and eliminate
   one of the branches. For module-level variables, the compiler cannot
   see through `make-keyword-procedure' well enough to determine that
   `proc' is definitely initialized before it is used.
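
The constantness test in the last item can be tried by hand at the
module level. The sketch below only illustrates the guard; the names
`stable-proc', `mutated-proc', `fast-path', and `call-mutated' are
hypothetical, with `fast-path' standing in for the direct call to
`core'.

```racket
#lang racket/base

;; Never assigned after its definition.
(define stable-proc (lambda (x) (list 'stable x)))

;; Assigned with `set!' later in the same module.
(define mutated-proc (lambda (x) (list 'original x)))
(set! mutated-proc (lambda (x) (list 'replaced x)))

;; Hypothetical stand-in for the optimized direct call.
(define (fast-path x) (list 'fast x))

;; The guard from the expansion: take the fast path only when the
;; variable can never change; otherwise call whatever it holds now.
(define (call-mutated x)
  (if (variable-reference-constant? (#%variable-reference mutated-proc))
      (fast-path x)
      (mutated-proc x)))

(call-mutated 12)  ; => '(replaced 12), since `mutated-proc' is not constant

;; Reports whether the never-assigned variable counts as constant.
(variable-reference-constant? (#%variable-reference stable-proc))
```

Because `mutated-proc' is the target of a `set!', the guard fails and
the call goes through the current value of the variable, which is the
behavior needed to keep `set!' working.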

I've implemented all of this (not yet pushed). It's more complex than I
originally hoped, and I'm not yet sure it's worthwhile. Longer term,
maybe it's better to work on ways for macros to more directly
communicate with the optimizer.

Posted on the dev mailing list.