[racket-dev] [racket] keyword args static checking and optimization
[Moved to the dev list.]
At Sat, 06 Aug 2011 07:25:00 -0400, Neil Van Dyke wrote:
> Feature request... I'd *really* like to see compile-time checking of
> keyword arguments whenever that is possible.
>
> If compiler knows what procedure will be called, and the procedure uses
> keyword args in the usual way, then I'd like the compiler to report an
> error when the call site, say, uses a keyword arg that the procedure
> doesn't support. Likewise with required keyword args that are missing.
>
> As a second feature request, would be nice if, when the compiler (or
> JIT) can determine the procedure, if it could optimize the keyword args
> the same as if they were positional args. I don't know how much the
> compiler/JIT is doing already, but the static error-checking that it
> misses make me suspect the compiler is not optimizing this.
The compiler proper knows nothing about keyword functions and function
calls. They're implemented by macros and expanded away into plain
functions and applicable structures.
Instead, argument checking can be pushed into the macro expansion of
keyword arguments. The idea is that `(define id <lambda with keyword
arguments>)' can bind `id' as syntax that checks and optimizes
first-order uses of `id'.
In more detail,
    (define f (lambda (a [b 1] #:c c #:d [d 3]) ....))
expands to
    (define (core a have-b? b c have-d? d)
      (let* ([b (if have-b? b 1)]
             [d (if have-d? d 3)])
        ....))
    (define proc
      (make-keyword-procedure (lambda .... (core ....))))
    (define-syntax (f stx)
      (if ... application looks ok? ....
          (core ....) ; direct call; no keyword checking or packaging
          (begin
            ... issue warning ...
            (proc ....)))) ; existing protocol
so that
    (f 0 1 #:c 2)
expands to
    (core 0 #t 1 2 #f #f)
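To make the shape of that expansion concrete, here is a minimal runnable sketch of the three pieces. It is an illustration under assumptions, not the actual implementation: the body `(list a b c d)` stands in for the elided "....", the helper `kw-ref` is hypothetical, `proc` skips the checks for unknown keywords and extra positional arguments, and the macro recognizes only the single call shape `(f a b #:c c)` and issues no warning when it falls back.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Fast path: every argument is positional, with a flag for each
;; optional one. The body is a stand-in for the elided "....".
(define (core a have-b? b c have-d? d)
  (let* ([b (if have-b? b 1)]
         [d (if have-d? d 3)])
    (list a b c d)))

;; Hypothetical helper: look up `kw` in the parallel lists that
;; `make-keyword-procedure` passes in. Returns (cons found? value).
(define (kw-ref kws vals kw)
  (cond [(null? kws) (cons #f #f)]
        [(eq? (car kws) kw) (cons #t (car vals))]
        [else (kw-ref (cdr kws) (cdr vals) kw)]))

;; General path: unpack keywords at run time, then call `core`.
;; Simplified: unknown keywords and extra positionals are not rejected.
(define proc
  (make-keyword-procedure
   (lambda (kws vals a . rest)
     (define c (kw-ref kws vals '#:c))
     (define d (kw-ref kws vals '#:d))
     (unless (car c)
       (error 'f "missing required keyword argument #:c"))
     (core a
           (pair? rest) (and (pair? rest) (car rest))
           (cdr c)
           (car d) (cdr d)))))

;; `f` is bound as syntax. One recognized first-order call shape goes
;; straight to `core`; every other use goes through `proc`.
(define-syntax (f stx)
  (define (plain? s) (not (keyword? (syntax-e s))))
  (syntax-case stx ()
    [(_ a b kw c)
     (and (plain? #'a) (plain? #'b) (plain? #'c)
          (eq? (syntax-e #'kw) '#:c))
     #'(core a #t b c #f #f)]          ; direct call
    [(_ . args) #'(proc . args)]       ; existing protocol
    [_ #'proc]))                       ; higher-order use

(f 0 1 #:c 2)   ; takes the direct path: (core 0 #t 1 2 #f #f)
(f 0 #:c 2)     ; not the recognized shape, so it goes through proc
```

Both calls produce the same list, since `proc` fills in the same defaults that `core` does; the only difference is whether the keyword unpacking happens at expansion time or at run time.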
The macro approach has some drawbacks:
* It's not quite as general as a warning from the compiler's
optimization pass, which can detect some higher-order uses through
copy propagation and inlining. A first-order check covers most cases
in practice, though.
* Macros don't compose as nicely. Because of the way that macro
expansion is ordered in a definition context, `define' can't force
the expansion of its right-hand side to check whether it expands to
`lambda'. Instead, `define' can only recognize immediate `lambda'
forms. Again, that's probably good enough to be useful in practice.
* The `class' and `unit' forms expect `define' to bind a variable and
not syntax, because they rewrite definitions based on the connection
between an identifier with `define' and an identifier written in a
signature or a `public' clause.
To avoid this problem, the `define' form can require some
cooperation from definition contexts. A definition context that is
implemented via `local-expand' declares its willingness to work with the
non-variable expansion by giving its context representation the
`prop:liberal-define-context' property. The internal-definition
contexts that are built into `lambda', `let', etc., all set that
property, while the `class' and `unit' forms do not.
* Reflection creates the usual sort of trouble:
    (define f (lambda (#:x x) '....))
    (namespace-variable-value 'f #f #f
                              (variable-reference->namespace
                               (#%variable-reference)))
I don't mind weakening reflection at this level; it seems ok to say
that `define' creates a syntax binding for keyword functions (in a
liberal definition context).
* Mutation creates deep trouble:
    (define f (lambda (#:x x) ....))
    (set! f (lambda (#:y y) y))
    (f #:y 12)
One option is to disallow `set!' on an identifier that is bound to a
keyword-accepting procedure. That seems awkward, and it seems like
it would compose badly. I'm not as willing to sacrifice `set!' as I
am to sacrifice reflection.
Another possibility is to redirect the `set!' on `f' to the
underlying `proc', and somehow make the optimized call to `core'
happen only when `proc' is never mutated. Due to the order of macro
expansion, whether `f' is mutated is not necessarily known when a
call to `f' is expanded. The expansion of a call to `f' would have
to embed the condition that `proc' is not mutated.
We already have `#%variable-reference' to reflect information about
variables into an expression; to make it work in all definition
contexts, `#%variable-reference' must be generalized to work with
local variables, but that's a relatively minor change. Then, a
`variable-reference-constant?' procedure can report the constantness
of a variable.
With those pieces, and when redirecting mutations of `f' to `proc',
a call to `f' could expand to
    (if (variable-reference-constant? (#%variable-reference proc))
        (core ....)
        (proc ....))
In some cases, especially for local bindings, the compiler can
statically determine whether a variable is constant and eliminate
one of the branches. For module-level variables, the compiler cannot
see through `make-keyword-procedure' well enough to determine that
`proc' is definitely initialized before it is used.
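As a rough illustration of that guarded call, the sketch below writes the expansion by hand as ordinary functions. All names here (`core`, `proc`, `proc2`, `call-f`, `call-g`) are hypothetical stand-ins, and the tags 'core, 'proc, and 'new exist only to show which branch ran. `proc2` plays the role of a `proc` that received a redirected `set!`.

```racket
#lang racket/base

;; Hypothetical fast path for a function that takes only #:x.
(define (core x) (list 'core x))

;; Hypothetical general path; accepts any keywords, for brevity.
(define proc
  (make-keyword-procedure
   (lambda (kws vals) (list 'proc (car vals)))))

;; What a call (f #:x v) might expand to: use `core` only while
;; `proc` is known to be constant.
(define (call-f v)
  (if (variable-reference-constant? (#%variable-reference proc))
      (core v)
      (proc #:x v)))

;; A second variable that is mutated, as if (set! f ....) had been
;; redirected to it. The guard must then pick the general path.
(define proc2 proc)
(set! proc2
      (make-keyword-procedure
       (lambda (kws vals) (list 'new (car vals)))))

(define (call-g v)
  (if (variable-reference-constant? (#%variable-reference proc2))
      (core v)
      (proc2 #:x v)))

(call-f 5)  ; which branch runs depends on how the module is compiled
(call-g 5)  ; proc2 is mutated, so this goes through proc2
```

Because `proc2` is the target of a `set!`, its variable reference is not constant and `call-g` always sees the replacement procedure. For `call-f`, the outcome depends on whether the module is compiled with constants enforced, so the sketch does not assume a particular branch.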
I've implemented all of this (not yet pushed). It's more complex than I
originally hoped, and I'm not yet sure it's worthwhile. Longer term,
maybe it's better to work on ways for macros to more directly
communicate with the optimizer.