[plt-scheme] keyword arguments, take 2

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Thu Jun 14 01:37:16 EDT 2007
As promised, here's another attempt to explain a proposed change for
keywords. It's too long --- maybe it suffers from the e-mail equivalent
of second-system syndrome --- but I'm sending it anyway.

Matthew


A View on Keywords
------------------

PLT Scheme currently supports keyword-based arguments via the "kw.ss"
library. Keyword arguments with "kw.ss" work very much like Common Lisp
keyword arguments:

 > (define/kw (greet first #:key [last "Smith"] [hi "Hello"])
     (string-append hi ", " first " " last))
 > (greet "John")
 "Hello, John Smith"
 > (greet "John" #:last "Doe")
 "Hello, John Doe"

That last procedure application has a "Doe" argument that is tagged
with the keyword `#:last'. The first argument, in contrast, is by
position. No argument is supplied in this case with the keyword `#:hi',
but that's ok, because the `#:hi' argument is optional.

If you provide arguments for both `#:last' and `#:hi', you can do so in
either order. Not having to remember an order is the main point of
keyword arguments:

 > (greet "John" #:last "Doe" #:hi "Howdy")
 "Howdy, John Doe"
 > (greet "John" #:hi "Howdy" #:last "Doe")
 "Howdy, John Doe"

The other point is not having to supply N optional arguments to get to
the N+1th optional argument:

 > (greet "John" #:hi "Howdy")
 "Howdy, John Smith"


There are other ways to accomplish the same two goals. For example,
`greet' might take an association list for its options:

 >  (define (greet first options)
      (let ([last (cdr (or (assq 'last options) '(last . "Smith")))]
            [hi (cdr (or (assq 'hi options) '(hi . "Hello")))])
        (string-append hi ", " first " " last)))
 > (greet "John" '((last . "Smith") (hi . "Hello")))
 "Hello, John Smith"

This approach is awkward on the definition side. The programmer takes
on a huge burden for checking that the `options' list makes sense.

It's also inconvenient on the application side. The programmer might
build the list wrong, and the notation is fairly heavyweight.


The definition side of the assoc-list approach is easily fixed with a
macro. The application side is trickier. The improvement that most
obviously fits Scheme syntax is to have some sort of "keyword
application" form that is different from a regular application. That's
the approach that we've taken for the class system's `new' and
`instantiate', for example, where a "keyword" and its argument are
grouped by parentheses. Of course, the syntax has to be set up so that
a parenthesized expression cannot be confused with a keyword--value
combination.

Keywords solve the problem on the application side differently. They
effectively add a new kind of grouping form, at least in the way that
programmers are supposed to think about the syntax. That is, instead of
always using parentheses to group things, sometimes keywords group
things.


Debate Layer #1: The Purpose and Value of Keywords
--------------------------------------------------

A first point of potential debate is whether my view of keywords --- a
grouping alternative to parentheses --- is the way programmers should
think about them.

If I have that right, then we can insert a long debate about whether
it's a good idea to have a new kind of grouping form:

 * Avoiding keywords is simpler.

 * Having keywords helps remove a layer of parens --- especially around
   pieces that are more commonly used --- making the overall program
   more readable to many people.

I doubt that there's any right answer here; I think it's a matter of
opinion. My opinion has moved over time in the direction of using
keywords a bit more than we do.


The Problem with CL-style Keywords
----------------------------------

My discomfort with keywords, as we have them now, is that they're only
a grouping form within an application if you use them in the right way.

If you write

 > (greet #:last "Doe")
 greet: expecting a (#:last) keyword got: "Doe"
 > (greet "John" #:hi #:last "Doe")
 greet: expecting a (#:last #:hi) keyword got: "Doe"

then neither use of `#:last' actually groups anything. More generally,
whether a keyword groups things in an application is a *dynamic*
property, not a static property of the expression.

Of course, "dynamic" is often good, but grouping syntax is not one of
the places where the language should be dynamic. I would not want, for
example, `(foo 10)' to sometimes mean that `foo' is applied to 10 and
sometimes mean that `foo' and `10' are consecutive arguments in a
function call.


The awkwardness of keyword-based procedures became more apparent for me
when I tried to document an extension of `lambda' that supports keyword
arguments; this attempt was based on my use of "kw.ss" combined with
the style of SRFI-89 (which makes declarations of keyword-using
functions more symmetric to function calls). I ended up with a strange
specification that allowed keyword values in the list for a rest
argument *unless* a keyword argument is specified, and so on.


A similar sort of awkwardness shows up when using keywords in syntactic
forms. See Ryan's `define-struct*':

 http://planet.plt-scheme.org/package-source/ryanc/macros.plt/1/0/doc.txt

Ryan included parens around keywords because he's worried about errors
and the confusion that can result from a keyword being parsed as an
expression. That's a good motivation, but it works against the
syntactic style of keywords that I'd prefer to see.


Debate Layer #2: Whether Keywords Currently are Good Enough
-----------------------------------------------------------

I say that they're not good enough, but we haven't had this debate. (We
skipped on to a debate about one possible solution...)


Keywords as Non-Expressions
---------------------------

Many of these issues with keywords can be resolved by revoking the
self-quoting behavior of keywords. That is, a literal keyword wouldn't
be allowed in an expression position.

Of course, keywords still need to be first-class values, for a variety
of reasons; quoting a literal keyword forms an expression that produces
the keyword.

(In the case of symbols, we have the helpful terminology of "symbols"
versus "identifiers". Something like that would be useful for keywords,
but I don't have good words.)

This change should alleviate Ryan's concern with keywords being
confused with expressions. More generally, it paves the way for using
keywords as lexical grouping syntax that cannot be confused with
expressions, which is the goal I have in mind.


Keyword Arguments and Non-Expression Keywords
---------------------------------------------

If keywords are not expressions, then

  (greet "John" #:last "Doe")

needs a new kind of parsing. No problem; we can re-define `#%app' to
generalize procedure-application parsing to detect keywords.

Simply saying that a keyword isn't an expression isn't quite enough,
though. For example, what does

 (let ([kw '#:last])
   (greet "John" kw "Doe"))

mean? In the current system, "Doe" turns out to be an argument with the
#:last keyword, because the association is dynamic. But that's what I'm
trying to get away from. That is, in the same way that I don't want
`(foo 10)' to turn out to be two consecutive arguments in a function
call, I don't want `foo 10' to turn out to be a grouping (if `foo'
evaluates to a keyword value) instead of consecutive arguments.


A way around this problem is to keep by-position and by-keyword
arguments separate. Then, using a literal keyword in an application
form is always a lexical grouping that implies a keyword--argument
pair. Anything other than a literal keyword (and not after a literal
keyword) is an expression for a by-position argument.

This is consistent with the use of keywords for syntax, and it supports
better error reporting:

 >  (greet #:last "Doe")
 procedure application: no case matching 0 non-keyword arguments for:
 #<procedure:greet>; arguments were: #:last "Doe"

 > (greet "John" #:hi #:last "Doe")
 expand: a keyword is not an expression in: #:last

 > (let ([weird '#:hi])
     (greet "John" weird "Hi"))
 procedure greet: expects 1 argument, given 3: "John" #:hi "Hi"

It also makes keyword arguments more flexible, since they can appear in
any order relative to non-keyword arguments.
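
As a rough illustration of the static grouping rule, here is a sketch
written in present-day Racket syntax. That choice is an assumption:
the 2007 prototype may differ in details, and the names
`literal-result' and `dynamic-result' are made up for the example.

```racket
#lang racket
;; Assumption: present-day Racket `define' with keyword formals,
;; not the 2007 prototype in new-lambda.ss.
(define (greet first #:last [last "Smith"] #:hi [hi "Hello"])
  (string-append hi ", " first " " last))

;; A literal keyword in the application groups with the expression
;; that follows it.
(define literal-result (greet "John" #:last "Doe"))

;; A keyword *value* reached through a variable is just a by-position
;; argument, so this call supplies three positional arguments and is
;; rejected at run time.
(define dynamic-result
  (with-handlers ([exn:fail:contract? (lambda (e) 'rejected)])
    (let ([kw '#:last])
      (greet "John" kw "Doe"))))
```

Whether `#:last' groups with "Doe" is decided by the text of the
application, never by the value of a variable.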


Concrete Proposal
-----------------

The proposal that I've put forward combines the separation of
by-position and by-keyword arguments with a different style of
declaring keyword-argument procedures.

To summarize the draft docs:

 * Keyword literals are not self-quoting.

 * The syntax of an application is

       (proc-expr arg ...)

       arg = arg-expr
           | arg-keyword arg-expr

   The order of `arg's that contain keywords doesn't matter, except to
   determine the order in which the `arg-expr's are evaluated. Keyed
   arguments can appear before all by-position arguments, or after all
   by-position arguments, or mixed together.

   Examples:
    > (greet "John" #:last "Doe" #:hi "Howdy")
    > (greet #:hi "Howdy" "John" #:last "Doe")
    > (go "super.ss" #:mode 'fast)

 * The syntax of `lambda' is extended as follows

      (lambda gen-formals
         body ...+)

      gen-formals = (arg ...)
                  | rest-id
                  | (arg ...+ . rest-id)

      arg = arg-id
          | [arg-id default-expr]
          | arg-keyword arg-id
          | arg-keyword [arg-id default-expr]

   A constraint not shown in the grammar is that an `arg-id' as `arg'
   cannot follow a `[arg-id default-expr]' as `arg' (though it can
   follow a `arg-keyword [arg-id default-expr]'). Each `default-expr'
   can refer to any identifier that is bound earlier.

   Examples:
    > (define substring*
        (lambda (str start 
                     [end (string-length str)]
                     #:encode [converter values])
           (converter (substring str start end))))
    > (substring* "seesaw" 3)
    "saw"
    > (substring* "seesaw" 0 3)
    "see"
    > (substring* " \u3BB" 1 #:encode string->bytes/utf-8)
    #"\316\273"

 * `apply' is unchanged, with the understanding that the supplied
   arguments are all by-position arguments. Similarly, "rest" args are
   unchanged, with the understanding that they receive only
   by-position arguments.

 * A new `keyword-apply' supports keyword arguments in addition to
   by-position arguments. A new `make-keyword-procedure' supports the
   construction of a procedure with a "rest keyword" argument.
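
The last two bullets can be sketched as follows, assuming the
`keyword-apply' and `make-keyword-procedure' of present-day Racket
(the prototype's argument conventions may differ). The names
`applied', `show-args', and `shown' are hypothetical.

```racket
#lang racket
(define (greet first #:last [last "Smith"] #:hi [hi "Hello"])
  (string-append hi ", " first " " last))

;; keyword-apply takes the keywords (sorted by keyword<?), the matching
;; values in the same order, and then a list of by-position arguments.
(define applied
  (keyword-apply greet '(#:hi #:last) '("Howdy" "Doe") '("John")))

;; make-keyword-procedure builds a procedure with a "rest keyword"
;; argument: it receives the sorted keywords, their values, and then
;; the by-position arguments.
(define show-args
  (make-keyword-procedure
   (lambda (kws kw-vals . positional)
     (list kws kw-vals positional))))

(define shown (show-args 1 #:b 2 #:a 3))
```

Here `applied' is "Howdy, John Doe", and `shown' is a list holding the
keywords `#:a' and `#:b', their values 3 and 2, and the single
by-position argument 1.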

For more details, start here:

 http://www.cs.utah.edu/~mflatt/tmp/newdoc/guide/guide_application.html
 http://www.cs.utah.edu/~mflatt/tmp/newdoc/guide/guide_lambda.html

Some fine points have changed since my earlier post.


To complement the docs, you can now find a prototype implementation at

 http://svn.plt-scheme.org/plt/trunk/collects/scribblings/new-lambda.ss


Debate Layer #3: The Concrete Proposal
--------------------------------------

Most discussion so far has been about the proposed design, bypassing
the debates on whether there are problems to solve. I've tried to draw
out the other debates above; it seems unlikely to me that we agree on
everything but the fine points of a solution.

In terms of the proposal, though, the issue that has drawn the most
discussion is the relationship between a plain "procedure" and a
"procedure that accepts keyword arguments".


Eli has pointed out the importance of being able to add an optional
keyword-based argument at any time. I completely agree with this goal,
and it's met by the proposal. A procedure whose keyword arguments are
all optional can always be used in a regular procedure call. That is,
procedures-with-optional-keyword-arguments is a subtype of procedure.


A procedure with at least one *required* keyword argument cannot be
used in a call without keyword arguments, of course. If someone adds a
required keyword argument to a public procedure, then obviously it can
no longer be used as it was before. A subtle design point is whether
`procedure?' should return #t for such things; current opinion seems to
be for #t --- it's a procedure that happens to immediately raise a
missing-keyword exception. (The prototype and documentation reflect
that choice.)
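
A small sketch of that choice, again assuming present-day Racket
behavior rather than the prototype; `area', `is-proc', and
`plain-call' are hypothetical names.

```racket
#lang racket
;; A procedure whose keyword arguments are both required.
(define (area #:width w #:height h)
  (* w h))

;; It still counts as a procedure...
(define is-proc (procedure? area))

;; ...but calling it with no keywords, here through `apply', raises an
;; exception instead of returning a value.
(define plain-call
  (with-handlers ([exn:fail:contract? (lambda (e) 'missing-keyword)])
    (apply area '())))

;; Supplying the keywords works, in either order.
(define keyed-call (area #:height 3 #:width 2))
```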


Eli's main concern, if I understand, is that the combination
    <procedure-accepting-kws, arguments-with-kws>
is not a subtype of the combination
    <plain-procedure, arguments>
That is, if you have something like `trace' that takes both a procedure
and arguments externally, then it can't take a keyword-procedure and
keyword-arguments and combine them in the same way.

If that's an accurate summary, then Eli and I disagree on how serious a
problem this is. I note that `<method, arguments>' combinations have
the same problem, and I think there must be other similar kinds of
higher-order data. In many cases, I think it will be perfectly
reasonable for a library to support only plain procedures; in other
cases, I think improving the library is not a big deal; and in other
cases, notably the contract and object systems, I expect deeper
changes, and that's ok with me.

This point has gotten mixed up with the `procedure?' question. The
issue here is not whether the keyword-requiring thing is a valid
argument to `apply', but whether some intended arguments can be
communicated to the thing. If the arguments must be communicated as
keyword arguments, then they can't get there via `apply'; they'd need
to be sent by `keyword-apply'.
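
For a `trace'-like library, the improvement might look roughly like
the sketch below. It assumes present-day Racket; `logged', `call-log',
and `logged-greet' are hypothetical names, and this is not how any
actual `trace' is implemented.

```racket
#lang racket
(define (greet first #:last [last "Smith"] #:hi [hi "Hello"])
  (string-append hi ", " first " " last))

;; Hypothetical wrapper: record each call's arguments, then forward
;; both the keyword and the by-position arguments to f.
(define call-log '())
(define (logged f)
  (make-keyword-procedure
   (lambda (kws kw-vals . positional)
     (set! call-log (cons (list kws kw-vals positional) call-log))
     (keyword-apply f kws kw-vals positional))))

(define logged-greet (logged greet))
(define result (logged-greet "John" #:last "Doe"))
```

A wrapper built on plain `lambda' and `apply' would forward only the
by-position arguments; carrying the keywords along takes the
`keyword-apply' route.
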
Posted on the users mailing list.