[plt-scheme] read-delimited-list

From: Eli Barzilay (eli at barzilay.org)
Date: Mon Aug 21 14:03:42 EDT 2006

On Aug 20, Jos Koot wrote:
> Hi Matthias and Eli,
> My earlier attempts to adapt readtables failed because I failed to
> understand the docs. Your examples in this thread are very
> clear. Maybe they deserve to be included in the help desk.

Perhaps.  The information is all there -- the problem is that it is in
reference form rather than tutorial form.  (There have been a lot of
discussions on that issue: the cookbook is one attempt at a solution,
and wikifying the docs is another possibility.)


> I am fascinated by the new world of tools for the definition of
> customized languages with just a tiny bit of programming. Yet I have
> two questions as displayed in the example below. Be assured that
> they are not meant to be interpreted as criticism. I merely wonder
> whether or not it would be possible to tune MzScheme's read-tables
> such as to avoid the pitfalls I fell into.

Keep in mind that using readtables is different from writing a new
syntax with the parser tools -- the main advantage of readtables is
that they allow you to hook into (Mz)Scheme's syntax and extend it in
a local way, but that leads to some restrictions too.

> ;;; module my-reader
> ;;; {var ... expr} ==> (lambda (var ...) expr) ; mark the braces
> ;;; ~ expr ==> (some-syntax expr)

The second should be very easy to implement.  The first could require
a little more work than Paul's problem, because you can't just read
the whole sub-expression and use it as is -- but you solve this by
doing the massaging in the `special-lambda' syntax.  In any case, your
code demonstrates some additional issues that I didn't go into in the
previous post, so hopefully this will help clarify things further
(which also means that it will be verbose).

[re-editing your code for readability]

> (module my-reader mzscheme
>   (provide special-lambda)
>   ;; it would be nice not to be obliged to export this syntax
>   ;; (question 1: is there a way???)
>   (define-syntax special-lambda
>     (syntax-rules ()
>       [(special-lambda (var ... expr)) (lambda (var ...) expr)]))

This is related to the main problem in your code.  When you're using
any syntactic extensions in PLT, it is best to think about the result
as adding yet another dimension to the layers that make up your code.
The usual dimension is macros: every piece of code is associated with
a phase -- the run-time code, the syntax level, the second syntax
level, etc.  Matthew's "You want it when" paper describes the
resulting tower of phases, and the fact that MzScheme tries to keep a
strict separation of phases.

With the addition of reader code, you get another dimension to this
story -- the reader code is above (or below, depending on your POV)
all of these phases.  Because it is in this position, it should be
kept separated from the rest of the code.  Of course, you *could* mix
the levels, and that leads to the usual kind of mess.  For example:

  > (define counter (let ([c 0]) (lambda () (set! c (add1 c)) c)))
  > (define dollar-reader (lambda _ (counter)))
  > (current-readtable
     (make-readtable #f #\$ 'terminating-macro dollar-reader))
  > (list $ $ $)
  (1 2 3)
  > (list $ (counter) $)
  (4 6 5)

I think that there are very few uses for the mess that you see in that
last interaction.  To continue with your code -- the problem is that
you're using a single module for reader functionality, as well as
bindings that this reader should be using.  It is best to keep the two
separate: one module to provide the reader extensions and another
module to give them meaning.  You *could* use the same module for
both, but if you do so you have to keep in mind that this is a
hack...  In your case, the `special-lambda' syntax binding is
irrelevant for the reader code so there is no need for it at
read-time, and OTOH, when you consider the *meaning* of your code
(bindings (both values and syntaxes)) there is no need for the reader
code.
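
As a rough sketch of that separation (the module names `braces-reader'
and `braces-lang' and the `braces-readtable' binding are made up for
illustration, and this has not been tested against any particular
MzScheme version), the reader part and the binding part could be split
like this:

```scheme
;; Reader side only: knows how to turn {...} into a datum and nothing
;; about what `special-lambda' means.  (Hypothetical module name.)
(module braces-reader mzscheme
  (provide braces-readtable)
  (define read-braces
    (case-lambda
      ;; `read' mode: wrap whatever the default reader returns for {...}
      [(ch port)
       (list 'special-lambda (read/recursive port #\{ #f))]
      ;; `read-syntax' mode: same shape, but keeps source locations
      [(ch port src line col pos)
       (list 'special-lambda
             (read-syntax/recursive src port #\{ #f))]))
  (define braces-readtable
    (make-readtable #f #\{ 'terminating-macro read-braces)))

;; Meaning side only: gives `special-lambda' a binding and has no
;; reader code at all.  (Hypothetical module name.)
(module braces-lang mzscheme
  (provide special-lambda)
  (define-syntax special-lambda
    (syntax-rules ()
      [(special-lambda (var ... expr)) (lambda (var ...) expr)])))
```

Code that wants the notation would then require `braces-lang' for the
binding, and separately install `braces-readtable' through the
`current-readtable' parameter.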


>   (define read-braces
>     (case-lambda
>       [(ch port src line col pos)
>        #`(special-lambda #,(read/recursive port #\{ #f))]
>       ;; very nice: unquote-splicing-syntax accepted
>       [(ch port) `(special-lambda ,(read/recursive port #\{ #f))]))

Here you have another problem that is related to this issue, and a
pitfall that would probably confuse several people.  The difference
between the first and second cases is that you use syntax in the
first, and a quoted s-expression in the second.  You may have noticed
that in Matthew's initial reply, he used only a quasiquoted
s-expression for the reader.  This is an important but subtle point
that is a problem in your code.  The thing is that the reader should
return simple syntax, with no lexical information.  In your case, the
use of `quasisyntax' makes the reader return a syntax that already has
lexical context.  Here's a simple example:

  > (module foo mzscheme
      (define a 1)
      (define (read-dollar . _) #'a)
      (current-readtable
       (make-readtable #f #\$ 'terminating-macro read-dollar)))
  > (require foo)
  > $
  stdin::66: compile: access from an uncertified context to unexported
  variable from module: foo in: a

The problem is that when the reader gets to the "$", it gets a syntax
that already has lexical information -- in this case it's a direct
reference to the `a' that is not exported from the module.  (The
resulting syntax is `3d'.)  The result is a certificate error, similar
to manually expanding a macro result and trying to expose a hidden
identifier that way.  Another example that shows this:

  > (module foo mzscheme
      (define a 1) (provide (rename a b))
      (define (read-dollar . _) #'a)
      (current-readtable
       (make-readtable #f #\$ 'terminating-macro read-dollar)))
  > (require foo)
  > $
  1
  > '$
  a
  > (let ([a 123]) $)
  123
  > (let ([b 123]) $)
  1

The rule of thumb is that reader code should *never* use any of the
`syntax' forms.  It is fine, however, to use `datum->syntax-object' --
as long as the first argument is `#f' so the result has no lexical
information.  When you return a non-syntax value, it is converted to
syntax using `datum->syntax-object' and `#f' for the lexical
information, so it is fine to do that too.
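
For contrast with the `#'a' versions above, here is a minimal sketch of
a reader procedure that follows this rule.  The `dollar-readtable' name
is only for illustration, and the code is an untested outline rather
than a definitive implementation:

```scheme
(module foo mzscheme
  (provide dollar-readtable)
  (define (read-dollar ch port . rest)
    ;; `rest' is empty in `read' mode; in `read-syntax' mode it holds
    ;; the source name, line, column and position.
    (if (null? rest)
        ;; a plain datum: converted later with no lexical information
        'a
        ;; explicit conversion; the #f means no lexical context
        (datum->syntax-object #f 'a)))
  (define dollar-readtable
    (make-readtable #f #\$ 'terminating-macro read-dollar)))
```

Since the resulting `a' carries no context from module `foo', it should
be treated like an `a' that was typed at the point where the "$"
appears.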

(This is all described in the "Procedure result" part of section
11.2.9 in the MzScheme manual.)


>   ;; unquote-splicing not accepted (understandable, but not
>   ;; consistent, this is question 2)

I don't know what this is referring to -- I see no splicing in your
code.


>   (define read-tilde
>     (case-lambda
>       [(ch port src line col pos)
>        #`(some-syntax
>           #,(read/recursive port #f (current-readtable)))]

Notes:

* #f and (current-readtable) are defaults, so you could use
  (read/recursive port)

* Suffers from the same problem as above (use s-expressions instead)

* But you should also change `read/recursive' to
  `read-syntax/recursive'.  The difference is that the latter will
  give you syntax objects that contain proper source information.


> [...]
> unquote-splicing: expected argument of type <proper list>; given 
> #<placeholder>

Maybe the code you pasted is different from what you actually used?
In any case, the deal with `read/recursive' and
`read-syntax/recursive' is that they almost always return an opaque
`placeholder' value that you cannot take apart.  These placeholders
are replaced later, when graph notations (#N= and #N#) are resolved.
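
The error message quoted above would be consistent with a reader
clause that splices the result of the recursive read.  This is only a
guess at what the failing code looked like, and the two procedure
names below are invented for the comparison:

```scheme
;; Assumed shape of the failing version: `,@' needs a proper list, but
;; the recursive read may hand back an opaque placeholder instead.
(define (read-braces/splicing ch port)
  `(special-lambda ,@(read/recursive port #\{ #f)))

;; Wrapping version: the result is embedded whole and never inspected,
;; so it does not matter whether it is a placeholder.  Any taking apart
;; is left to the `special-lambda' macro at expansion time.
(define (read-braces/wrapping ch port)
  `(special-lambda ,(read/recursive port #\{ #f)))
```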

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                  http://www.barzilay.org/                 Maze is Life!


Posted on the users mailing list.