[plt-dev] "poll": scribble syntax & indentation

From: Eli Barzilay (eli at barzilay.org)
Date: Fri Mar 6 14:36:08 EST 2009

This is not really a poll, but rather a long-winded attempt to see if
anyone has any good ideas for a solution.  (Or me thinking out loud.)
I've been staring at this problem for a while now, and I'm slowly
converging on a solution that is not as good as I'd hope.

The problem is that trying to use the scribble syntax for cases where
indentation matters is difficult right now.  This is most obvious if
you try to use the scribble/text preprocessor language to generate
code.  [It doesn't come up with the usual documentation languages,
because they're mostly insensitive to indentation -- but it might be
an issue in the future there too, if there is some @scheme{...} macro
that operates on the raw text (like drscheme) rather than the current
@scheme[...] which uses syntax source information (like slideshow).]

For example, this:

  #lang scribble/text
  function foo() {
    @list{if (1 < 2)
            something1
          else
            something2
          fi
          }
    return
  }

will not come out right.  What's missing is some function to use
instead of `list' which will somehow grab its initial output column
and use it for all nested newlines.  The scribble reader does put
syntax properties on what it reads which makes it seem like there is
some solution that involves a macro which will use that information,
but that won't work in general because the *source* of the misindented
text can come from a different part of the code -- one that is at an
arbitrarily different position.  An example of this (the
@@@if{text1}{text2}{text3} form uses a curried `if' binding, so each
block of text is passed at a different curried level) is:

  #!/bin/env mzscheme
  #lang scribble/text
  
  @(define (((if . c) . t) . e)
     @list{
       if (@c)
         @t
       else
         @e})
  
  function foo() {
    @list{if (1 < 2)
            something1
          else
            @@@if{2<3}{something2}{something3}
            repeat 3 {
              @@@if{2<3}{something2}{something3}
            }
          fi
          }
    return
  }

In this example the `if' function is used in two different places,
each with a different indentation -- and the definition obviously has
a fixed indentation.

This makes me think that as much as I'd wish this to be done at syntax
time through a macro, I'd have to do something dynamic -- some
`with-indentation' function which knows where in the output stream it
should be printed, and adds this to newlines in its body.  Even worse,
it will need to divert its body to a temporary buffer and then convert
the newlines there, because newlines could just as well be produced by
Scheme expressions instead of being part of the input -- for example,
if this was used above:

            @@@if{2<3}{@"something2.1\nsomething2.2"}{something3}

then you'd probably want that newline to get the right indentation
injected too.  The first reason that I don't like this solution is that
it will be specific to the preprocessor output functionality, which
will need to turn on location tracking and make the location available
for that `with-indentation' function.
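
To make the dynamic approach concrete, here is a minimal sketch of it.
It is written in Python only to keep it self-contained, and the names
`ColumnWriter' and `with_indentation' are hypothetical, not part of
scribble/text.  It assumes the output stream can report its current
column; the body runs against a temporary buffer, and the saved column
is then injected after every newline in the buffered text, wherever
that newline came from.

```python
import io


class ColumnWriter:
    """Hypothetical output stream that remembers its current column."""

    def __init__(self):
        self._buf = io.StringIO()
        self.column = 0

    def write(self, text):
        self._buf.write(text)
        last_nl = text.rfind("\n")
        if last_nl == -1:
            self.column += len(text)
        else:
            # Number of characters written after the last newline.
            self.column = len(text) - last_nl - 1

    def getvalue(self):
        return self._buf.getvalue()


def with_indentation(out, body):
    """Divert body to a temporary buffer, then emit its text with the
    column where the block started injected after every newline."""
    indent = " " * out.column
    tmp = ColumnWriter()  # a ColumnWriter, so nested blocks also work
    body(tmp)
    out.write(tmp.getvalue().replace("\n", "\n" + indent))


out = ColumnWriter()
out.write("function foo() {\n    ")
with_indentation(
    out,
    lambda o: o.write("if (1 < 2)\n  something1\nelse\n  something2\nfi"),
)
out.write("\n    return\n}\n")
result = out.getvalue()
```

Because the indentation is looked up when the block is printed, the
same body text can be emitted at any column, which is what a purely
syntactic solution cannot provide.  Note that this sketch also indents
empty lines, which a real implementation would probably want to avoid.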

But it gets even more complicated.  Consider this example:

  #lang scribble/text
  function foo() {
    var lst = [@list{item1,
                     item2}]
    // @list{comment1
             comment2}
    return
  }

Ideally, this should output:

  function foo() {
    var lst = [item1,
               item2]
    // comment1
    // comment2
    return
  }

which means that there should be two functions: the first `list'
should use `with-indentation' so the items get indented properly, and
the second should use some other `with-prefix' function which uses the
prefix preceding it instead of just the indentation.
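
The difference between the two can be sketched as follows, in Python
and with hypothetical names (`continue_with_indentation' and
`continue_with_prefix' are illustrations, not existing functions).
Both look at the text that precedes the block on the current line: the
first repeats only a run of spaces as wide as that text, the second
repeats the text itself.

```python
def continue_with_indentation(line_so_far, text):
    """Start later lines with spaces as wide as the text before the block."""
    return text.replace("\n", "\n" + " " * len(line_so_far))


def continue_with_prefix(line_so_far, text):
    """Start later lines with a copy of the text before the block."""
    return text.replace("\n", "\n" + line_so_far)


items_lead = "  var lst = ["
items = items_lead + continue_with_indentation(items_lead, "item1,\nitem2") + "]"

comment_lead = "  // "
comment = comment_lead + continue_with_prefix(comment_lead, "comment1\ncomment2")
```

Here `line_so_far' is passed in explicitly; the hard part in practice
is obtaining it, since it depends on what has already been written to
the output.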

This makes me think of an even worse solution -- instead of just
turning on location tracking, the scribble/text output should use a
custom port that keeps track of the contents of the current line
before the current output location.  This is bad because:

1. It makes it an even more scribble/text-specific solution

2. It'll make it slow and complex

3. And most importantly, the beauty of the scribble syntax was in
   getting away from this kind of low-level hacking, and I'd really
   prefer some solution that doesn't mix the output port details with
   the code in this way.
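
For reference, a rough sketch of what such a line-tracking port could
look like, in Python and with hypothetical names (`LineTrackingWriter'
and `write_block' are illustrations, not an existing API).  It assumes
the port wraps a plain output stream and records whatever has been
written since the last newline.

```python
import io


class LineTrackingWriter:
    """Hypothetical port that remembers the text written so far on the
    current output line."""

    def __init__(self, target):
        self._target = target
        self.current_line = ""

    def write(self, text):
        self._target.write(text)
        # rpartition yields ("", "", text) when text has no newline.
        _, newline, tail = text.rpartition("\n")
        self.current_line = tail if newline else self.current_line + text

    def write_block(self, text, keep_prefix=False):
        """Write text, starting each later line with either the current
        line's contents (keep_prefix) or spaces of the same width."""
        lead = self.current_line if keep_prefix else " " * len(self.current_line)
        self.write(text.replace("\n", "\n" + lead))


buf = io.StringIO()
port = LineTrackingWriter(buf)
port.write("function foo() {\n  var lst = [")
port.write_block("item1,\nitem2")
port.write("]\n  // ")
port.write_block("comment1\ncomment2", keep_prefix=True)
port.write("\n  return\n}\n")
output = buf.getvalue()
```

Even in this small form every write has to scan its text for newlines
and the bookkeeping lives in the port rather than in the template,
which is the kind of coupling described in the points above.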

I'd appreciate any comments and ideas anyone might have.


(BTW, yes, I looked at what the Cheetah framework does; it's a
horrible hack, much worse than the above.  See
http://www.cheetahtemplate.org/docs/TODO for details.)

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                  http://www.barzilay.org/                 Maze is Life!


Posted on the dev mailing list.