[plt-dev] new language dialog, iii

From: Eli Barzilay (eli at barzilay.org)
Date: Sat Jan 30 18:28:48 EST 2010

Quick note -- the "custom" text is probably better if it's bracketed
or something.  Since when I have

  Language: typed-scheme custom; memory limit: 256 megabytes.

it looks wrong, since the "custom" part is not related to
typed-scheme.  Maybe instead of brackets, something like:

  Language: typed-scheme; custom settings; memory limit: 256 megabytes.

(Maybe shorten " megabytes" to "MB" too, to avoid getting it
inconveniently long.)


On Jan 30, Robby Findler wrote:
> On Fri, Jan 29, 2010 at 3:58 PM, Eli Barzilay <eli at barzilay.org> wrote:
> >> > I'm thinking of something like
> >> >
> >> >
> >> >  #lang some-language #!bleh
> >> >
> >> > where the `#!bleh' is part of the `some-language' specification.
> >>
> >> But the string is limited to what was consumed, so the regexp won't be
> >> applied to that part of the string or so I thought.
> >
> > The thing that gets consumed is *anything* that the language reader
> > decides to read to make a decision about all of its settings.  In the
> > above case, the code dispatches to the reader in
> > `some-language/lang/reader' after reading `some-language', then that code
> > can read *anything* it wants and return the info function.  So the
> > point where reading reached is the point that marks all the necessary
> > text to determine the language and its settings.
> 
> Oh, of course. I see. That's fixed now.

One new problem: you should probably normalize the text with something
like (regexp-replace* #px"\\s+" text " ").  Technically it's
incorrect, but this is just a human-readable string, and having a
newline there looks very weird.  Try this:

  #lang reader
  scheme/lang/reader
  '...code...
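
The normalization suggested above can be sketched roughly as follows.
This is only an illustration: `normalize-language-text' is a
hypothetical helper name, not something in drscheme, and the module
line assumes a present-day Racket (`scheme/base' would be the
equivalent in PLT Scheme).

```racket
#lang racket/base

;; Hypothetical helper (the name is an assumption, not drscheme's API):
;; collapse every run of whitespace, including newlines, into a single
;; space so the consumed `#lang' text fits on one display line.
(define (normalize-language-text text)
  (regexp-replace* #px"\\s+" text " "))

;; The two-line `#lang reader' header from the example above:
(normalize-language-text "#lang reader\nscheme/lang/reader")
;; => "#lang reader scheme/lang/reader"
```

The result is meant only for display; it is no longer the exact text
that the reader consumed.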

As for your fix -- you did it in a way that is different -- you allow
any text leading up to the "#lang".  My guess is that you're trying to
accommodate files with a comment at the top:

  ;; blah blah
  #lang scheme

It's questionable to allow this (I think it makes sense to specify
exactly which kind of comments are allowed before a `#lang' line; even
forbidding them could be fine if there wasn't a problem with script
"#!/..." magics and "#|...|#" trampolines).  In any case, perhaps it's
best to have some "skip whitespace and comments" function from
mzscheme to deal with this situation.  Matthew: is this difficult to
get?

In any case, there is at least this case which I expect to be not too
uncommon, and is currently very messed up.  Imagine me writing some
code with `scheme/base' and temporarily switching to `scheme' to make
it more convenient.  I can see this ending up with this code at the
top:

  ;; #lang scheme/base  FIXME: need to make it work with this
  #lang scheme

So I think that the conservative approximation that I did (skipping
only whitespace) is the best option until there's some reliable way to
skip over comments.
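
A minimal sketch of that conservative check, under the assumption that
the test is done with a regexp on the buffer text:
`starts-with-lang-line?' is a hypothetical name used only for
illustration, and the module line assumes a present-day Racket.

```racket
#lang racket/base

;; Hypothetical helper (an assumption, not the actual drscheme code):
;; accept the text only when nothing but whitespace precedes "#lang ".
;; Comments are deliberately not skipped.
(define (starts-with-lang-line? text)
  (regexp-match? #px"^\\s*#lang " text))

;; Leading whitespace is fine:
(starts-with-lang-line? "\n  #lang scheme\n")
;; => #t

;; A leading comment is rejected, so the commented-out `#lang' line is
;; never mistaken for the real one:
(starts-with-lang-line?
 ";; #lang scheme/base  FIXME: need to make it work with this\n#lang scheme\n")
;; => #f
```

Rejecting the second case loses the language name for that file, but it
avoids reporting the wrong language.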

And one more issue -- try this code (with the two spaces):

  #lang  scheme

and the text following "Language:" is empty.  (Looks like catching
exceptions too eagerly?)

(BTW, the escape continuation thing works better than what I tried;
for some reason I read the guard docs as saying that it *shouldn't*
escape explicitly...)


> I probably should be using a sandbox too, so that when I run this
> untrusted code I always get a result back.
> 
> That's probably a useful thing to package up into a library, ie
> (read-language/sandbox ...) which is guaranteed to terminate without
> killing anything and perhaps the function it returns also does the
> calls inside the same sandbox.

No, I don't think that a sandbox is right in this case.  The thing is
that it might need to consult code anywhere in the filesystem, or do
whatever it wants to do to come up with the language.  So if I were
trying to use this in the context of some utility that uses a sandbox
then I wouldn't use *another* sandbox to do the `#lang' -- instead,
I'd arrange for that parsing to happen in the existing user sandbox.
Translating this to drscheme -- I think that it's best if you call
this function from the user context, so it's subject to the usual
restrictions that drscheme puts on user code.

Re packaging -- given the amount of subtleties that need to be
addressed, it definitely should go in some library.  But I don't
know exactly where to put it.  Is `syntax/lang-utils' too lame?

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!


Posted on the dev mailing list.