[plt-dev] new language dialog, iii

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Sat Jan 30 18:39:07 EST 2010

On Sat, Jan 30, 2010 at 5:28 PM, Eli Barzilay <eli at barzilay.org> wrote:
> Quick note -- the "custom" text is probably better if it's bracketed
> or something.  Since when I have
>
>  Language: typed-scheme custom; memory limit: 256 megabytes.
>
> it looks wrong, since the "custom" part is not related to
> typed-scheme.  Maybe instead of brackets, something like:
>
>  Language: typed-scheme; custom settings; memory limit: 256 megabytes.
>
> (Maybe shorten " megabytes" to "MB" too, to avoid getting it
> inconveniently long.)

Yeah, okay.
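
A minimal sketch of the separator-based format suggested above, for illustration only. The function name `language-summary` and its parameters are hypothetical and are not DrScheme's actual code; `#lang racket/base` is assumed here as the present-day name for `scheme/base`.

```racket
#lang racket/base
(require racket/string)

;; Build the "Language: ..." line from independent parts joined by "; ",
;; so that "custom settings" no longer reads as part of the language name.
;; lang            : string, the language as written after "#lang"
;; custom?         : boolean, whether non-default settings are in effect
;; memory-limit-mb : number or #f, the memory limit in megabytes
(define (language-summary lang custom? memory-limit-mb)
  (define parts
    ;; `(filter values ...)` drops the #f entries for absent parts
    (filter values
            (list (format "Language: ~a" lang)
                  (and custom? "custom settings")
                  (and memory-limit-mb
                       (format "memory limit: ~a MB" memory-limit-mb)))))
  (string-append (string-join parts "; ") "."))
```

With these assumptions, (language-summary "typed-scheme" #t 256) gives the second form quoted above, with "megabytes" shortened to "MB".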

> On Jan 30, Robby Findler wrote:
>> On Fri, Jan 29, 2010 at 3:58 PM, Eli Barzilay <eli at barzilay.org> wrote:
>> >> > I'm thinking of something like
>> >> >
>> >> >
>> >> >  #lang some-language #!bleh
>> >> >
>> >> > where the `#!bleh' is part of the `some-language' specification.
>> >>
>> >> But the string is limited to what was consumed, so the regexp won't be
>> >> applied to that part of the string, or so I thought.
>> >
>> > The thing that gets consumed is *anything* that the language reader
>> > decides to read to make a decision about all of its settings.  In the
>> > above case, the code dispatches to the reader in
>> > `some-language/lang/reader' after reading `some-language', then that code
>> > can read *anything* it wants and return the info function.  So the
>> > point where reading reached is the point that marks all the necessary
>> > text to determine the language and its settings.
>>
>> Oh, of course. I see. That's fixed now.
>
> One new problem: you should probably normalize the text with something
> like (regexp-replace* #px"\\s+" text " ").  Technically it's
> incorrect, but this is just human-readable text, and having a newline
> there looks very weird.  Try this:
>
>  #lang reader
>  scheme/lang/reader
>  '...code...
>
> As for your fix -- you did it in a way that is different -- you allow
> any text leading up to the "#lang".  My guess is that you're trying to
> accommodate files with a comment at the top:
>
>  ;; blah blah
>  #lang scheme
>
> It's questionable to allow this (I think it makes sense to specify
> exactly which kind of comments are allowed before a `#lang' line; even
> forbidding them could be fine if there wasn't a problem with script
> "#!/..." magics and "#|...|#" trampolines).  In any case, perhaps it's
> best to have some "skip whitespace and comments" function from
> mzscheme to deal with this situation.  Matthew: is this difficult to
> get?
>
> In any case, there is at least this case which I expect to be not too
> uncommon, and is currently very messed up.  Imagine me writing some
> code with `scheme/base' and temporarily switching to `scheme' to make
> it more convenient.  I can see this ending up with this code at the
> top:
>
>  ;; #lang scheme/base  FIXME: need to make it work with this
>  #lang scheme
>
> So I think that the conservative approximation that I did (skipping
> only whitespace) is the best option until there's some reliable way to
> skip over comments.
>
> And one more issue -- try this code (with the two spaces):
>
>  #lang  scheme
>
> and the text following "Language:" is empty.  (Looks like catching
> exceptions too eagerly?)
>
> (BTW, the escape continuation thing works better than what I tried;
> for some reason I read the guard docs as saying that it *shouldn't*
> escape explicitly...)

Oh, right. I didn't consider that.

I guess I have to implement a Scheme comment parser in drscheme now.

Blecch. I'm going to leave it alone for a while in the hopes that the
specification gets simpler.
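
For reference, a rough sketch of two points from the quoted message: the whitespace normalization for the displayed text, and the difference between skipping only whitespace and allowing any text before "#lang". The function names are hypothetical, the regexps are a simplification that does not parse comments, and `#lang racket/base` is assumed as the present-day name for `scheme/base`.

```racket
#lang racket/base

;; Collapse every run of whitespace (newlines included) into one space.
;; For display only; the result is not a faithful copy of the source text.
(define (normalize-for-display text)
  (regexp-replace* #px"\\s+" text " "))

;; Conservative: only whitespace may precede "#lang ".
;; Returns the index where "#lang" starts, or #f.
(define (lang-start/whitespace-only text)
  (define m (regexp-match-positions #px"^\\s*(#lang )" text))
  (and m (car (cadr m))))

;; Permissive: any text may precede the first "#lang ".
;; Returns the index where that first "#lang" starts, or #f.
(define (lang-start/anywhere text)
  (define m (regexp-match-positions #rx"#lang " text))
  (and m (car (car m))))

;; The problem case from the quoted message: a commented-out "#lang" line
;; followed by the real one.
(define commented-example
  ";; #lang scheme/base  FIXME: need to make it work with this\n#lang scheme\n")
```

On `commented-example`, the permissive version returns an index inside the comment (the commented-out `scheme/base` line), while the whitespace-only version returns #f and so reports nothing rather than the wrong language. Applied to the three-line "#lang reader" example, `normalize-for-display` puts the reader module on the same line as "#lang reader".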

>> I probably should be using a sandbox too, so that when I run this
>> untrusted code I always get a result back.
>>
>> That's probably a useful thing to package up into a library, ie
>> (read-language/sandbox ...) which is guaranteed to terminate without
>> killing anything and perhaps the function it returns also does the
>> calls inside the same sandbox.
>
> No, I don't think that a sandbox is right in this case.  The thing is
> that it might need to consult code anywhere in the filesystem, or do
> whatever it wants to do to come up with the language.  So if I were
> trying to use this in the context of some utility that uses a sandbox
> then I wouldn't use *another* sandbox to do the `#lang' -- instead,
> I'd arrange for that parsing to happen in the existing user sandbox.
> Translating this to drscheme -- I think that it's best if you call
> this function from the user context, so it's subject to the usual
> restrictions that drscheme puts on user code.

I can't call this from the user context (or at least I need to make a
new one). DrScheme should run your program just like running it in
mred-text (or whatever) would. It should also occasionally do extra
evaluation here and there.

> Re packaging -- given the amount of subtleties that need to be
> addressed, it definitely should go in some library.  But I don't
> know exactly where to put it.  Is `syntax/lang-utils' too lame?

syntax/ seems wrong, but I don't have any good ideas.

Robby


Posted on the dev mailing list.