[plt-scheme] R6RS Interoperability
On Fri, Jun 27, 2008 at 09:26:32AM -0400, Matthias Felleisen wrote:
>
> On Jun 27, 2008, at 2:35 AM, Mark Engelberg wrote:
>
> >Since performance is a key goal of gradual typing,
>
> Who says?
>
> ;; ---
>
> In case people are interested, I can try to compile a rather
> extensive literature list on compiling w/ or w/o types for languages
> such as ML and friends.
That might be interesting. More interesting would be a readable summary
of what worked and didn't.
> Just as a teaser; until 12 years ago (or so),
> compilers for polymorphic languages just stripped types away once
> they were done and used Scheme-like back-ends.
Perfectly reasonable if types were seen as a way to achieve correctness
rather than speed. And, if I recall correctly, the main thrust of the
research was into automatic type inference. I've always found the first
motivation statically-typed code to be comprehensibility, and found the
emphasis on extreme type-inference quite puzzling. Code I've seen in
languages that do this I find even more puzzling. No types to be seen
anywhere! No signposts as to meaning!
> Leroy started type-
> directed compilation closely followed by others (CMU etc). But a few
> years later, Leroy gave a rather skeptical talk at Types-in-
> Compilation, questioning whether much of it was worth the effort.
>
> Why is this the case? Imagine a function like that:
>
> (Lambda ([t : Type]) (lambda ([x : t]) x))
>
> What do you know about the arguments that flow into x? Until you know
> t, you know nothing. So how do you compile this? Perhaps you put the
> actual value into a pointer and just uniformly accept pointers. Easy
> compilation, easy run-time system. So then people tried to see
> whether the bug Lambda could be instantiated (cloned) so you have
> better memory usage and access in the little lambda. And I think
> we're stuck right there.
>
> Stalin is a whole-program compiler, not an incremental compiler, not
> a JIT compiler, and not a modular compiler.
Do you have a reference to Stalin?
> Unless your whole program
> analyses run in linear time, compiling 300,000 lines of code just
> isn't realistic. For higher-order functions, reasonable analyses
> aren't linear.
>
> This is not a small field, and this is not something where you can
> expect huge progress over the next year or so. Lots of people have
> tried and lots of man-years are in related projects.
>
> At this moment, Typed Scheme imposes a small run-time overhead (due
> to the contracts between typed and untyped modules) and that's that.
I think Scheme should be implemented on top of Typed Scheme, so to
speak, not the other way around.
Here's my design sketch:
(1) Implement a conventional garbage-collected strongly-typed secure
language (possibly with Lots of Irritating Single Parentheses, if
desired). Make sure it has several features:
-- a data type ANY that can be investigated at run time. Perhaps the
way to do this is by having a type REFERENCE-TO-ANY that points both to
anything and to a type-descriptor.
-- Put the stack on the heap.
Do all the things such compilers usually do, such as efficient storage
for bounded integers, arrays of reals, strings, and the like. This is
a big project, unless you can extend an existing code base. (I'm
currently looking at restoring a 70's era Algol 68 compiler, which I
think might be suitable for such a project).
(2) Use that to interpret/JITcompile something more like Scheme.
The simplest model would seem to be to make your Scheme be like the
underlying language except that (a) everything is of type
REFERENCE-TO-ANY, and (b) you don't mention that or any other type in
the code, and (c) it implicitly separates operations on REFERENCE-TO-ANY
into the possible actual kinds of data referred to, instead of requiring
the programmer to do it explicitly.
(3) Merge the two languages into one.
This si possible, although there are snags with it. The second language
would treat REFERENCE-TO-ANY quite differently from the first, so they
would not be naively compatible. The big consequence is that
declaring types changes semantics. In particular, the type FLOAT is not
the same as a type REFERENCE-TO-ANY which I have deduced happens to be a
FLOAT underneath. They are stored differently, they are represented
differently. But that doesn't mean that you can't have type-testing
operations that extract a FLOAT from a REFERENCE-TO_ANY and bind it to
a name that the typed language recognises as being of type FLOAT rather
than REFERENCE-TO-ANY-which-happens-to-refer-to-FLOAT.
So the typed and untyped sublanguages will be able to relate to one
another, will be compilable with one compiler, and it won't be a major
burden to incrementally specify types. You could even have a type
REFERENCE-TO-ANY-which-happens-to-refer-to-FLOAT if you want.
-- hendrik
>
> -- Matthias
>
>
> _________________________________________________
> For list-related administrative tasks:
> http://list.cs.brown.edu/mailman/listinfo/plt-scheme