[plt-scheme] R6RS Interoperability

Fri Jun 27 10:20:51 EDT 2008

On Fri, Jun 27, 2008 at 09:26:32AM -0400, Matthias Felleisen wrote:
> 
> On Jun 27, 2008, at 2:35 AM, Mark Engelberg wrote:
> 
> >Since performance is a key goal of gradual typing,
> 
> Who says?
> 
> ;; ---
> 
> In case people are interested, I can try to compile a rather  
> extensive literature list on compiling w/ or w/o types for languages  
> such as ML and friends.

That might be interesting.  More interesting would be a readable summary 
of what worked and didn't.

> Just as a teaser; until 12 years ago (or so),  
> compilers for polymorphic languages just stripped types away once  
> they were done and used Scheme-like back-ends.

Perfectly reasonable if types were seen as a way to achieve correctness 
rather than speed.  And, if I recall correctly, the main thrust of the 
research was into automatic type inference.  I've always found the first 
motivation statically-typed code to be comprehensibility, and found the 
emphasis on extreme type-inference quite puzzling.  Code I've seen in 
languages that do this I find even more puzzling.  No types to be seen 
anywhere!  No signposts as to meaning!

> Leroy started type- 
> directed compilation closely followed by others (CMU etc). But a few  
> years later, Leroy gave a rather skeptical talk at Types-in- 
> Compilation, questioning whether much of it was worth the effort.
> 
> Why is this the case? Imagine a function like that:
> 
> 	(Lambda ([t : Type]) (lambda ([x : t]) x))
> 
> What do you know about the arguments that flow into x? Until you know  
> t, you know nothing. So how do you compile this? Perhaps you put the  
> actual value into a pointer and just uniformly accept pointers. Easy  
> compilation, easy run-time system. So then people tried to see  
> whether the bug Lambda could be instantiated (cloned) so you have  
> better memory usage and access in the little lambda.  And I think  
> we're stuck right there.
> 
> Stalin is a whole-program compiler, not an incremental compiler, not  
> a JIT compiler, and not a modular compiler.

Do you have a reference to Stalin?

> Unless your whole program  
> analyses run in linear time, compiling 300,000 lines of code just  
> isn't realistic. For higher-order functions, reasonable analyses  
> aren't linear.
> 
> This is not a small field, and this is not something where you can  
> expect huge progress over the next year or so. Lots of people have  
> tried and lots of man-years are in related projects.
> 
> At this moment, Typed Scheme imposes a small run-time overhead (due  
> to the contracts between typed and untyped modules) and that's that.

I think Scheme should be implemented on top of Typed Scheme, so to 
speak, not the other way around.

Here's my design sketch:

(1) Implement a conventional garbage-collected strongly-typed secure 
language (possibly with Lots of Irritating Single Parentheses, if 
desired).  Make sure it has several features:

  -- a data type ANY that can be investigated at run time.  Perhaps the 
way to do this is by having a type REFERENCE-TO-ANY that points both to 
anything and to a type-descriptor.

  -- Put the stack on the heap.

Do all the things such compilers usually do, such as efficient storage 
for bounded integers, arrays of reals, strings,  and the like.  This is 
a big project, unless you can extend an existing code base.  (I'm 
currently looking at restoring a 70's era Algol 68 compiler, which I 
think might be suitable for such a project).

(2) Use that to interpret/JITcompile something more like Scheme.  
The simplest model would seem to be to make your Scheme be like the 
underlying language except that (a) everything is of type 
REFERENCE-TO-ANY, and (b) you don't mention that or any other type in 
the code, and (c) it implicitly separates operations on REFERENCE-TO-ANY 
into the possible actual kinds of data referred to, instead of requiring 
the programmer to do it explicitly.

(3) Merge the two languages into one.

This si possible, although there are snags with it.  The second language 
would treat REFERENCE-TO-ANY quite differently from the first, so they 
would not be naively compatible.  The big consequence is that 
declaring types changes semantics.  In particular, the type FLOAT is not 
the same as a type REFERENCE-TO-ANY which I have deduced happens to be a 
FLOAT underneath.  They are stored differently, they are represented 
differently.  But that doesn't mean that you can't have type-testing 
operations that extract a FLOAT from a REFERENCE-TO_ANY and bind it to 
a name that the typed language recognises as being of type FLOAT rather 
than REFERENCE-TO-ANY-which-happens-to-refer-to-FLOAT.

So the typed and untyped sublanguages will be able to relate to one 
another, will be compilable with one compiler, and it won't be a major 
burden to incrementally specify types.  You could even have a type
REFERENCE-TO-ANY-which-happens-to-refer-to-FLOAT if you want.

-- hendrik

> 
> -- Matthias
> 
> 
> _________________________________________________
>  For list-related administrative tasks:
>  http://list.cs.brown.edu/mailman/listinfo/plt-scheme