[plt-scheme] LLVM

From: hendrik at topoi.pooq.com (hendrik at topoi.pooq.com)
Date: Tue Jun 2 10:48:17 EDT 2009

On Tue, Jun 02, 2009 at 09:19:26AM -0400, Eli Barzilay wrote:
> On Jun  2, Tom Schouten wrote:
> > I was wondering if there is anyone working on a bridge between
> > LLVM and (PLT) Scheme.
> > 
> > More specifically I'm thinking about run-time code generation
> > (i.e. an embedded real-time DSL) and FFI to the generated code.
> > 
> > Also, what happened to the idea of using LLVM for the PLT JIT?
> 
> There were several problems with the project -- one was that LLVM is
> huge, and adding it to PLT is roughly on the same scale as adding GCC.
> After we'd played with it for a while, Matthew did the mzscheme
> jitter using lightning -- which is much more lightweight, but also
> (not surprisingly) leaves mzscheme with the burden of doing more
> optimizations itself.  Having that done, there was little point in
> continuing the fight with LLVM, which had a few other issues at the
> time.  (My guess is that using LLVM would make things faster, given
> the huge effort that went into making it an optimizing compiler.)
> 
> Implementing an LLVM interface, however, should be very easy -- after
> spending some time looking at the various options, our conclusion was
> that it's really just easier to invoke LLVM on whole files (in the
> LLVM language), and use the LLVM JIT to create an executable piece
> of code.  The first instinct was to use the LLVM commands directly,
> but that doesn't really have any advantage over using whole files.

I considered using LLVM to replace the code generator of a legacy 
compiler, one that previously generated IBM 360 assembler with a lot of 
macro-time gotos to handle the issue of having to generate code in a 
different order from what the assembler demanded.  I gave that up, 
partly because, as you say, LLVM is HUGE, but mostly because it too, 
even in its alluring parse tree interface, demanded that some crucial 
things be done in an order inconsistent with the data flow of the 
compiler.  In particular, it demanded that a structure type be 
completely defined before you could build parse trees for code that 
declares variables of that type.  This seems a little too strong a 
requirement at parse-tree-building time, and it killed my 
code-generating algorithms.

I ended up picking C minus minus, which has its own problems (limited 
support and currently a small set of target machine architectures), but 
was small and comprehensible.  To deal with out-of-order generation, I had the 
compiler use a "string-with-insertion-points" data structure, which is 
essentially a mutable tree of text snippets, which is eventually 
treewalked to create an output file.  That's practical now.  It wasn't 
in the legacy era thirty years ago.
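
A minimal sketch of what a "string-with-insertion-points" structure 
might look like, written in Python for brevity.  This is an illustration 
only, not the actual compiler's code: the class name Snippets, its 
methods, and the C-like output text are all hypothetical.  The idea is 
that the generator reserves a slot early in the output, keeps emitting 
later text, and fills the slot once the needed information (for example 
a complete structure type) is finally known.

```python
class Snippets:
    """A mutable tree of text snippets (hypothetical name and API)."""

    def __init__(self):
        # Each part is either a plain string or a nested Snippets node.
        self.parts = []

    def emit(self, text):
        """Append literal text at the end of this node."""
        self.parts.append(text)
        return self

    def insertion_point(self):
        """Reserve a slot at the current position and return it.

        The returned node can be filled in later, after more text
        has already been emitted behind it.
        """
        child = Snippets()
        self.parts.append(child)
        return child

    def render(self):
        """Walk the tree in order and join the snippets into one string."""
        return "".join(
            part if isinstance(part, str) else part.render()
            for part in self.parts
        )


def demo():
    out = Snippets()
    types = out.insertion_point()      # slot for type definitions
    out.emit("struct point p;\n")      # code that uses the type comes first
    # The full type becomes known only now; it still lands ahead of its use.
    types.emit("struct point { int x; int y; };\n")
    return out.render()
```

Calling demo() returns the type definition followed by the variable 
declaration, even though they were generated in the opposite order.  
The tree walk happens once, at the end, when the output file is written.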

I suppose at this point I could use the same data structure to generate 
LLVM files out of order, too.

-- hendrik


Posted on the users mailing list.