[plt-scheme] Statistics (V301.5 Speed Up)
On Feb 13, Jim Blandy wrote:
> On 2/13/06, Alexander Friedman <alex at inga.mit.edu> wrote:
> > > What does the code look like?
> >
> > Which code in particular? The benchmarks look nice, the compiler not
> > so much.
>
> Sorry --- the generated code. That is, you've got a
> production-quality code generator there, but you're only getting what
> looks like an average 100% speedup over a bytecode interpreter. So
> there's a lot of room for improvement.
Not quite: MzScheme as of 301.? has a simple JIT compiler based on
GNU lightning, so the speedup is relative to that JIT, not to the
bytecode interpreter. The speedup over the interpreter was usually
10-15x.
> It's my guess that llvm is having a hard time teasing out what's
> really necessary computation from the intermediate code you're feeding
> it as input. I'd expect to see things like:
> - a lack of inlining obscuring dataflow and control flow
Common MzScheme primitives are inlined, but many are not. The
front-end itself does some inlining of known local functions.
> - run-time type checks obscuring control flow
Likely. Type checks are separated from the operations, and many are
eliminated with CSE. However, because MzScheme is multi-threaded, the
scope for this is limited.
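To make the CSE point concrete, here is a minimal sketch in Python
(not the actual compiler code). The tuple-based instruction format,
the function name and the `shared` set are all made up for
illustration. It assumes the threading limit means that a check on a
variable another thread could change cannot be reused.

```python
# Hypothetical IR: ("check", type, var) and ("op", name, dst, *args).
# None of this mirrors the real compiler's data structures.

def eliminate_redundant_checks(block, shared=frozenset()):
    """Drop repeated type checks inside one basic block.

    `shared` names variables that another thread might change; a check
    on those is never remembered, so it is never eliminated.
    """
    proven = set()  # (type, var) facts known to hold at this point
    out = []
    for instr in block:
        kind = instr[0]
        if kind == "check":
            _, ty, var = instr
            if (ty, var) in proven:
                continue  # an identical check already passed; skip it
            if var not in shared:
                proven.add((ty, var))
        elif kind == "op":
            dst = instr[2]
            # dst now holds a new value, so old facts about it are stale
            proven = {fact for fact in proven if fact[1] != dst}
        out.append(instr)
    return out


EXAMPLE_BLOCK = [
    ("check", "fixnum", "x"),
    ("check", "fixnum", "y"),
    ("op", "add", "t", "x", "y"),
    ("check", "fixnum", "x"),  # redundant unless x is shared
    ("op", "add", "u", "t", "x"),
]
```

With no shared variables the second check on `x` is dropped. Passing
`shared={"x"}` keeps it, which is the kind of loss described above.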
> - over-generalized environment representations preventing llvm from
> using registers and simplifying frame structure
All mutable bindings are allocated on the heap (as a cons cell per
var currently...). Closures are linked, and always on the heap as
well. Other suggestions are welcome :)
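For readers unfamiliar with the representation, a rough Python
illustration of heap-allocated mutable bindings follows. The `Box`
class stands in for the one-cell-per-variable scheme; its name and the
counter example are invented here, not taken from the compiler.

```python
class Box:
    """One heap cell per mutable variable (stand-in for the cons cell)."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


def make_counter():
    # The mutable binding lives in a heap cell, not in a register or
    # stack slot, so every closure that captures it sees the updates.
    count = Box(0)

    def increment():
        count.value += 1  # each access goes through the heap cell
        return count.value

    def current():
        return count.value

    return increment, current
```

Every read and write of `count` is an indirection through the heap,
which is what keeps such a variable out of registers.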
> - generic arithmetic preventing arithmetic optimizations (not that
> anybody really knows how to fix this)
Yes. I don't see any obvious way to fix this either.
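This is not a fix, only a sketch of why generic arithmetic is awkward
to optimize: every addition has to dispatch on the operand types and
watch for overflow. The 31-bit fixnum range, the function names and
the "fast"/"slow" tags are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical fixnum range; the real one depends on the tagging scheme.
FIXNUM_MIN, FIXNUM_MAX = -(2 ** 30), 2 ** 30 - 1


def is_fixnum(v):
    return type(v) is int and FIXNUM_MIN <= v <= FIXNUM_MAX


def is_number(v):
    return isinstance(v, (int, float, Fraction)) and not isinstance(v, bool)


def generic_add(a, b):
    """Return (path, result), where path records which branch ran."""
    if is_fixnum(a) and is_fixnum(b):
        r = a + b
        if FIXNUM_MIN <= r <= FIXNUM_MAX:
            return ("fast", r)  # a machine add would have sufficed
        return ("slow", r)      # overflow: result needs a bignum
    if is_number(a) and is_number(b):
        return ("slow", a + b)  # flonum, rational or bignum operands
    raise TypeError("generic_add: expects numbers")
```

Because the result of one addition may be a fixnum, a bignum or a
flonum, the optimizer cannot treat a chain of additions as plain
machine arithmetic without first proving the types.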
> But this is all wild speculation. If we had some generated code to
> look over, we could really see what it's spending all that time on.
If you are knowledgeable about compilers (I am not), have time, and
are interested in helping, send me an email off-list.
> Please don't misunderstand what I'm writing here as criticism or a
> put-down --- I'd expect even the most well-founded attempt to generate
> native code to begin life in a state somewhat like this. If I'm
> remembering right, a friend of mine once produced a native code
> generator for Scheme that was *slower* than the bytecode
> interpreter.
MzC often produced code slower than the interpreter for indirect tail
calls because of the C trampoline. It shouldn't be hard to beat an
interpreter otherwise.
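For anyone who has not seen the technique, a trampoline looks roughly
like this. It is a generic Python sketch, not MzC's actual C code; the
function names are invented for the example.

```python
def trampoline(result):
    """Keep calling returned thunks until a non-callable value appears."""
    while callable(result):
        result = result()  # one bounce: return to the driver, call again
    return result


def is_even(n):
    # Instead of calling is_odd directly (which would grow the stack),
    # hand a thunk back to the trampoline.
    return True if n == 0 else (lambda: is_odd(n - 1))


def is_odd(n):
    return False if n == 0 else (lambda: is_even(n - 1))
```

`trampoline(is_even(100000))` runs in constant stack space, but every
tail call pays for a return to the driver loop plus another indirect
call. That per-call overhead is the cost referred to above.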
--
-Alex