[racket-dev] better x86 performance

From: Eli Barzilay (eli at barzilay.org)
Date: Sun Apr 24 21:05:15 EDT 2011

Two minutes ago, Robby Findler wrote:
> On Sun, Apr 24, 2011 at 7:56 PM, Eli Barzilay <eli at barzilay.org> wrote:
> > An hour and a half ago, Matthew Flatt wrote:
> >>
> >> [...] Later, the `ret' to return from the non-tail call would
> >> confuse the processor and cause stalls, because the `ret' wasn't
> >> matched with its `call'.  It's easy enough to put the return address
> >> in place using `call' when setting up a frame, which exposes the
> >> right nesting to the processor.
> >
> > Does this mean that the code was correct, only it followed a pattern
> > that is not commonly produced by most compilers?
> 
> Yes, except that the issue here is branch (jump) prediction, not so
> much the fact that compilers commonly produce call/ret pairs. That
> is, the processor can do a much better job of keeping things running
> fast when it can predict which instruction is going to come after
> the current one [...]

Oh right -- the main advantage is in prediction.

(I know about branch prediction, I just didn't see the connection to
it here.)

(Also, this is a much more subtle point than Matthew's post made it
sound when he said that `call's are better paired with `ret's -- that
sounded like a "more real" bug.)
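
For the record, here is a rough toy model of why the pairing matters
for prediction.  It is only a sketch: it assumes the processor keeps a
small stack of predicted return targets that is pushed by `call' and
popped by `ret', and that a return address written by an ordinary
store never reaches that stack.  The names `ReturnStack' and `run',
the depth of 16, and the addresses are all made up for illustration;
real hardware differs in detail.

```python
class ReturnStack:
    """Toy return-address predictor (hypothetical simplification)."""

    def __init__(self, depth=16):
        self.depth = depth          # assumed capacity, not a real figure
        self.stack = []
        self.mispredicts = 0

    def call(self, return_addr):
        # A `call' pushes the address of the instruction after it.
        if len(self.stack) == self.depth:
            self.stack.pop(0)       # oldest prediction is lost when full
        self.stack.append(return_addr)

    def ret(self, actual_target):
        # A `ret' pops the predicted target; a mismatch counts as a
        # misprediction (a stall, in the terms used in this thread).
        predicted = self.stack.pop() if self.stack else None
        if predicted != actual_target:
            self.mispredicts += 1


def run(paired, calls=1000):
    """Count mispredicted returns for `calls' non-tail calls."""
    rs = ReturnStack()
    for i in range(calls):
        addr = 0x1000 + i           # made-up return address
        if paired:
            rs.call(addr)           # frame set up with `call'
        # Otherwise the return address is assumed to be stored by an
        # ordinary move, which the predictor never sees.
        rs.ret(addr)
    return rs.mispredicts


if __name__ == "__main__":
    print("paired:  ", run(True))
    print("unpaired:", run(False))
```

In this model the code is equally correct either way, since every
`ret' lands on the right address; only the predictor's guess differs.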

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!



Posted on the dev mailing list.