[racket-dev] better x86 performance
Two minutes ago, Robby Findler wrote:
> On Sun, Apr 24, 2011 at 7:56 PM, Eli Barzilay <eli at barzilay.org> wrote:
> > An hour and a half ago, Matthew Flatt wrote:
> >>
> >> [...] Later, the `ret' to return from the non-tail call would
> >> confuse the processor and cause stalls, because the `ret' wasn't
> >> matched with its `call'. It's easy enough to put the return address
> >> in place using `call' when setting up a frame, which exposes the
> >> right nesting to the processor.
> >
> > Does this mean that the code was correct, only it followed a pattern
> > that is not commonly produced by most compilers?
>
> Yes, except that the issue here is branch (jump) prediction, not so
> much the fact that compilers commonly produce call/ret pairs. That
> is, the processor can do a much better job of keeping things running
> fast when it can predict which instruction is going to come after
> the current one [...]
Oh right -- the main advantage is in prediction.
(I know about it, I just didn't see the connection.)
(Also, this is a much more subtle point than Matthew's post made it
sound when he said that `call's are better paired with `ret's -- that
sounded like a "more real" bug.)
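
As a rough illustration of why an unmatched `ret' hurts prediction, here
is a toy model of a return-address predictor. It is only a sketch under
simplifying assumptions: the function name, the "push_jmp" operation, and
the addresses are all made up for the example, and real processors (and
the actual JIT-generated code) are more involved. The predictor keeps its
own stack that only `call' feeds, so a return address put in place by
hand is invisible to it.

```python
def count_ret_mispredictions(trace):
    """Toy return-stack predictor: count `ret`s whose target was not predicted.

    `trace` is a list of (op, address) pairs; the ops are hypothetical:
      "call"     - pushes a return address the predictor can see
      "push_jmp" - return address stored by hand, then a plain jump
      "ret"      - returns to `address`
    """
    predictor_stack = []  # the predictor's guess, fed only by "call"
    mispredicted = 0
    for op, addr in trace:
        if op == "call":
            predictor_stack.append(addr)
        elif op == "push_jmp":
            pass  # the predictor learns nothing about this return address
        elif op == "ret":
            predicted = predictor_stack.pop() if predictor_stack else None
            if predicted != addr:
                mispredicted += 1
    return mispredicted


# Properly nested call/ret pairs: every return target is predicted.
paired = [("call", 0x10), ("call", 0x20), ("ret", 0x20), ("ret", 0x10)]

# Inner frame set up by hand: its `ret' pops the wrong prediction,
# which also leaves nothing for the outer `ret'.
unpaired = [("call", 0x10), ("push_jmp", 0x20), ("ret", 0x20), ("ret", 0x10)]

print(count_ret_mispredictions(paired))
print(count_ret_mispredictions(unpaired))
```

In this model the code is equally correct in both traces; the only
difference is how many returns the predictor guesses wrong, and one
unmatched `ret' can throw off the returns around it as well.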
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://barzilay.org/ Maze is Life!