[racket-dev] cross-module function inlining

From: Carl Eastlund (cce at ccs.neu.edu)
Date: Thu Dec 1 09:54:36 EST 2011

On Thu, Dec 1, 2011 at 9:49 AM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> The bytecode compiler now supports cross-module inlining of functions.
> As a result, for example, `empty?' and `cons?' should now perform just
> as well as `null?' and `pair?'.

Excellent!

> To avoid expanding bytecode too much, the compiler is especially
> conservative about which functions it chooses as candidates for
> cross-module inlining. For now, the function body must be very small
> --- roughly, less than 8+N expressions for a function of N arguments.
>
> Based on that size constraint, the compiler would not automatically
> determine that `map', `for-each', `andmap' and `ormap' are good
> candidates for inlining. Those functions have been annotated to
> encourage the compiler to make them candidates for inlining, anyway.
> You can similarly annotate your own functions using the pattern
>
>  (define-values (<id>)
>    (begin
>      'compiler-hint:cross-module-inline
>      <proc-expr>))
>
> Yes, this pattern is a hack; I don't have a better idea for the
> annotation at the moment, but it may change.

This seems like the kind of thing we normally use syntax properties
for.  How about we use that symbol as a property key, and make a macro
that adds it appropriately?  Or are syntax properties not the right
tool here?
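
To make the suggestion concrete, here is a rough sketch of both
options. The macro names `define/inline-hint' and `define/inline-prop'
and the two sample functions are made up for illustration. The first
macro just expands into the quoted-symbol pattern from the
announcement. The second attaches the same symbol as a syntax
property; it assumes the compiler would be taught to look for that
property, which nothing in the announcement says it does today.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Variant 1: expand into the quoted-symbol pattern from the announcement.
;; The name `define/inline-hint' is hypothetical.
(define-syntax (define/inline-hint stx)
  (syntax-case stx ()
    [(_ (id . formals) body0 body ...)
     #'(define-values (id)
         (begin
           'compiler-hint:cross-module-inline
           (lambda formals body0 body ...)))]))

;; Variant 2: attach the symbol as a syntax property on the definition.
;; ASSUMPTION: the compiler would have to be changed to consult this
;; property; as written, this only records the hint on the syntax object.
(define-syntax (define/inline-prop stx)
  (syntax-case stx ()
    [(_ (id . formals) body0 body ...)
     (syntax-property
      #'(define (id . formals) body0 body ...)
      'compiler-hint:cross-module-inline
      #t)]))

;; Sample uses; both define ordinary functions either way.
(define/inline-hint (first-or lst default)
  (if (pair? lst) (car lst) default))

(define/inline-prop (second-or lst default)
  (if (and (pair? lst) (pair? (cdr lst))) (cadr lst) default))
```

Either way the user writes an ordinary-looking definition, and the
encoding of the hint can change later without touching call sites.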

> Given an imported function that is a candidate for inlining, the usual
> heuristics apply at a call site to determine whether the function is
> actually inlined. The heuristics should invariably allow functions like
> `empty?' to be inlined, but `map' may or may not be inlined for a given
> use --- depending, for example, on how much inlining has already
> happened at the call site.
>
>
> I have not yet found any useful programs that benefit immediately from
> this improvement. (Some traditional Scheme benchmarks benefit from
> inlining `map', of course.) The benefits are probably down the road, as
> various little parts of Racket shift to take advantage of the
> improvement.
>
>
> As always, you can use `raco decompile' to see whether a given function
> call was inlined. To check whether the compiler made a particular
> exported function a candidate for inlining, look for
>
>  (define-values (<id>)
>      (begin
>        '%%inline-variant%%
>        <proc-1>
>        <proc-2>))
>
> in decompiled output; the '%%inline-variant%% pattern reports that <id>
> is a candidate for inlining, and <proc-1> is the variant of the
> function that is used for inlining, while <proc-2> is the normal
> variant of the function. (The <proc-1> and <proc-2> code may be the
> same, or <proc-1> may be less optimized in ways that keep its code
> smaller and easier to inline.)
>
>
> The current implementation of cross-module function inlining is just a
> first cut. If you try it and don't get the kind of inlining that you
> want or expect, let me know, and we can see whether improvements are in
> order.
>
> As an example, given the definitions
>
>  (define (f x) <something-big>)
>  (define (g y) (f y))
>
> a call to the `g' function will not actually be inlined, even though `g' is
> considered a candidate for inlining. The inliner doesn't currently know
> how to move the reference to `f' into a different module when inlining
> `g'. This limitation isn't difficult to fix, I think, but it hasn't
> come up in the examples that I looked at, so I haven't tried to fix it.
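
For anyone who wants to try that last case, here is a small sketch of
the setup. The submodule names `lib' and `client' and the body of `f'
are placeholders I made up; the submodules stand in for two separate
module files, and I am assuming that is close enough to the
cross-module situation to be useful. Whether the call to `g' is
actually inlined has to be checked with `raco decompile' as described
above.

```racket
#lang racket/base

;; Stand-in for the module that defines both functions.
(module lib racket/base
  (provide g)
  ;; `f' plays the role of <something-big>; the body is arbitrary filler.
  (define (f x)
    (let loop ([i 0] [acc x])
      (if (= i 10)
          acc
          (loop (add1 i) (+ (* acc 2) (quotient acc 3) i)))))
  ;; `g' is small, but its body refers to the unexported `f'.
  (define (g y) (f y)))

;; Stand-in for a client module that calls `g' across a module boundary.
(module client racket/base
  (require (submod ".." lib))
  (provide result)
  (define result (g 5)))

(require 'client)
result
```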



Posted on the dev mailing list.