[racket-dev] [plt] Push #27909: master branch updated

From: Eric Dobson (eric.n.dobson at gmail.com)
Date: Tue Jan 14 12:27:32 EST 2014

The changes to TR contract generation are now in at HEAD.

If you can find any cases where contracts are still slowing programs
down by a lot I'd like to take a look at them. (Excepting the case of
structs where I know it is still an issue).

On Thu, Dec 12, 2013 at 10:52 AM, Robby Findler
<robby at eecs.northwestern.edu> wrote:
> Re-reading your message I see that you're not actually asserting something
> different from what I said, but just for some precision here I wish to point
> out that I wasn't basing my opinion on intuition from the code, but on some
> microbenchmark timings. (There was a much more substantial difference
> yesterday because the loop inside any-wrap/c wasn't as cheap as it could
> have been.)
>
> I'd be interested to see if your improvements to type->contract improve the
> situation any! I expect they will make things better again for the Number
> case, but at the moment, there isn't a big difference.
>
> Program 1:
>
>
> #lang racket/base
> (module m typed/racket/base
>   (: f (Any -> Any))
>   (define (f x) 1)
>   (provide f))
> (require 'm)
> (time
>  (for ([x (in-range 20000)])
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))
>
> Timings:
>
> cpu time: 142 real time: 142 gc time: 8
> cpu time: 144 real time: 144 gc time: 7
> cpu time: 144 real time: 143 gc time: 6
> cpu time: 142 real time: 142 gc time: 6
> cpu time: 142 real time: 142 gc time: 7
> cpu time: 146 real time: 146 gc time: 6
>
> Program 2:
>
>
> #lang racket/base
> (module m typed/racket/base
>   (: f (Any -> Integer))
>
>   (define (f x) 1)
>   (provide f))
> (require 'm)
> (time
>  (for ([x (in-range 20000)])
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))
>
>
> Timings:
>
> cpu time: 139 real time: 138 gc time: 7
> cpu time: 145 real time: 144 gc time: 7
> cpu time: 140 real time: 140 gc time: 6
> cpu time: 151 real time: 150 gc time: 6
> cpu time: 139 real time: 138 gc time: 6
> cpu time: 139 real time: 139 gc time: 8
>
>
>
>
>
>
>
>
> On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson <eric.n.dobson at gmail.com>
> wrote:
>>
>> any-wrap/c still requires the check for one value, while any (which is
>> from Number not Any) does not. So I would still guess at Number being
>> faster, but Robby's changes may make it so that inlining and dead code
>> elimination can see through everything and turn it into the same code.
>>
>> On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler
>> <robby at eecs.northwestern.edu> wrote:
>> > FWIW, my push speeds up the any-wrap/c implementation a bunch. Those two
>> > should have similar speeds after you get that, I guess.
>> >
>> > Robby
>> >
>> >
>> > On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto <neil.toronto at gmail.com>
>> > wrote:
>> >>
>> >> I tried your branch that implements it and saw about 3.5x speedup for
>> >> the
>> >> `magnitude*' test. This is of course without Robby's recent first-order
>> >> contract changes.
>> >>
>> >> (I think it's about 3.5x: I tried with magnitude* : Number -> Any first
>> >> and got 2400ms on the easy tests. I changed it to magnitude* : Number
>> >> ->
>> >> Number and got 690ms. Apparently, for an `Any' return type, an
>> >> `any-wrap/c'
>> >> contract is generated instead of nothing. If that's much faster to
>> >> check
>> >> than `number?', though, the speedup is even better.)
>> >>
>> >> I'd love to see this with Robby's recent changes. Hint? Nudge? Please?
>> >>
>> >> I didn't see very much speedup with arrays (about 1.2x). Speed tests on
>> >> the math library's distribution objects were very interesting, though,
>> >> and
>> >> indicate why the arrays might not be much faster. Here's my test
>> >> program:
>> >>
>> >>
>> >> #lang racket
>> >>
>> >> (require math/distributions)
>> >>
>> >> (define d (normal-dist 0 1))
>> >>
>> >> (printf "pdf d 0~n")
>> >> (for ([_  (in-range 5)])
>> >>   (time (for ([_  (in-range 100000)])
>> >>           (pdf d 0))))
>> >> (newline)
>> >>
>> >> (define p (distribution-pdf d))
>> >> (printf "p 0~n")
>> >> (for ([_  (in-range 5)])
>> >>   (time (for ([_  (in-range 100000)])
>> >>           (p 0))))
>> >> (newline)
>> >>
>> >>
>> >> The two tests are equivalent, as `pdf' just pulls the pdf function out
>> >> of
>> >> the distribution struct and applies it. In TR, the tests are exactly
>> >> the
>> >> same speed (extremely fast). In untyped Racket, on the main branch, the
>> >> second test is 16x faster, and on your branch, it's 44x faster. (It's
>> >> still
>> >> 10x slower than TR on your branch, so again... I'd love to see your
>> >> changes
>> >> and Robby's together. :D)
>> >>
>> >> Neil ⊥
>> >>
>> >>
>> >> On 12/12/2013 12:40 AM, Eric Dobson wrote:
>> >>>
>> >>> Removing the return value checking is in the works. It actually is
>> >>> removing all of the checks that would blame typed code, so higher
>> >>> order functions/datastructure get improvements too. It is actually
>> >>> functional the last time I checked, but lacking documentation which is
>> >>> what is holding up merging with mainline.
>> >>>
>> >>> https://github.com/plt/racket/pull/453
>> >>>
>> >>> On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler
>> >>> <robby at eecs.northwestern.edu> wrote:
>> >>>>
>> >>>> I see that TR's type->contract returns
>> >>>>
>> >>>>   (-> (flat-named-contract (quote Float) flonum?)
>> >>>> (flat-named-contract
>> >>>> (quote
>> >>>> Float) flonum?))
>> >>>>
>> >>>> for the type (Float -> Float), but it could return
>> >>>>
>> >>>>   (-> (flat-named-contract (quote Float) flonum?) any)
>> >>>>
>> >>>> which wouldn't do any result value checking (this being different
>> >>>> from
>> >>>> any/c
>> >>>> as the range of the arrow contract).
>> >>>>
>> >>>> Robby
>> >>>>
>> >>>>
>> >>>> On Wed, Dec 11, 2013 at 6:18 PM, Neil Toronto
>> >>>> <neil.toronto at gmail.com>
>> >>>> wrote:
>> >>>>>
>> >>>>>
>> >>>>> On 12/11/2013 02:49 PM, Neil Toronto wrote:
>> >>>>>>
>> >>>>>>
>> >>>>>> On 12/11/2013 01:55 PM, Stephen Bloch wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Dec 11, 2013, at 2:36 PM, Neil Toronto wrote:
>> >>>>>>>>
>> >>>>>>>>> numeric primitives implemented in Typed Racket are faster than
>> >>>>>>>>> the
>> >>>>>>>>> same primitives implemented in C.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Whoa!  How did that happen?
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> Whoa! That's not what I meant! O_o
>> >>>>>>
>> >>>>>> I said "we might be getting close" to that. I haven't tried porting
>> >>>>>> a
>> >>>>>> numeric C primitive to TR yet, but I have a hunch that it'll still
>> >>>>>> be
>> >>>>>> slower. I'll try one now and report what I find.
>> >>>>>>
>> >>>>>> Neil ⊥
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> I can't figure out why `flsinh' is faster to call from untyped
>> >>>>> Racket
>> >>>>> than
>> >>>>> `sinh'. All my tests with a Typed Racket `magnitude' show calls from
>> >>>>> untyped
>> >>>>> code are significantly slower, except in the one case that it
>> >>>>> computes
>> >>>>> Euclidean distance. That case is only twice as slow.
>> >>>>>
>> >>>>> I've attached the benchmark program. The `magnitude*' function is
>> >>>>> more
>> >>>>> or
>> >>>>> less a direct translation of `magnitude' from "number.c" into Typed
>> >>>>> Racket.
>> >>>>> Here's a summary of the results I get on my computer, in
>> >>>>> milliseconds,
>> >>>>> for 5
>> >>>>> million calls from untyped Racket, by data type.
>> >>>>>
>> >>>>>
>> >>>>> Function         Flonum  Rational  Fixnum  Integer  Float-Complex
>> >>>>> -------------------------------------------------------------------
>> >>>>> magnitude*         385      419      378     414         686
>> >>>>> magnitude           59       44       40      40         390
>> >>>>>
>> >>>>>
>> >>>>> The only one that's close in relative terms is Float-Complex. The
>> >>>>> others
>> >>>>> just call `abs'. The decompiled code doesn't show any inlining of
>> >>>>> `magnitude', so this comparison should be good.
>> >>>>>
>> >>>>> I'll bet checking the return value contract (which is unnecessary)
>> >>>>> is
>> >>>>> the
>> >>>>> main slowdown. It has to check for number of values.
>> >>>>>
>> >>>>> For comparison, here are the timings for running the benchmarks in
>> >>>>> TR
>> >>>>> with
>> >>>>> #:no-optimize:
>> >>>>>
>> >>>>>
>> >>>>> Function         Flonum  Rational  Fixnum  Integer  Float-Complex
>> >>>>> -------------------------------------------------------------------
>> >>>>> magnitude*          45       70*      37     102*       318
>> >>>>> magnitude           61       45       39      91*       394
>> >>>>>
>> >>>>>                                                * = unexpectedly high
>> >>>>>
>> >>>>>
>> >>>>> Here's what I understand from comparing the numbers:
>> >>>>>
>> >>>>>   * Except for non-fixnum integers, calling `magnitude' in TR is
>> >>>>> just
>> >>>>> as
>> >>>>> fast as in untyped Racket. I have no idea why it would be slower on
>> >>>>> big
>> >>>>> integers. That's just weird.
>> >>>>>
>> >>>>>   * Calling `abs' in Racket is faster than calling `scheme_abs' in
>> >>>>> C,
>> >>>>> except on rationals and big integers.
>> >>>>>
>> >>>>>   * Operating on flonums in Typed Racket, using generic numeric
>> >>>>> functions,
>> >>>>> is faster than doing the same in C.
>> >>>>>
>> >>>>> Overall, it looks like the TR code is within the same order of
>> >>>>> magnitude
>> >>>>> (pun not intended) as the C code. I would love to try this benchmark
>> >>>>> with
>> >>>>> either 1) a `magnitude*' with an `AnyValues' return type; or 2) a
>> >>>>> contract
>> >>>>> boundary that doesn't check TR's return types for first-order
>> >>>>> functions.
>> >>>>>
>> >>>>> (I managed to make a `magnitude*' with type Number -> AnyValues, but
>> >>>>> TR
>> >>>>> couldn't make a contract for it.)
>> >>>>>
>> >>>>> Neil ⊥
>> >>>>>
>> >>>>>
>> >>>>> _________________________
>> >>>>>    Racket Developers list:
>> >>>>>    http://lists.racket-lang.org/dev
>> >>>>>
>> >>>>
>> >>>> _________________________
>> >>>>    Racket Developers list:
>> >>>>    http://lists.racket-lang.org/dev
>> >>>>
>> >>
>> >


Posted on the dev mailing list.