[racket-dev] [plt] Push #27909: master branch updated

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Tue Jan 14 12:33:16 EST 2014

Great, thanks!

It occurred to me this morning that there is probably a significant
slowdown for types that turn into case-> instead of just -> (due to the
contract system).

I would like to test this hypothesis, but I'm not sure how to. Do you mind
posting a little program, like the ones below that generates a case-> so we
can play around with it to see?

Robby



On Tue, Jan 14, 2014 at 11:27 AM, Eric Dobson <eric.n.dobson at gmail.com>wrote:

> The changes to TR contract generation are now in at HEAD.
>
> If you can find any cases where contracts are still slowing programs
> down by a lot I'd like to take a look at them. (Excepting the case of
> structs where I know it is still an issue).
>
> On Thu, Dec 12, 2013 at 10:52 AM, Robby Findler
> <robby at eecs.northwestern.edu> wrote:
> > Re-reading your message I see that you're not actually asserting
> something
> > different from what I said, but just for some precision here I wish to
> point
> > out that I wasn't basing my opinion on intuition from the code, but on
> some
> > microbenchmark timings. (There was a much more substantial difference
> > yesterday because the loop inside any-wrap/c wasn't as cheap as it could
> > have been.)
> >
> > I'd be interested to see if your improvements to type->contract improve
> the
> > situation any! I expect they will make things better again for the Number
> > case, but at the moment, there isn't a big difference.
> >
> > Program 1:
> >
> >
> > #lang racket/base
> > (module m typed/racket/base
> >   (: f (Any -> Any))
> >   (define (f x) 1)
> >   (provide f))
> > (require 'm)
> > (time
> >  (for ([x (in-range 20000)])
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))
> >
> > Timings:
> >
> > cpu time: 142 real time: 142 gc time: 8
> > cpu time: 144 real time: 144 gc time: 7
> > cpu time: 144 real time: 143 gc time: 6
> > cpu time: 142 real time: 142 gc time: 6
> > cpu time: 142 real time: 142 gc time: 7
> > cpu time: 146 real time: 146 gc time: 6
> >
> > Program 2:
> >
> >
> > #lang racket/base
> > (module m typed/racket/base
> >   (: f (Any -> Integer))
> >
> >   (define (f x) 1)
> >   (provide f))
> > (require 'm)
> > (time
> >  (for ([x (in-range 20000)])
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))
> >
> >
> > Timings:
> >
> > cpu time: 139 real time: 138 gc time: 7
> > cpu time: 145 real time: 144 gc time: 7
> > cpu time: 140 real time: 140 gc time: 6
> > cpu time: 151 real time: 150 gc time: 6
> > cpu time: 139 real time: 138 gc time: 6
> > cpu time: 139 real time: 139 gc time: 8
> >
> >
> >
> >
> >
> >
> >
> >
> > On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson <eric.n.dobson at gmail.com>
> > wrote:
> >>
> >> any-wrap/c still requires the check for one value, while any (which is
> >> from Number not Any) does not. So I would still guess at Number being
> >> faster, but Robby's changes may make it so that inlining and dead code
> >> elimination can see through everything and turn it into the same code.
> >>
> >> On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler
> >> <robby at eecs.northwestern.edu> wrote:
> >> > FWIW, my push speeds up the any-wrap/c implementation a bunch. Those
> two
> >> > should have similar speeds after you get that, I guess.
> >> >
> >> > Robby
> >> >
> >> >
> >> > On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto <
> neil.toronto at gmail.com>
> >> > wrote:
> >> >>
> >> >> I tried your branch that implements it and saw about 3.5x speedup for
> >> >> the
> >> >> `magnitude*' test. This is of course without Robby's recent
> first-order
> >> >> contract changes.
> >> >>
> >> >> (I think it's about 3.5x: I tried with magnitude* : Number -> Any
> first
> >> >> and got 2400ms on the easy tests. I changed it to magnitude* : Number
> >> >> ->
> >> >> Number and got 690ms. Apparently, for an `Any' return type, an
> >> >> `any-wrap/c'
> >> >> contract is generated instead of nothing. If that's much faster to
> >> >> check
> >> >> than `number?', though, the speedup is even better.)
> >> >>
> >> >> I'd love to see this with Robby's recent changes. Hint? Nudge?
> Please?
> >> >>
> >> >> I didn't see very much speedup with arrays (about 1.2x). Speed tests
> on
> >> >> the math library's distribution objects were very interesting,
> though,
> >> >> and
> >> >> indicate why the arrays might not be much faster. Here's my test
> >> >> program:
> >> >>
> >> >>
> >> >> #lang racket
> >> >>
> >> >> (require math/distributions)
> >> >>
> >> >> (define d (normal-dist 0 1))
> >> >>
> >> >> (printf "pdf d 0~n")
> >> >> (for ([_  (in-range 5)])
> >> >>   (time (for ([_  (in-range 100000)])
> >> >>           (pdf d 0))))
> >> >> (newline)
> >> >>
> >> >> (define p (distribution-pdf d))
> >> >> (printf "p 0~n")
> >> >> (for ([_  (in-range 5)])
> >> >>   (time (for ([_  (in-range 100000)])
> >> >>           (p 0))))
> >> >> (newline)
> >> >>
> >> >>
> >> >> The two tests are equivalent, as `pdf' just pulls the pdf function
> out
> >> >> of
> >> >> the distribution struct and applies it. In TR, the tests are exactly
> >> >> the
> >> >> same speed (extremely fast). In untyped Racket, on the main branch,
> the
> >> >> second test is 16x faster, and on your branch, it's 44x faster. (It's
> >> >> still
> >> >> 10x slower than TR on your branch, so again... I'd love to see your
> >> >> changes
> >> >> and Robby's together. :D)
> >> >>
> >> >> Neil ⊥
> >> >>
> >> >>
> >> >> On 12/12/2013 12:40 AM, Eric Dobson wrote:
> >> >>>
> >> >>> Removing the return value checking is in the works. It actually is
> >> >>> removing all of the checks that would blame typed code, so higher
> >> >>> order functions/datastructure get improvements too. It is actually
> >> >>> functional the last time I checked, but lacking documentation which
> is
> >> >>> what is holding up merging with mainline.
> >> >>>
> >> >>> https://github.com/plt/racket/pull/453
> >> >>>
> >> >>> On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler
> >> >>> <robby at eecs.northwestern.edu> wrote:
> >> >>>>
> >> >>>> I see that TR's type->contract returns
> >> >>>>
> >> >>>>   (-> (flat-named-contract (quote Float) flonum?)
> >> >>>> (flat-named-contract
> >> >>>> (quote
> >> >>>> Float) flonum?))
> >> >>>>
> >> >>>> for the type (Float -> Float), but it could return
> >> >>>>
> >> >>>>   (-> (flat-named-contract (quote Float) flonum?) any)
> >> >>>>
> >> >>>> which wouldn't do any result value checking (this being different
> >> >>>> from
> >> >>>> any/c
> >> >>>> as the range of the arrow contract).
> >> >>>>
> >> >>>> Robby
> >> >>>>
> >> >>>>
> >> >>>> On Wed, Dec 11, 2013 at 6:18 PM, Neil Toronto
> >> >>>> <neil.toronto at gmail.com>
> >> >>>> wrote:
> >> >>>>>
> >> >>>>>
> >> >>>>> On 12/11/2013 02:49 PM, Neil Toronto wrote:
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On 12/11/2013 01:55 PM, Stephen Bloch wrote:
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> On Dec 11, 2013, at 2:36 PM, Neil Toronto wrote:
> >> >>>>>>>>
> >> >>>>>>>>> numeric primitives implemented in Typed Racket are faster than
> >> >>>>>>>>> the
> >> >>>>>>>>> same primitives implemented in C.
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> Whoa!  How did that happen?
> >> >>>>>>
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> Whoa! That's not what I meant! O_o
> >> >>>>>>
> >> >>>>>> I said "we might be getting close" to that. I haven't tried
> porting
> >> >>>>>> a
> >> >>>>>> numeric C primitive to TR yet, but I have a hunch that it'll
> still
> >> >>>>>> be
> >> >>>>>> slower. I'll try one now and report what I find.
> >> >>>>>>
> >> >>>>>> Neil ⊥
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> I can't figure out why `flsinh' is faster to call from untyped
> >> >>>>> Racket
> >> >>>>> than
> >> >>>>> `sinh'. All my tests with a Typed Racket `magnitude' show calls
> from
> >> >>>>> untyped
> >> >>>>> code are significantly slower, except in the one case that it
> >> >>>>> computes
> >> >>>>> Euclidean distance. That case is only twice as slow.
> >> >>>>>
> >> >>>>> I've attached the benchmark program. The `magnitude*' function is
> >> >>>>> more
> >> >>>>> or
> >> >>>>> less a direct translation of `magnitude' from "number.c" into
> Typed
> >> >>>>> Racket.
> >> >>>>> Here's a summary of the results I get on my computer, in
> >> >>>>> milliseconds,
> >> >>>>> for 5
> >> >>>>> million calls from untyped Racket, by data type.
> >> >>>>>
> >> >>>>>
> >> >>>>> Function         Flonum  Rational  Fixnum  Integer  Float-Complex
> >> >>>>>
> -------------------------------------------------------------------
> >> >>>>> magnitude*         385      419      378     414         686
> >> >>>>> magnitude           59       44       40      40         390
> >> >>>>>
> >> >>>>>
> >> >>>>> The only one that's close in relative terms is Float-Complex. The
> >> >>>>> others
> >> >>>>> just call `abs'. The decompiled code doesn't show any inlining of
> >> >>>>> `magnitude', so this comparison should be good.
> >> >>>>>
> >> >>>>> I'll bet checking the return value contract (which is unnecessary)
> >> >>>>> is
> >> >>>>> the
> >> >>>>> main slowdown. It has to check for number of values.
> >> >>>>>
> >> >>>>> For comparison, here are the timings for running the benchmarks in
> >> >>>>> TR
> >> >>>>> with
> >> >>>>> #:no-optimize:
> >> >>>>>
> >> >>>>>
> >> >>>>> Function         Flonum  Rational  Fixnum  Integer  Float-Complex
> >> >>>>>
> -------------------------------------------------------------------
> >> >>>>> magnitude*          45       70*      37     102*       318
> >> >>>>> magnitude           61       45       39      91*       394
> >> >>>>>
> >> >>>>>                                                * = unexpectedly
> high
> >> >>>>>
> >> >>>>>
> >> >>>>> Here's what I understand from comparing the numbers:
> >> >>>>>
> >> >>>>>   * Except for non-fixnum integers, calling `magnitude' in TR is
> >> >>>>> just
> >> >>>>> as
> >> >>>>> fast as in untyped Racket. I have no idea why it would be slower
> on
> >> >>>>> big
> >> >>>>> integers. That's just weird.
> >> >>>>>
> >> >>>>>   * Calling `abs' in Racket is faster than calling `scheme_abs' in
> >> >>>>> C,
> >> >>>>> except on rationals and big integers.
> >> >>>>>
> >> >>>>>   * Operating on flonums in Typed Racket, using generic numeric
> >> >>>>> functions,
> >> >>>>> is faster than doing the same in C.
> >> >>>>>
> >> >>>>> Overall, it looks like the TR code is within the same order of
> >> >>>>> magnitude
> >> >>>>> (pun not intended) as the C code. I would love to try this
> benchmark
> >> >>>>> with
> >> >>>>> either 1) a `magnitude*' with an `AnyValues' return type; or 2) a
> >> >>>>> contract
> >> >>>>> boundary that doesn't check TR's return types for first-order
> >> >>>>> functions.
> >> >>>>>
> >> >>>>> (I managed to make a `magnitude*' with type Number -> AnyValues,
> but
> >> >>>>> TR
> >> >>>>> couldn't make a contract for it.)
> >> >>>>>
> >> >>>>> Neil ⊥
> >> >>>>>
> >> >>>>>
> >> >>>>> _________________________
> >> >>>>>    Racket Developers list:
> >> >>>>>    http://lists.racket-lang.org/dev
> >> >>>>>
> >> >>>>
> >> >>>> _________________________
> >> >>>>    Racket Developers list:
> >> >>>>    http://lists.racket-lang.org/dev
> >> >>>>
> >> >>
> >> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/dev/archive/attachments/20140114/f0255b2b/attachment-0001.html>

Posted on the dev mailing list.