[racket-dev] [plt] Push #27909: master branch updated

From: Eric Dobson (eric.n.dobson at gmail.com)
Date: Tue Jan 14 12:46:08 EST 2014

Here is an example program. You need the step from 0 to 2 because TR
is smart enough to turn (case-> (-> R) (A -> R) (A B -> R)) into ->*.

#lang racket/load

(module m typed/racket
  (provide foo)

  (: foo
     (case->
       (-> Symbol)
       (Symbol Symbol -> Symbol)))
  (define (foo [x 'x] [y 'y]) x))

(require 'm)
(require (for-syntax syntax/parse))

(require racket/contract/private/provide)

(define-syntax (contract-on stx)
  (syntax-parse stx
    [(_ id)
     (provide/contract-info-contract-id (syntax-local-value #'id))]))

(displayln
  (contract-on foo))

On Tue, Jan 14, 2014 at 9:33 AM, Robby Findler
<robby at eecs.northwestern.edu> wrote:
> Great, thanks!
>
> It occurred to me this morning that there is probably a significant slowdown
> for types that turn into case-> instead of just -> (due to the contract
> system).
>
> I would like to test this hypothesis, but I'm not sure how to. Do you mind
> posting a little program, like the ones below that generates a case-> so we
> can play around with it to see?
>
> Robby
>
>
>
> On Tue, Jan 14, 2014 at 11:27 AM, Eric Dobson <eric.n.dobson at gmail.com>
> wrote:
>>
>> The changes to TR contract generation are now in at HEAD.
>>
>> If you can find any cases where contracts are still slowing programs
>> down by a lot I'd like to take a look at them. (Excepting the case of
>> structs where I know it is still an issue).
>>
>> On Thu, Dec 12, 2013 at 10:52 AM, Robby Findler
>> <robby at eecs.northwestern.edu> wrote:
>> > Re-reading your message I see that you're not actually asserting
>> > something
>> > different from what I said, but just for some precision here I wish to
>> > point
>> > out that I wasn't basing my opinion on intuition from the code, but on
>> > some
>> > microbenchmark timings. (There was a much more substantial difference
>> > yesterday because the loop inside any-wrap/c wasn't as cheap as it could
>> > have been.)
>> >
>> > I'd be interested to see if your improvements to type->contract improve
>> > the
>> > situation any! I expect they will make things better again for the
>> > Number
>> > case, but at the moment, there isn't a big difference.
>> >
>> > Program 1:
>> >
>> >
>> > #lang racket/base
>> > (module m typed/racket/base
>> >   (: f (Any -> Any))
>> >   (define (f x) 1)
>> >   (provide f))
>> > (require 'm)
>> > (time
>> >  (for ([x (in-range 20000)])
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))
>> >
>> > Timings:
>> >
>> > cpu time: 142 real time: 142 gc time: 8
>> > cpu time: 144 real time: 144 gc time: 7
>> > cpu time: 144 real time: 143 gc time: 6
>> > cpu time: 142 real time: 142 gc time: 6
>> > cpu time: 142 real time: 142 gc time: 7
>> > cpu time: 146 real time: 146 gc time: 6
>> >
>> > Program 2:
>> >
>> >
>> > #lang racket/base
>> > (module m typed/racket/base
>> >   (: f (Any -> Integer))
>> >
>> >   (define (f x) 1)
>> >   (provide f))
>> > (require 'm)
>> > (time
>> >  (for ([x (in-range 20000)])
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
>> >    (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))
>> >
>> >
>> > Timings:
>> >
>> > cpu time: 139 real time: 138 gc time: 7
>> > cpu time: 145 real time: 144 gc time: 7
>> > cpu time: 140 real time: 140 gc time: 6
>> > cpu time: 151 real time: 150 gc time: 6
>> > cpu time: 139 real time: 138 gc time: 6
>> > cpu time: 139 real time: 139 gc time: 8
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson <eric.n.dobson at gmail.com>
>> > wrote:
>> >>
>> >> any-wrap/c still requires the check for one value, while any (which is
>> >> from Number not Any) does not. So I would still guess at Number being
>> >> faster, but Robby's changes may make it so that inlining and dead code
>> >> elimination can see through everything and turn it into the same code.
>> >>
>> >> On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler
>> >> <robby at eecs.northwestern.edu> wrote:
>> >> > FWIW, my push speeds up the any-wrap/c implementation a bunch. Those
>> >> > two
>> >> > should have similar speeds after you get that, I guess.
>> >> >
>> >> > Robby
>> >> >
>> >> >
>> >> > On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto
>> >> > <neil.toronto at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> I tried your branch that implements it and saw about 3.5x speedup
>> >> >> for
>> >> >> the
>> >> >> `magnitude*' test. This is of course without Robby's recent
>> >> >> first-order
>> >> >> contract changes.
>> >> >>
>> >> >> (I think it's about 3.5x: I tried with magnitude* : Number -> Any
>> >> >> first
>> >> >> and got 2400ms on the easy tests. I changed it to magnitude* :
>> >> >> Number
>> >> >> ->
>> >> >> Number and got 690ms. Apparently, for an `Any' return type, an
>> >> >> `any-wrap/c'
>> >> >> contract is generated instead of nothing. If that's much faster to
>> >> >> check
>> >> >> than `number?', though, the speedup is even better.)
>> >> >>
>> >> >> I'd love to see this with Robby's recent changes. Hint? Nudge?
>> >> >> Please?
>> >> >>
>> >> >> I didn't see very much speedup with arrays (about 1.2x). Speed tests
>> >> >> on
>> >> >> the math library's distribution objects were very interesting,
>> >> >> though,
>> >> >> and
>> >> >> indicate why the arrays might not be much faster. Here's my test
>> >> >> program:
>> >> >>
>> >> >>
>> >> >> #lang racket
>> >> >>
>> >> >> (require math/distributions)
>> >> >>
>> >> >> (define d (normal-dist 0 1))
>> >> >>
>> >> >> (printf "pdf d 0~n")
>> >> >> (for ([_  (in-range 5)])
>> >> >>   (time (for ([_  (in-range 100000)])
>> >> >>           (pdf d 0))))
>> >> >> (newline)
>> >> >>
>> >> >> (define p (distribution-pdf d))
>> >> >> (printf "p 0~n")
>> >> >> (for ([_  (in-range 5)])
>> >> >>   (time (for ([_  (in-range 100000)])
>> >> >>           (p 0))))
>> >> >> (newline)
>> >> >>
>> >> >>
>> >> >> The two tests are equivalent, as `pdf' just pulls the pdf function
>> >> >> out
>> >> >> of
>> >> >> the distribution struct and applies it. In TR, the tests are exactly
>> >> >> the
>> >> >> same speed (extremely fast). In untyped Racket, on the main branch,
>> >> >> the
>> >> >> second test is 16x faster, and on your branch, it's 44x faster.
>> >> >> (It's
>> >> >> still
>> >> >> 10x slower than TR on your branch, so again... I'd love to see your
>> >> >> changes
>> >> >> and Robby's together. :D)
>> >> >>
>> >> >> Neil ⊥
>> >> >>
>> >> >>
>> >> >> On 12/12/2013 12:40 AM, Eric Dobson wrote:
>> >> >>>
>> >> >>> Removing the return value checking is in the works. It actually is
>> >> >>> removing all of the checks that would blame typed code, so higher
>> >> >>> order functions/datastructure get improvements too. It is actually
>> >> >>> functional the last time I checked, but lacking documentation which
>> >> >>> is
>> >> >>> what is holding up merging with mainline.
>> >> >>>
>> >> >>> https://github.com/plt/racket/pull/453
>> >> >>>
>> >> >>> On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler
>> >> >>> <robby at eecs.northwestern.edu> wrote:
>> >> >>>>
>> >> >>>> I see that TR's type->contract returns
>> >> >>>>
>> >> >>>>   (-> (flat-named-contract (quote Float) flonum?)
>> >> >>>> (flat-named-contract
>> >> >>>> (quote
>> >> >>>> Float) flonum?))
>> >> >>>>
>> >> >>>> for the type (Float -> Float), but it could return
>> >> >>>>
>> >> >>>>   (-> (flat-named-contract (quote Float) flonum?) any)
>> >> >>>>
>> >> >>>> which wouldn't do any result value checking (this being different
>> >> >>>> from
>> >> >>>> any/c
>> >> >>>> as the range of the arrow contract).
>> >> >>>>
>> >> >>>> Robby
>> >> >>>>
>> >> >>>>
>> >> >>>> On Wed, Dec 11, 2013 at 6:18 PM, Neil Toronto
>> >> >>>> <neil.toronto at gmail.com>
>> >> >>>> wrote:
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> On 12/11/2013 02:49 PM, Neil Toronto wrote:
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> On 12/11/2013 01:55 PM, Stephen Bloch wrote:
>> >> >>>>>>>>
>> >> >>>>>>>>
>> >> >>>>>>>> On Dec 11, 2013, at 2:36 PM, Neil Toronto wrote:
>> >> >>>>>>>>
>> >> >>>>>>>>> numeric primitives implemented in Typed Racket are faster
>> >> >>>>>>>>> than
>> >> >>>>>>>>> the
>> >> >>>>>>>>> same primitives implemented in C.
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>>
>> >> >>>>>>> Whoa!  How did that happen?
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>>
>> >> >>>>>> Whoa! That's not what I meant! O_o
>> >> >>>>>>
>> >> >>>>>> I said "we might be getting close" to that. I haven't tried
>> >> >>>>>> porting
>> >> >>>>>> a
>> >> >>>>>> numeric C primitive to TR yet, but I have a hunch that it'll
>> >> >>>>>> still
>> >> >>>>>> be
>> >> >>>>>> slower. I'll try one now and report what I find.
>> >> >>>>>>
>> >> >>>>>> Neil ⊥
>> >> >>>>>
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> I can't figure out why `flsinh' is faster to call from untyped
>> >> >>>>> Racket
>> >> >>>>> than
>> >> >>>>> `sinh'. All my tests with a Typed Racket `magnitude' show calls
>> >> >>>>> from
>> >> >>>>> untyped
>> >> >>>>> code are significantly slower, except in the one case that it
>> >> >>>>> computes
>> >> >>>>> Euclidean distance. That case is only twice as slow.
>> >> >>>>>
>> >> >>>>> I've attached the benchmark program. The `magnitude*' function is
>> >> >>>>> more
>> >> >>>>> or
>> >> >>>>> less a direct translation of `magnitude' from "number.c" into
>> >> >>>>> Typed
>> >> >>>>> Racket.
>> >> >>>>> Here's a summary of the results I get on my computer, in
>> >> >>>>> milliseconds,
>> >> >>>>> for 5
>> >> >>>>> million calls from untyped Racket, by data type.
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> Function         Flonum  Rational  Fixnum  Integer  Float-Complex
>> >> >>>>>
>> >> >>>>> -------------------------------------------------------------------
>> >> >>>>> magnitude*         385      419      378     414         686
>> >> >>>>> magnitude           59       44       40      40         390
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> The only one that's close in relative terms is Float-Complex. The
>> >> >>>>> others
>> >> >>>>> just call `abs'. The decompiled code doesn't show any inlining of
>> >> >>>>> `magnitude', so this comparison should be good.
>> >> >>>>>
>> >> >>>>> I'll bet checking the return value contract (which is
>> >> >>>>> unnecessary)
>> >> >>>>> is
>> >> >>>>> the
>> >> >>>>> main slowdown. It has to check for number of values.
>> >> >>>>>
>> >> >>>>> For comparison, here are the timings for running the benchmarks
>> >> >>>>> in
>> >> >>>>> TR
>> >> >>>>> with
>> >> >>>>> #:no-optimize:
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> Function         Flonum  Rational  Fixnum  Integer  Float-Complex
>> >> >>>>>
>> >> >>>>> -------------------------------------------------------------------
>> >> >>>>> magnitude*          45       70*      37     102*       318
>> >> >>>>> magnitude           61       45       39      91*       394
>> >> >>>>>
>> >> >>>>>                                                * = unexpectedly
>> >> >>>>> high
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> Here's what I understand from comparing the numbers:
>> >> >>>>>
>> >> >>>>>   * Except for non-fixnum integers, calling `magnitude' in TR is
>> >> >>>>> just
>> >> >>>>> as
>> >> >>>>> fast as in untyped Racket. I have no idea why it would be slower
>> >> >>>>> on
>> >> >>>>> big
>> >> >>>>> integers. That's just weird.
>> >> >>>>>
>> >> >>>>>   * Calling `abs' in Racket is faster than calling `scheme_abs'
>> >> >>>>> in
>> >> >>>>> C,
>> >> >>>>> except on rationals and big integers.
>> >> >>>>>
>> >> >>>>>   * Operating on flonums in Typed Racket, using generic numeric
>> >> >>>>> functions,
>> >> >>>>> is faster than doing the same in C.
>> >> >>>>>
>> >> >>>>> Overall, it looks like the TR code is within the same order of
>> >> >>>>> magnitude
>> >> >>>>> (pun not intended) as the C code. I would love to try this
>> >> >>>>> benchmark
>> >> >>>>> with
>> >> >>>>> either 1) a `magnitude*' with an `AnyValues' return type; or 2) a
>> >> >>>>> contract
>> >> >>>>> boundary that doesn't check TR's return types for first-order
>> >> >>>>> functions.
>> >> >>>>>
>> >> >>>>> (I managed to make a `magnitude*' with type Number -> AnyValues,
>> >> >>>>> but
>> >> >>>>> TR
>> >> >>>>> couldn't make a contract for it.)
>> >> >>>>>
>> >> >>>>> Neil ⊥
>> >> >>>>>
>> >> >>>>>
>> >> >>>>> _________________________
>> >> >>>>>    Racket Developers list:
>> >> >>>>>    http://lists.racket-lang.org/dev
>> >> >>>>>
>> >> >>>>
>> >> >>>> _________________________
>> >> >>>>    Racket Developers list:
>> >> >>>>    http://lists.racket-lang.org/dev
>> >> >>>>
>> >> >>
>> >> >
>
>


Posted on the dev mailing list.