<div dir="ltr"><div>Re-reading your message I see that you're not actually asserting something different from what I said, but just for some precision here I wish to point out that I wasn't basing my opinion on intuition from the code, but on some microbenchmark timings. (There was a much more substantial difference yesterday because the loop inside any-wrap/c wasn't as cheap as it could have been.)</div> <div> </div><div>I'd be interested to see if your improvements to type->contract improve the situation any! I expect they will make things better again for the Number case, but at the moment, there isn't a big difference.</div> Program 1: #lang racket/base (module m typed/racket/base (: f (Any -> Any)) (define (f x) 1) (provide f)) (require 'm) (time (for ([x (in-range 20000)]) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7))) Timings: cpu time: 142 real time: 142 gc time: 8 cpu time: 144 real time: 144 gc time: 7 cpu time: 144 real time: 143 gc time: 6 cpu time: 142 real time: 142 gc time: 6 cpu time: 142 real time: 142 gc time: 7 cpu time: 146 real time: 146 gc time: 6 Program 2: #lang racket/base (module m typed/racket/base (: f (Any -> Integer)) (define (f x) 1) (provide f)) (require 'm) (time (for ([x (in-range 20000)]) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7))) Timings: cpu time: 139 real time: 138 gc time: 7 cpu time: 145 real time: 144 gc time: 7 cpu time: 140 real time: 140 gc time: 6 cpu time: 151 real time: 150 gc time: 6 cpu time: 139 real time: 138 gc time: 6 cpu time: 139 real time: 139 gc time: 8 On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson <<a href="mailto:eric.n.dobson@gmail.com">eric.n.dobson@gmail.com</a>> wrote: > > any-wrap/c still requires the check for one value, while any (which is > from Number not Any) does not. So I would still guess at Number being > faster, but Robby's changes may make it so that inlining and dead code > elimination can see through everything and turn it into the same code. > > On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler > <<a href="mailto:robby@eecs.northwestern.edu">robby@eecs.northwestern.edu</a>> wrote: > > FWIW, my push speeds up the any-wrap/c implementation a bunch. Those two > > should have similar speeds after you get that, I guess. > > > > Robby > > > > > > On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto <<a href="mailto:neil.toronto@gmail.com">neil.toronto@gmail.com</a>> > > wrote: > >> > >> I tried your branch that implements it and saw about 3.5x speedup for the > >> `magnitude*' test. This is of course without Robby's recent first-order > >> contract changes. > >> > >> (I think it's about 3.5x: I tried with magnitude* : Number -> Any first > >> and got 2400ms on the easy tests. I changed it to magnitude* : Number -> > >> Number and got 690ms. Apparently, for an `Any' return type, an `any-wrap/c' > >> contract is generated instead of nothing. If that's much faster to check > >> than `number?', though, the speedup is even better.) > >> > >> I'd love to see this with Robby's recent changes. Hint? Nudge? Please? > >> > >> I didn't see very much speedup with arrays (about 1.2x). Speed tests on > >> the math library's distribution objects were very interesting, though, and > >> indicate why the arrays might not be much faster. Here's my test program: > >> > >> > >> #lang racket > >> > >> (require math/distributions) > >> > >> (define d (normal-dist 0 1)) > >> > >> (printf "pdf d 0~n") > >> (for ([_ (in-range 5)]) > >> (time (for ([_ (in-range 100000)]) > >> (pdf d 0)))) > >> (newline) > >> > >> (define p (distribution-pdf d)) > >> (printf "p 0~n") > >> (for ([_ (in-range 5)]) > >> (time (for ([_ (in-range 100000)]) > >> (p 0)))) > >> (newline) > >> > >> > >> The two tests are equivalent, as `pdf' just pulls the pdf function out of > >> the distribution struct and applies it. In TR, the tests are exactly the > >> same speed (extremely fast). In untyped Racket, on the main branch, the > >> second test is 16x faster, and on your branch, it's 44x faster. (It's still > >> 10x slower than TR on your branch, so again... I'd love to see your changes > >> and Robby's together. :D) > >> > >> Neil ⊥ > >> > >> > >> On 12/12/2013 12:40 AM, Eric Dobson wrote: > >>> > >>> Removing the return value checking is in the works. It actually is > >>> removing all of the checks that would blame typed code, so higher > >>> order functions/datastructure get improvements too. It is actually > >>> functional the last time I checked, but lacking documentation which is > >>> what is holding up merging with mainline. > >>> > >>> <a href="https://github.com/plt/racket/pull/453">https://github.com/plt/racket/pull/453</a> > >>> > >>> On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler > >>> <<a href="mailto:robby@eecs.northwestern.edu">robby@eecs.northwestern.edu</a>> wrote: > >>>> > >>>> I see that TR's type->contract returns > >>>> > >>>> (-> (flat-named-contract (quote Float) flonum?) (flat-named-contract > >>>> (quote > >>>> Float) flonum?)) > >>>> > >>>> for the type (Float -> Float), but it could return > >>>> > >>>> (-> (flat-named-contract (quote Float) flonum?) any) > >>>> > >>>> which wouldn't do any result value checking (this being different from > >>>> any/c > >>>> as the range of the arrow contract). > >>>> > >>>> Robby > >>>> > >>>> > >>>> On Wed, Dec 11, 2013 at 6:18 PM, Neil Toronto <<a href="mailto:neil.toronto@gmail.com">neil.toronto@gmail.com</a>> > >>>> wrote: > >>>>> > >>>>> > >>>>> On 12/11/2013 02:49 PM, Neil Toronto wrote: > >>>>>> > >>>>>> > >>>>>> On 12/11/2013 01:55 PM, Stephen Bloch wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>> On Dec 11, 2013, at 2:36 PM, Neil Toronto wrote: > >>>>>>>> > >>>>>>>>> numeric primitives implemented in Typed Racket are faster than the > >>>>>>>>> same primitives implemented in C. > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> Whoa! How did that happen? > >>>>>> > >>>>>> > >>>>>> > >>>>>> Whoa! That's not what I meant! O_o > >>>>>> > >>>>>> I said "we might be getting close" to that. I haven't tried porting a > >>>>>> numeric C primitive to TR yet, but I have a hunch that it'll still be > >>>>>> slower. I'll try one now and report what I find. > >>>>>> > >>>>>> Neil ⊥ > >>>>> > >>>>> > >>>>> > >>>>> I can't figure out why `flsinh' is faster to call from untyped Racket > >>>>> than > >>>>> `sinh'. All my tests with a Typed Racket `magnitude' show calls from > >>>>> untyped > >>>>> code are significantly slower, except in the one case that it computes > >>>>> Euclidean distance. That case is only twice as slow. > >>>>> > >>>>> I've attached the benchmark program. The `magnitude*' function is more > >>>>> or > >>>>> less a direct translation of `magnitude' from "number.c" into Typed > >>>>> Racket. > >>>>> Here's a summary of the results I get on my computer, in milliseconds, > >>>>> for 5 > >>>>> million calls from untyped Racket, by data type. > >>>>> > >>>>> > >>>>> Function Flonum Rational Fixnum Integer Float-Complex > >>>>> ------------------------------------------------------------------- > >>>>> magnitude* 385 419 378 414 686 > >>>>> magnitude 59 44 40 40 390 > >>>>> > >>>>> > >>>>> The only one that's close in relative terms is Float-Complex. The > >>>>> others > >>>>> just call `abs'. The decompiled code doesn't show any inlining of > >>>>> `magnitude', so this comparison should be good. > >>>>> > >>>>> I'll bet checking the return value contract (which is unnecessary) is > >>>>> the > >>>>> main slowdown. It has to check for number of values. > >>>>> > >>>>> For comparison, here are the timings for running the benchmarks in TR > >>>>> with > >>>>> #:no-optimize: > >>>>> > >>>>> > >>>>> Function Flonum Rational Fixnum Integer Float-Complex > >>>>> ------------------------------------------------------------------- > >>>>> magnitude* 45 70* 37 102* 318 > >>>>> magnitude 61 45 39 91* 394 > >>>>> > >>>>> * = unexpectedly high > >>>>> > >>>>> > >>>>> Here's what I understand from comparing the numbers: > >>>>> > >>>>> * Except for non-fixnum integers, calling `magnitude' in TR is just > >>>>> as > >>>>> fast as in untyped Racket. I have no idea why it would be slower on big > >>>>> integers. That's just weird. > >>>>> > >>>>> * Calling `abs' in Racket is faster than calling `scheme_abs' in C, > >>>>> except on rationals and big integers. > >>>>> > >>>>> * Operating on flonums in Typed Racket, using generic numeric > >>>>> functions, > >>>>> is faster than doing the same in C. > >>>>> > >>>>> Overall, it looks like the TR code is within the same order of > >>>>> magnitude > >>>>> (pun not intended) as the C code. I would love to try this benchmark > >>>>> with > >>>>> either 1) a `magnitude*' with an `AnyValues' return type; or 2) a > >>>>> contract > >>>>> boundary that doesn't check TR's return types for first-order > >>>>> functions. > >>>>> > >>>>> (I managed to make a `magnitude*' with type Number -> AnyValues, but TR > >>>>> couldn't make a contract for it.) > >>>>> > >>>>> Neil ⊥ > >>>>> > >>>>> > >>>>> _________________________ > >>>>> Racket Developers list: > >>>>> <a href="http://lists.racket-lang.org/dev">http://lists.racket-lang.org/dev</a> > >>>>> > >>>> > >>>> _________________________ > >>>> Racket Developers list: > >>>> <a href="http://lists.racket-lang.org/dev">http://lists.racket-lang.org/dev</a> > >>>> > >> > > </div>