<div dir="ltr">Do any of these functions turn out to have case-> contracts by the time TR gets done with them?<div> </div><div>Robby</div><div> </div></div><div class="gmail_extra"> <div class="gmail_quote"> On Tue, Jan 14, 2014 at 12:22 PM, Neil Toronto <<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> An update on my math-centered tests. The first is the built-in `magnitude' vs. a TR near-transliteration of its C implementation, called from *untyped* Racket. The numbers are in milliseconds, for 5 million calls, by input type:<div class="im"> Function Flonum Rational Fixnum Integer Float-Complex ------------------------------------------------------------------- </div> Pre-static contracts:<div class="im"> magnitude* 385 419 378 414 686 magnitude 59 44 40 40 390 </div> Post-static contracts: magnitude* 175 196 164 195 475 magnitude 68 43 36 40 406 All but the Float-Complex case just return the absolute value of the argument, so there's not much computation. They're dominated by contract checking, and the speedup is 0.45 to 0.5. Nice. The Float-Complex case does something substantive, and is close to C. That's awesome. So for really small functions (i.e. just a test and a call to `abs'), writing them in C is still better. But for anything much larger (that part of `magnitude*' is about 10 lines and 7-13 numeric ops), a TR implementation isn't much slower than C (about 17%). -- 1 million iterations of some math library flonum functions, in TR, untyped before Robby's changes, untyped after Robby's 12/12 changes, and untyped after Eric's static contract changes, in milliseconds: Function TR Untyped Robby's Robby's+Eric's -------------------------------------------------------------- flrational? 5 322 98 37 flsinh 55 343 121 86 fllog1p 47 351 117 80 lg+ 61 384 154 115 flgamma 165 521 262 234 So calling TR floating-point functions from untyped Racket takes only 0.11 to 0.45 times the amount of time it took only a month ago, with typical cases being about 0.25. That's awesome. Neil ⊥<div class="HOEnZb"><div class="h5"> On 01/14/2014 10:27 AM, Eric Dobson wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> The changes to TR contract generation are now in at HEAD. If you can find any cases where contracts are still slowing programs down by a lot I'd like to take a look at them. (Excepting the case of structs where I know it is still an issue). On Thu, Dec 12, 2013 at 10:52 AM, Robby Findler <<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.northwestern.edu</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Re-reading your message I see that you're not actually asserting something different from what I said, but just for some precision here I wish to point out that I wasn't basing my opinion on intuition from the code, but on some microbenchmark timings. (There was a much more substantial difference yesterday because the loop inside any-wrap/c wasn't as cheap as it could have been.) I'd be interested to see if your improvements to type->contract improve the situation any! I expect they will make things better again for the Number case, but at the moment, there isn't a big difference. Program 1: #lang racket/base (module m typed/racket/base (: f (Any -> Any)) (define (f x) 1) (provide f)) (require 'm) (time (for ([x (in-range 20000)]) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7))) Timings: cpu time: 142 real time: 142 gc time: 8 cpu time: 144 real time: 144 gc time: 7 cpu time: 144 real time: 143 gc time: 6 cpu time: 142 real time: 142 gc time: 6 cpu time: 142 real time: 142 gc time: 7 cpu time: 146 real time: 146 gc time: 6 Program 2: #lang racket/base (module m typed/racket/base (: f (Any -> Integer)) (define (f x) 1) (provide f)) (require 'm) (time (for ([x (in-range 20000)]) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7))) Timings: cpu time: 139 real time: 138 gc time: 7 cpu time: 145 real time: 144 gc time: 7 cpu time: 140 real time: 140 gc time: 6 cpu time: 151 real time: 150 gc time: 6 cpu time: 139 real time: 138 gc time: 6 cpu time: 139 real time: 139 gc time: 8 On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson <<a href="mailto:eric.n.dobson@gmail.com" target="_blank">eric.n.dobson@gmail.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> any-wrap/c still requires the check for one value, while any (which is from Number not Any) does not. So I would still guess at Number being faster, but Robby's changes may make it so that inlining and dead code elimination can see through everything and turn it into the same code. On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler <<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.northwestern.edu</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> FWIW, my push speeds up the any-wrap/c implementation a bunch. Those two should have similar speeds after you get that, I guess. Robby On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto <<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I tried your branch that implements it and saw about 3.5x speedup for the `magnitude*' test. This is of course without Robby's recent first-order contract changes. (I think it's about 3.5x: I tried with magnitude* : Number -> Any first and got 2400ms on the easy tests. I changed it to magnitude* : Number -> Number and got 690ms. Apparently, for an `Any' return type, an `any-wrap/c' contract is generated instead of nothing. If that's much faster to check than `number?', though, the speedup is even better.) I'd love to see this with Robby's recent changes. Hint? Nudge? Please? I didn't see very much speedup with arrays (about 1.2x). Speed tests on the math library's distribution objects were very interesting, though, and indicate why the arrays might not be much faster. Here's my test program: #lang racket (require math/distributions) (define d (normal-dist 0 1)) (printf "pdf d 0~n") (for ([_ (in-range 5)]) (time (for ([_ (in-range 100000)]) (pdf d 0)))) (newline) (define p (distribution-pdf d)) (printf "p 0~n") (for ([_ (in-range 5)]) (time (for ([_ (in-range 100000)]) (p 0)))) (newline) The two tests are equivalent, as `pdf' just pulls the pdf function out of the distribution struct and applies it. In TR, the tests are exactly the same speed (extremely fast). In untyped Racket, on the main branch, the second test is 16x faster, and on your branch, it's 44x faster. (It's still 10x slower than TR on your branch, so again... I'd love to see your changes and Robby's together. :D) Neil ⊥ On 12/12/2013 12:40 AM, Eric Dobson wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Removing the return value checking is in the works. It actually is removing all of the checks that would blame typed code, so higher order functions/datastructure get improvements too. It is actually functional the last time I checked, but lacking documentation which is what is holding up merging with mainline. <a href="https://github.com/plt/racket/pull/453" target="_blank">https://github.com/plt/racket/pull/453</a> On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler <<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.northwestern.edu</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> I see that TR's type->contract returns (-> (flat-named-contract (quote Float) flonum?) (flat-named-contract (quote Float) flonum?)) for the type (Float -> Float), but it could return (-> (flat-named-contract (quote Float) flonum?) any) which wouldn't do any result value checking (this being different from any/c as the range of the arrow contract). Robby On Wed, Dec 11, 2013 at 6:18 PM, Neil Toronto <<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On 12/11/2013 02:49 PM, Neil Toronto wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On 12/11/2013 01:55 PM, Stephen Bloch wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On Dec 11, 2013, at 2:36 PM, Neil Toronto wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> numeric primitives implemented in Typed Racket are faster than the same primitives implemented in C. </blockquote></blockquote> Whoa! How did that happen? </blockquote> Whoa! That's not what I meant! O_o I said "we might be getting close" to that. I haven't tried porting a numeric C primitive to TR yet, but I have a hunch that it'll still be slower. I'll try one now and report what I find. Neil ⊥ </blockquote> I can't figure out why `flsinh' is faster to call from untyped Racket than `sinh'. All my tests with a Typed Racket `magnitude' show calls from untyped code are significantly slower, except in the one case that it computes Euclidean distance. That case is only twice as slow. I've attached the benchmark program. The `magnitude*' function is more or less a direct translation of `magnitude' from "number.c" into Typed Racket. Here's a summary of the results I get on my computer, in milliseconds, for 5 million calls from untyped Racket, by data type. Function Flonum Rational Fixnum Integer Float-Complex ------------------------------------------------------------------- magnitude* 385 419 378 414 686 magnitude 59 44 40 40 390 The only one that's close in relative terms is Float-Complex. The others just call `abs'. The decompiled code doesn't show any inlining of `magnitude', so this comparison should be good. I'll bet checking the return value contract (which is unnecessary) is the main slowdown. It has to check for number of values. For comparison, here are the timings for running the benchmarks in TR with #:no-optimize: Function Flonum Rational Fixnum Integer Float-Complex ------------------------------------------------------------------- magnitude* 45 70* 37 102* 318 magnitude 61 45 39 91* 394 * = unexpectedly high Here's what I understand from comparing the numbers: * Except for non-fixnum integers, calling `magnitude' in TR is just as fast as in untyped Racket. I have no idea why it would be slower on big integers. That's just weird. * Calling `abs' in Racket is faster than calling `scheme_abs' in C, except on rationals and big integers. * Operating on flonums in Typed Racket, using generic numeric functions, is faster than doing the same in C. Overall, it looks like the TR code is within the same order of magnitude (pun not intended) as the C code. I would love to try this benchmark with either 1) a `magnitude*' with an `AnyValues' return type; or 2) a contract boundary that doesn't check TR's return types for first-order functions. (I managed to make a `magnitude*' with type Number -> AnyValues, but TR couldn't make a contract for it.) Neil ⊥ _________________________ Racket Developers list: <a href="http://lists.racket-lang.org/dev" target="_blank">http://lists.racket-lang.org/dev</a> </blockquote> _________________________ Racket Developers list: <a href="http://lists.racket-lang.org/dev" target="_blank">http://lists.racket-lang.org/dev</a> </blockquote></blockquote> </blockquote> </blockquote></blockquote></blockquote></blockquote> </div></div></blockquote></div> </div>