<div dir="ltr">Do any of these functions turn out to have case-> contracts by the time TR gets done with them?<div><br></div><div>Robby</div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">
On Tue, Jan 14, 2014 at 12:22 PM, Neil Toronto <span dir="ltr"><<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
An update on my math-centered tests. The first is the built-in `magnitude' vs. a TR near-transliteration of its C implementation, called from *untyped* Racket. The numbers are in milliseconds, for 5 million calls, by input type:<div class="im">
<br>
<br>
Function Flonum Rational Fixnum Integer Float-Complex<br>
------------------------------<u></u>------------------------------<u></u>-------<br></div>
Pre-static contracts:<div class="im"><br>
magnitude* 385 419 378 414 686<br>
magnitude 59 44 40 40 390<br></div>
Post-static contracts:<br>
magnitude* 175 196 164 195 475<br>
magnitude 68 43 36 40 406<br>
<br>
All but the Float-Complex case just return the absolute value of the argument, so there's not much computation. They're dominated by contract checking, and the speedup is 0.45 to 0.5. Nice. The Float-Complex case does something substantive, and is close to C. That's awesome.<br>
<br>
So for really small functions (i.e. just a test and a call to `abs'), writing them in C is still better. But for anything much larger (that part of `magnitude*' is about 10 lines and 7-13 numeric ops), a TR implementation isn't much slower than C (about 17%).<br>
<br>
--<br>
<br>
1 million iterations of some math library flonum functions, in TR, untyped before Robby's changes, untyped after Robby's 12/12 changes, and untyped after Eric's static contract changes, in milliseconds:<br>
<br>
Function TR Untyped Robby's Robby's+Eric's<br>
------------------------------<u></u>------------------------------<u></u>--<br>
flrational? 5 322 98 37<br>
flsinh 55 343 121 86<br>
fllog1p 47 351 117 80<br>
lg+ 61 384 154 115<br>
flgamma 165 521 262 234<br>
<br>
So calling TR floating-point functions from untyped Racket takes only 0.11 to 0.45 times the amount of time it took only a month ago, with typical cases being about 0.25. That's awesome.<span class="HOEnZb"><font color="#888888"><br>
<br>
Neil ⊥</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
On 01/14/2014 10:27 AM, Eric Dobson wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
The changes to TR contract generation are now in at HEAD.<br>
<br>
If you can find any cases where contracts are still slowing programs<br>
down by a lot I'd like to take a look at them. (Excepting the case of<br>
structs where I know it is still an issue).<br>
<br>
On Thu, Dec 12, 2013 at 10:52 AM, Robby Findler<br>
<<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.northwestern.edu</a>> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Re-reading your message I see that you're not actually asserting something<br>
different from what I said, but just for some precision here I wish to point<br>
out that I wasn't basing my opinion on intuition from the code, but on some<br>
microbenchmark timings. (There was a much more substantial difference<br>
yesterday because the loop inside any-wrap/c wasn't as cheap as it could<br>
have been.)<br>
<br>
I'd be interested to see if your improvements to type->contract improve the<br>
situation any! I expect they will make things better again for the Number<br>
case, but at the moment, there isn't a big difference.<br>
<br>
Program 1:<br>
<br>
<br>
#lang racket/base<br>
(module m typed/racket/base<br>
(: f (Any -> Any))<br>
(define (f x) 1)<br>
(provide f))<br>
(require 'm)<br>
(time<br>
(for ([x (in-range 20000)])<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))<br>
<br>
Timings:<br>
<br>
cpu time: 142 real time: 142 gc time: 8<br>
cpu time: 144 real time: 144 gc time: 7<br>
cpu time: 144 real time: 143 gc time: 6<br>
cpu time: 142 real time: 142 gc time: 6<br>
cpu time: 142 real time: 142 gc time: 7<br>
cpu time: 146 real time: 146 gc time: 6<br>
<br>
Program 2:<br>
<br>
<br>
#lang racket/base<br>
(module m typed/racket/base<br>
(: f (Any -> Integer))<br>
<br>
(define (f x) 1)<br>
(provide f))<br>
(require 'm)<br>
(time<br>
(for ([x (in-range 20000)])<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
(f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))<br>
<br>
<br>
Timings:<br>
<br>
cpu time: 139 real time: 138 gc time: 7<br>
cpu time: 145 real time: 144 gc time: 7<br>
cpu time: 140 real time: 140 gc time: 6<br>
cpu time: 151 real time: 150 gc time: 6<br>
cpu time: 139 real time: 138 gc time: 6<br>
cpu time: 139 real time: 139 gc time: 8<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson <<a href="mailto:eric.n.dobson@gmail.com" target="_blank">eric.n.dobson@gmail.com</a>><br>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
any-wrap/c still requires the check for one value, while any (which is<br>
from Number not Any) does not. So I would still guess at Number being<br>
faster, but Robby's changes may make it so that inlining and dead code<br>
elimination can see through everything and turn it into the same code.<br>
<br>
On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler<br>
<<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.northwestern.edu</a>> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
FWIW, my push speeds up the any-wrap/c implementation a bunch. Those two<br>
should have similar speeds after you get that, I guess.<br>
<br>
Robby<br>
<br>
<br>
On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto <<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a>><br>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
I tried your branch that implements it and saw about 3.5x speedup for<br>
the<br>
`magnitude*' test. This is of course without Robby's recent first-order<br>
contract changes.<br>
<br>
(I think it's about 3.5x: I tried with magnitude* : Number -> Any first<br>
and got 2400ms on the easy tests. I changed it to magnitude* : Number<br>
-><br>
Number and got 690ms. Apparently, for an `Any' return type, an<br>
`any-wrap/c'<br>
contract is generated instead of nothing. If that's much faster to<br>
check<br>
than `number?', though, the speedup is even better.)<br>
<br>
I'd love to see this with Robby's recent changes. Hint? Nudge? Please?<br>
<br>
I didn't see very much speedup with arrays (about 1.2x). Speed tests on<br>
the math library's distribution objects were very interesting, though,<br>
and<br>
indicate why the arrays might not be much faster. Here's my test<br>
program:<br>
<br>
<br>
#lang racket<br>
<br>
(require math/distributions)<br>
<br>
(define d (normal-dist 0 1))<br>
<br>
(printf "pdf d 0~n")<br>
(for ([_ (in-range 5)])<br>
(time (for ([_ (in-range 100000)])<br>
(pdf d 0))))<br>
(newline)<br>
<br>
(define p (distribution-pdf d))<br>
(printf "p 0~n")<br>
(for ([_ (in-range 5)])<br>
(time (for ([_ (in-range 100000)])<br>
(p 0))))<br>
(newline)<br>
<br>
<br>
The two tests are equivalent, as `pdf' just pulls the pdf function out<br>
of<br>
the distribution struct and applies it. In TR, the tests are exactly<br>
the<br>
same speed (extremely fast). In untyped Racket, on the main branch, the<br>
second test is 16x faster, and on your branch, it's 44x faster. (It's<br>
still<br>
10x slower than TR on your branch, so again... I'd love to see your<br>
changes<br>
and Robby's together. :D)<br>
<br>
Neil ⊥<br>
<br>
<br>
On 12/12/2013 12:40 AM, Eric Dobson wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Removing the return value checking is in the works. It actually is<br>
removing all of the checks that would blame typed code, so higher<br>
order functions/datastructure get improvements too. It is actually<br>
functional the last time I checked, but lacking documentation which is<br>
what is holding up merging with mainline.<br>
<br>
<a href="https://github.com/plt/racket/pull/453" target="_blank">https://github.com/plt/racket/<u></u>pull/453</a><br>
<br>
On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler<br>
<<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.northwestern.edu</a>> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
I see that TR's type->contract returns<br>
<br>
(-> (flat-named-contract (quote Float) flonum?)<br>
(flat-named-contract<br>
(quote<br>
Float) flonum?))<br>
<br>
for the type (Float -> Float), but it could return<br>
<br>
(-> (flat-named-contract (quote Float) flonum?) any)<br>
<br>
which wouldn't do any result value checking (this being different<br>
from<br>
any/c<br>
as the range of the arrow contract).<br>
<br>
Robby<br>
<br>
<br>
On Wed, Dec 11, 2013 at 6:18 PM, Neil Toronto<br>
<<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a>><br>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
On 12/11/2013 02:49 PM, Neil Toronto wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
On 12/11/2013 01:55 PM, Stephen Bloch wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<br>
On Dec 11, 2013, at 2:36 PM, Neil Toronto wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
numeric primitives implemented in Typed Racket are faster than<br>
the<br>
same primitives implemented in C.<br>
</blockquote></blockquote>
<br>
<br>
<br>
Whoa! How did that happen?<br>
</blockquote>
<br>
<br>
<br>
Whoa! That's not what I meant! O_o<br>
<br>
I said "we might be getting close" to that. I haven't tried porting<br>
a<br>
numeric C primitive to TR yet, but I have a hunch that it'll still<br>
be<br>
slower. I'll try one now and report what I find.<br>
<br>
Neil ⊥<br>
</blockquote>
<br>
<br>
<br>
I can't figure out why `flsinh' is faster to call from untyped<br>
Racket<br>
than<br>
`sinh'. All my tests with a Typed Racket `magnitude' show calls from<br>
untyped<br>
code are significantly slower, except in the one case that it<br>
computes<br>
Euclidean distance. That case is only twice as slow.<br>
<br>
I've attached the benchmark program. The `magnitude*' function is<br>
more<br>
or<br>
less a direct translation of `magnitude' from "number.c" into Typed<br>
Racket.<br>
Here's a summary of the results I get on my computer, in<br>
milliseconds,<br>
for 5<br>
million calls from untyped Racket, by data type.<br>
<br>
<br>
Function Flonum Rational Fixnum Integer Float-Complex<br>
------------------------------<u></u>------------------------------<u></u>-------<br>
magnitude* 385 419 378 414 686<br>
magnitude 59 44 40 40 390<br>
<br>
<br>
The only one that's close in relative terms is Float-Complex. The<br>
others<br>
just call `abs'. The decompiled code doesn't show any inlining of<br>
`magnitude', so this comparison should be good.<br>
<br>
I'll bet checking the return value contract (which is unnecessary)<br>
is<br>
the<br>
main slowdown. It has to check for number of values.<br>
<br>
For comparison, here are the timings for running the benchmarks in<br>
TR<br>
with<br>
#:no-optimize:<br>
<br>
<br>
Function Flonum Rational Fixnum Integer Float-Complex<br>
------------------------------<u></u>------------------------------<u></u>-------<br>
magnitude* 45 70* 37 102* 318<br>
magnitude 61 45 39 91* 394<br>
<br>
* = unexpectedly high<br>
<br>
<br>
Here's what I understand from comparing the numbers:<br>
<br>
* Except for non-fixnum integers, calling `magnitude' in TR is<br>
just<br>
as<br>
fast as in untyped Racket. I have no idea why it would be slower on<br>
big<br>
integers. That's just weird.<br>
<br>
* Calling `abs' in Racket is faster than calling `scheme_abs' in<br>
C,<br>
except on rationals and big integers.<br>
<br>
* Operating on flonums in Typed Racket, using generic numeric<br>
functions,<br>
is faster than doing the same in C.<br>
<br>
Overall, it looks like the TR code is within the same order of<br>
magnitude<br>
(pun not intended) as the C code. I would love to try this benchmark<br>
with<br>
either 1) a `magnitude*' with an `AnyValues' return type; or 2) a<br>
contract<br>
boundary that doesn't check TR's return types for first-order<br>
functions.<br>
<br>
(I managed to make a `magnitude*' with type Number -> AnyValues, but<br>
TR<br>
couldn't make a contract for it.)<br>
<br>
Neil ⊥<br>
<br>
<br>
_________________________<br>
Racket Developers list:<br>
<a href="http://lists.racket-lang.org/dev" target="_blank">http://lists.racket-lang.org/<u></u>dev</a><br>
<br>
</blockquote>
<br>
_________________________<br>
Racket Developers list:<br>
<a href="http://lists.racket-lang.org/dev" target="_blank">http://lists.racket-lang.org/<u></u>dev</a><br>
<br>
</blockquote></blockquote>
<br>
</blockquote>
<br>
</blockquote></blockquote></blockquote></blockquote>
<br>
</div></div></blockquote></div><br></div>