<div dir="ltr">I expect that one'll still be horrible, then. :)</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Jan 14, 2014 at 12:37 PM, Neil Toronto <span dir="ltr"><<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
I don't think so. The flonum functions are Flonum -> Flonum, and `magnitude*' is Number -> Number.<br>
<br>
The best candidate from the math library is probably `mean'.<br>
<br>
Neil ⊥<div class="im"><br>
<br>
On 01/14/2014 11:24 AM, Robby Findler wrote:<br>
</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
Do any of these functions turn out to have case-> contracts by the time<br>
TR gets done with them?<br>
<br>
Robby<br>
<br>
<br>
<br>
On Tue, Jan 14, 2014 at 12:22 PM, Neil Toronto <<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a><br></div><div class="im">
<mailto:<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a><u></u>>> wrote:<br>
<br>
    An update on my math-centered tests. The first is the built-in<br>
    `magnitude' vs. a TR near-transliteration of its C implementation,<br>
    called from *untyped* Racket. The numbers are in milliseconds, for 5<br>
    million calls, by input type:<br>
<br>
<br>
    Function         Flonum  Rational  Fixnum  Integer  Float-Complex<br></div>
    ------------------------------<u></u>__----------------------------<u></u>--__-------<div class="im"><br>
    Pre-static contracts:<br>
<br>
    magnitude*         385      419      378     414         686<br>
    magnitude           59       44       40      40         390<br>
    Post-static contracts:<br>
    magnitude*         175      196      164     195         475<br>
    magnitude           68       43       36      40         406<br>
<br>
    All but the Float-Complex case just return the absolute value of the<br>
    argument, so there's not much computation. They're dominated by<br>
    contract checking, and the speedup is 0.45 to 0.5. Nice. The<br>
    Float-Complex case does something substantive, and is close to C.<br>
    That's awesome.<br>
<br>
    So for really small functions (i.e. just a test and a call to<br>
    `abs'), writing them in C is still better. But for anything much<br>
    larger (that part of `magnitude*' is about 10 lines and 7-13 numeric<br>
    ops), a TR implementation isn't much slower than C (about 17%).<br>
<br>
    --<br>
<br>
    1 million iterations of some math library flonum functions, in TR,<br>
    untyped before Robby's changes, untyped after Robby's 12/12 changes,<br>
    and untyped after Eric's static contract changes, in milliseconds:<br>
<br>
    Function         TR     Untyped    Robby's   Robby's+Eric's<br></div>
    ------------------------------<u></u>__----------------------------<u></u>--__--<div class="im"><br>
    flrational?       5       322        98           37<br>
    flsinh           55       343       121           86<br>
    fllog1p          47       351       117           80<br>
    lg+              61       384       154          115<br>
    flgamma         165       521       262          234<br>
<br>
    So calling TR floating-point functions from untyped Racket takes<br>
    only 0.11 to 0.45 times the amount of time it took only a month ago,<br>
    with typical cases being about 0.25. That's awesome.<br>
<br>
    Neil ⊥<br>
<br>
<br>
    On 01/14/2014 10:27 AM, Eric Dobson wrote:<br>
<br>
        The changes to TR contract generation are now in at HEAD.<br>
<br>
        If you can find any cases where contracts are still slowing programs<br>
        down by a lot I'd like to take a look at them. (Excepting the<br>
        case of<br>
        structs where I know it is still an issue).<br>
<br>
        On Thu, Dec 12, 2013 at 10:52 AM, Robby Findler<br>
        <<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.northwestern.edu</a><br></div><div><div class="h5">
        <mailto:<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.<u></u>northwestern.edu</a>>> wrote:<br>
<br>
            Re-reading your message I see that you're not actually<br>
            asserting something<br>
            different from what I said, but just for some precision here<br>
            I wish to point<br>
            out that I wasn't basing my opinion on intuition from the<br>
            code, but on some<br>
            microbenchmark timings. (There was a much more substantial<br>
            difference<br>
            yesterday because the loop inside any-wrap/c wasn't as cheap<br>
            as it could<br>
            have been.)<br>
<br>
            I'd be interested to see if your improvements to<br>
            type->contract improve the<br>
            situation any! I expect they will make things better again<br>
            for the Number<br>
            case, but at the moment, there isn't a big difference.<br>
<br>
            Program 1:<br>
<br>
<br>
            #lang racket/base<br>
            (module m typed/racket/base<br>
                (: f (Any -> Any))<br>
                (define (f x) 1)<br>
                (provide f))<br>
            (require 'm)<br>
            (time<br>
               (for ([x (in-range 20000)])<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))<br>
<br>
            Timings:<br>
<br>
            cpu time: 142 real time: 142 gc time: 8<br>
            cpu time: 144 real time: 144 gc time: 7<br>
            cpu time: 144 real time: 143 gc time: 6<br>
            cpu time: 142 real time: 142 gc time: 6<br>
            cpu time: 142 real time: 142 gc time: 7<br>
            cpu time: 146 real time: 146 gc time: 6<br>
<br>
            Program 2:<br>
<br>
<br>
            #lang racket/base<br>
            (module m typed/racket/base<br>
                (: f (Any -> Integer))<br>
<br>
                (define (f x) 1)<br>
                (provide f))<br>
            (require 'm)<br>
            (time<br>
               (for ([x (in-range 20000)])<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)<br>
                 (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))<br>
<br>
<br>
            Timings:<br>
<br>
            cpu time: 139 real time: 138 gc time: 7<br>
            cpu time: 145 real time: 144 gc time: 7<br>
            cpu time: 140 real time: 140 gc time: 6<br>
            cpu time: 151 real time: 150 gc time: 6<br>
            cpu time: 139 real time: 138 gc time: 6<br>
            cpu time: 139 real time: 139 gc time: 8<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
<br>
            On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson<br></div></div>
            <<a href="mailto:eric.n.dobson@gmail.com" target="_blank">eric.n.dobson@gmail.com</a> <mailto:<a href="mailto:eric.n.dobson@gmail.com" target="_blank">eric.n.dobson@gmail.<u></u>com</a>>><div class="im">
<br>
            wrote:<br>
<br>
<br>
                any-wrap/c still requires the check for one value, while<br>
                any (which is<br>
                from Number not Any) does not. So I would still guess at<br>
                Number being<br>
                faster, but Robby's changes may make it so that inlining<br>
                and dead code<br>
                elimination can see through everything and turn it into<br>
                the same code.<br>
<br>
                On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler<br>
                <<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.northwestern.edu</a><br></div><div class="im">
                <mailto:<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.<u></u>northwestern.edu</a>>> wrote:<br>
<br>
                    FWIW, my push speeds up the any-wrap/c<br>
                    implementation a bunch. Those two<br>
                    should have similar speeds after you get that, I guess.<br>
<br>
                    Robby<br>
<br>
<br>
                    On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto<br></div>
                    <<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a> <mailto:<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a><u></u>>><div>
<div class="h5"><br>
                    wrote:<br>
<br>
<br>
                        I tried your branch that implements it and saw<br>
                        about 3.5x speedup for<br>
                        the<br>
                        `magnitude*' test. This is of course without<br>
                        Robby's recent first-order<br>
                        contract changes.<br>
<br>
                        (I think it's about 3.5x: I tried with<br>
                        magnitude* : Number -> Any first<br>
                        and got 2400ms on the easy tests. I changed it<br>
                        to magnitude* : Number<br>
                        -><br>
                        Number and got 690ms. Apparently, for an `Any'<br>
                        return type, an<br>
                        `any-wrap/c'<br>
                        contract is generated instead of nothing. If<br>
                        that's much faster to<br>
                        check<br>
                        than `number?', though, the speedup is even better.)<br>
<br>
                        I'd love to see this with Robby's recent<br>
                        changes. Hint? Nudge? Please?<br>
<br>
                        I didn't see very much speedup with arrays<br>
                        (about 1.2x). Speed tests on<br>
                        the math library's distribution objects were<br>
                        very interesting, though,<br>
                        and<br>
                        indicate why the arrays might not be much<br>
                        faster. Here's my test<br>
                        program:<br>
<br>
<br>
                        #lang racket<br>
<br>
                        (require math/distributions)<br>
<br>
                        (define d (normal-dist 0 1))<br>
<br>
                        (printf "pdf d 0~n")<br>
                        (for ([_  (in-range 5)])<br>
                            (time (for ([_  (in-range 100000)])<br>
                                    (pdf d 0))))<br>
                        (newline)<br>
<br>
                        (define p (distribution-pdf d))<br>
                        (printf "p 0~n")<br>
                        (for ([_  (in-range 5)])<br>
                            (time (for ([_  (in-range 100000)])<br>
                                    (p 0))))<br>
                        (newline)<br>
<br>
<br>
                        The two tests are equivalent, as `pdf' just<br>
                        pulls the pdf function out<br>
                        of<br>
                        the distribution struct and applies it. In TR,<br>
                        the tests are exactly<br>
                        the<br>
                        same speed (extremely fast). In untyped Racket,<br>
                        on the main branch, the<br>
                        second test is 16x faster, and on your branch,<br>
                        it's 44x faster. (It's<br>
                        still<br>
                        10x slower than TR on your branch, so again...<br>
                        I'd love to see your<br>
                        changes<br>
                        and Robby's together. :D)<br>
<br>
                        Neil ⊥<br>
<br>
<br>
                        On 12/12/2013 12:40 AM, Eric Dobson wrote:<br>
<br>
<br>
                            Removing the return value checking is in the<br>
                            works. It actually is<br>
                            removing all of the checks that would blame<br>
                            typed code, so higher<br>
                            order functions/datastructure get<br>
                            improvements too. It is actually<br>
                            functional the last time I checked, but<br>
                            lacking documentation which is<br>
                            what is holding up merging with mainline.<br>
<br></div></div>
                            <a href="https://github.com/plt/racket/__pull/453" target="_blank">https://github.com/plt/racket/<u></u>__pull/453</a><div class="im"><br>
                            <<a href="https://github.com/plt/racket/pull/453" target="_blank">https://github.com/plt/<u></u>racket/pull/453</a>><br>
<br>
                            On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler<br>
                            <<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.northwestern.edu</a><br></div><div class="im">
                            <mailto:<a href="mailto:robby@eecs.northwestern.edu" target="_blank">robby@eecs.<u></u>northwestern.edu</a>>> wrote:<br>
<br>
<br>
                                I see that TR's type->contract returns<br>
<br>
                                    (-> (flat-named-contract (quote<br>
                                Float) flonum?)<br>
                                (flat-named-contract<br>
                                (quote<br>
                                Float) flonum?))<br>
<br>
                                for the type (Float -> Float), but it<br>
                                could return<br>
<br>
                                    (-> (flat-named-contract (quote<br>
                                Float) flonum?) any)<br>
<br>
                                which wouldn't do any result value<br>
                                checking (this being different<br>
                                from<br>
                                any/c<br>
                                as the range of the arrow contract).<br>
<br>
                                Robby<br>
<br>
<br>
                                On Wed, Dec 11, 2013 at 6:18 PM, Neil<br>
                                Toronto<br>
                                <<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a><br></div>
                                <mailto:<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a><u></u>>><div><div class="h5"><br>
                                wrote:<br>
<br>
<br>
<br>
                                    On 12/11/2013 02:49 PM, Neil Toronto<br>
                                    wrote:<br>
<br>
<br>
<br>
                                        On 12/11/2013 01:55 PM, Stephen<br>
                                        Bloch wrote:<br>
<br>
<br>
<br>
                                                On Dec 11, 2013, at 2:36<br>
                                                PM, Neil Toronto wrote:<br>
<br>
                                                    numeric primitives<br>
                                                    implemented in Typed<br>
                                                    Racket are faster than<br>
                                                    the<br>
                                                    same primitives<br>
                                                    implemented in C.<br>
<br>
<br>
<br>
<br>
                                            Whoa!  How did that happen?<br>
<br>
<br>
<br>
<br>
                                        Whoa! That's not what I meant! O_o<br>
<br>
                                        I said "we might be getting<br>
                                        close" to that. I haven't tried<br>
                                        porting<br>
                                        a<br>
                                        numeric C primitive to TR yet,<br>
                                        but I have a hunch that it'll still<br>
                                        be<br>
                                        slower. I'll try one now and<br>
                                        report what I find.<br>
<br>
                                        Neil ⊥<br>
<br>
<br>
<br>
<br>
                                    I can't figure out why `flsinh' is<br>
                                    faster to call from untyped<br>
                                    Racket<br>
                                    than<br>
                                    `sinh'. All my tests with a Typed<br>
                                    Racket `magnitude' show calls from<br>
                                    untyped<br>
                                    code are significantly slower,<br>
                                    except in the one case that it<br>
                                    computes<br>
                                    Euclidean distance. That case is<br>
                                    only twice as slow.<br>
<br>
                                    I've attached the benchmark program.<br>
                                    The `magnitude*' function is<br>
                                    more<br>
                                    or<br>
                                    less a direct translation of<br>
                                    `magnitude' from "number.c" into Typed<br>
                                    Racket.<br>
                                    Here's a summary of the results I<br>
                                    get on my computer, in<br>
                                    milliseconds,<br>
                                    for 5<br>
                                    million calls from untyped Racket,<br>
                                    by data type.<br>
<br>
<br>
                                    Function         Flonum  Rational<br>
                                      Fixnum  Integer  Float-Complex<br></div></div>
                                    ------------------------------<u></u>__----------------------------<u></u>--__-------<div class="im"><br>
                                    magnitude*         385      419<br>
                                      378     414         686<br>
                                    magnitude           59       44<br>
                                       40      40         390<br>
<br>
<br>
                                    The only one that's close in<br>
                                    relative terms is Float-Complex. The<br>
                                    others<br>
                                    just call `abs'. The decompiled code<br>
                                    doesn't show any inlining of<br>
                                    `magnitude', so this comparison<br>
                                    should be good.<br>
<br>
                                    I'll bet checking the return value<br>
                                    contract (which is unnecessary)<br>
                                    is<br>
                                    the<br>
                                    main slowdown. It has to check for<br>
                                    number of values.<br>
<br>
                                    For comparison, here are the timings<br>
                                    for running the benchmarks in<br>
                                    TR<br>
                                    with<br>
                                    #:no-optimize:<br>
<br>
<br>
                                    Function         Flonum  Rational<br>
                                      Fixnum  Integer  Float-Complex<br></div>
                                    ------------------------------<u></u>__----------------------------<u></u>--__-------<div><div class="h5"><br>
                                    magnitude*          45       70*<br>
                                      37     102*       318<br>
                                    magnitude           61       45<br>
                                       39      91*       394<br>
<br>
<br>
                                                 * = unexpectedly high<br>
<br>
<br>
                                    Here's what I understand from<br>
                                    comparing the numbers:<br>
<br>
                                        * Except for non-fixnum<br>
                                    integers, calling `magnitude' in TR is<br>
                                    just<br>
                                    as<br>
                                    fast as in untyped Racket. I have no<br>
                                    idea why it would be slower on<br>
                                    big<br>
                                    integers. That's just weird.<br>
<br>
                                        * Calling `abs' in Racket is<br>
                                    faster than calling `scheme_abs' in<br>
                                    C,<br>
                                    except on rationals and big integers.<br>
<br>
                                        * Operating on flonums in Typed<br>
                                    Racket, using generic numeric<br>
                                    functions,<br>
                                    is faster than doing the same in C.<br>
<br>
                                    Overall, it looks like the TR code<br>
                                    is within the same order of<br>
                                    magnitude<br>
                                    (pun not intended) as the C code. I<br>
                                    would love to try this benchmark<br>
                                    with<br>
                                    either 1) a `magnitude*' with an<br>
                                    `AnyValues' return type; or 2) a<br>
                                    contract<br>
                                    boundary that doesn't check TR's<br>
                                    return types for first-order<br>
                                    functions.<br>
<br>
                                    (I managed to make a `magnitude*'<br>
                                    with type Number -> AnyValues, but<br>
                                    TR<br>
                                    couldn't make a contract for it.)<br>
<br>
                                    Neil ⊥<br>
<br>
<br>
                                    _________________________<br>
                                         Racket Developers list:<br></div></div>
                                    <a href="http://lists.racket-lang.org/__dev" target="_blank">http://lists.racket-lang.org/_<u></u>_dev</a><div class="im"><br>
                                    <<a href="http://lists.racket-lang.org/dev" target="_blank">http://lists.racket-lang.org/<u></u>dev</a>><br>
<br>
<br>
                                _________________________<br>
                                     Racket Developers list:<br></div>
                                <a href="http://lists.racket-lang.org/__dev" target="_blank">http://lists.racket-lang.org/_<u></u>_dev</a><br>
                                <<a href="http://lists.racket-lang.org/dev" target="_blank">http://lists.racket-lang.org/<u></u>dev</a>><br>
<br>
<br>
<br>
<br>
<br>
</blockquote>
<br>
</blockquote></div><br></div>