[racket-dev] [plt] Push #27909: master branch updated

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Thu Dec 12 13:52:55 EST 2013

Re-reading your message I see that you're not actually asserting something
different from what I said, but just for some precision here I wish to
point out that I wasn't basing my opinion on intuition from the code, but
on some microbenchmark timings. (There was a much more substantial
difference yesterday because the loop inside any-wrap/c wasn't as cheap as
it could have been.)

I'd be interested to see if your improvements to type->contract improve the
situation any! I expect they will make things better again for the Number
case, but at the moment, there isn't a big difference.

Program 1:

#lang racket/base
(module m typed/racket/base
  (: f (Any -> Any))
  (define (f x) 1)
  (provide f))
(require 'm)
(time
 (for ([x (in-range 20000)])
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))

Timings:

cpu time: 142 real time: 142 gc time: 8
cpu time: 144 real time: 144 gc time: 7
cpu time: 144 real time: 143 gc time: 6
cpu time: 142 real time: 142 gc time: 6
cpu time: 142 real time: 142 gc time: 7
cpu time: 146 real time: 146 gc time: 6

Program 2:

#lang racket/base
(module m typed/racket/base
  (: f (Any -> Integer))
  (define (f x) 1)
  (provide f))
(require 'm)
(time
 (for ([x (in-range 20000)])
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)
   (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7)))


Timings:

cpu time: 139 real time: 138 gc time: 7
cpu time: 145 real time: 144 gc time: 7
cpu time: 140 real time: 140 gc time: 6
cpu time: 151 real time: 150 gc time: 6
cpu time: 139 real time: 138 gc time: 6
cpu time: 139 real time: 139 gc time: 8







On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson <eric.n.dobson at gmail.com>
wrote:
>
> any-wrap/c still requires the check for one value, while any (which is
> from Number not Any) does not. So I would still guess at Number being
> faster, but Robby's changes may make it so that inlining and dead code
> elimination can see through everything and turn it into the same code.
>
> On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler
> <robby at eecs.northwestern.edu> wrote:
> > FWIW, my push speeds up the any-wrap/c implementation a bunch. Those two
> > should have similar speeds after you get that, I guess.
> >
> > Robby
> >
> >
> > On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto <neil.toronto at gmail.com>
> > wrote:
> >>
> >> I tried your branch that implements it and saw about 3.5x speedup for
the
> >> `magnitude*' test. This is of course without Robby's recent first-order
> >> contract changes.
> >>
> >> (I think it's about 3.5x: I tried with magnitude* : Number -> Any first
> >> and got 2400ms on the easy tests. I changed it to magnitude* : Number
->
> >> Number and got 690ms. Apparently, for an `Any' return type, an
`any-wrap/c'
> >> contract is generated instead of nothing. If that's much faster to
check
> >> than `number?', though, the speedup is even better.)
> >>
> >> I'd love to see this with Robby's recent changes. Hint? Nudge? Please?
> >>
> >> I didn't see very much speedup with arrays (about 1.2x). Speed tests on
> >> the math library's distribution objects were very interesting, though,
and
> >> indicate why the arrays might not be much faster. Here's my test
program:
> >>
> >>
> >> #lang racket
> >>
> >> (require math/distributions)
> >>
> >> (define d (normal-dist 0 1))
> >>
> >> (printf "pdf d 0~n")
> >> (for ([_  (in-range 5)])
> >>   (time (for ([_  (in-range 100000)])
> >>           (pdf d 0))))
> >> (newline)
> >>
> >> (define p (distribution-pdf d))
> >> (printf "p 0~n")
> >> (for ([_  (in-range 5)])
> >>   (time (for ([_  (in-range 100000)])
> >>           (p 0))))
> >> (newline)
> >>
> >>
> >> The two tests are equivalent, as `pdf' just pulls the pdf function out
of
> >> the distribution struct and applies it. In TR, the tests are exactly
the
> >> same speed (extremely fast). In untyped Racket, on the main branch, the
> >> second test is 16x faster, and on your branch, it's 44x faster. (It's
still
> >> 10x slower than TR on your branch, so again... I'd love to see your
changes
> >> and Robby's together. :D)
> >>
> >> Neil ⊥
> >>
> >>
> >> On 12/12/2013 12:40 AM, Eric Dobson wrote:
> >>>
> >>> Removing the return value checking is in the works. It actually is
> >>> removing all of the checks that would blame typed code, so higher
> >>> order functions/datastructure get improvements too. It is actually
> >>> functional the last time I checked, but lacking documentation which is
> >>> what is holding up merging with mainline.
> >>>
> >>> https://github.com/plt/racket/pull/453
> >>>
> >>> On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler
> >>> <robby at eecs.northwestern.edu> wrote:
> >>>>
> >>>> I see that TR's type->contract returns
> >>>>
> >>>>   (-> (flat-named-contract (quote Float) flonum?)
(flat-named-contract
> >>>> (quote
> >>>> Float) flonum?))
> >>>>
> >>>> for the type (Float -> Float), but it could return
> >>>>
> >>>>   (-> (flat-named-contract (quote Float) flonum?) any)
> >>>>
> >>>> which wouldn't do any result value checking (this being different
from
> >>>> any/c
> >>>> as the range of the arrow contract).
> >>>>
> >>>> Robby
> >>>>
> >>>>
> >>>> On Wed, Dec 11, 2013 at 6:18 PM, Neil Toronto <neil.toronto at gmail.com
>
> >>>> wrote:
> >>>>>
> >>>>>
> >>>>> On 12/11/2013 02:49 PM, Neil Toronto wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 12/11/2013 01:55 PM, Stephen Bloch wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Dec 11, 2013, at 2:36 PM, Neil Toronto wrote:
> >>>>>>>>
> >>>>>>>>> numeric primitives implemented in Typed Racket are faster than
the
> >>>>>>>>> same primitives implemented in C.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Whoa!  How did that happen?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Whoa! That's not what I meant! O_o
> >>>>>>
> >>>>>> I said "we might be getting close" to that. I haven't tried
porting a
> >>>>>> numeric C primitive to TR yet, but I have a hunch that it'll still
be
> >>>>>> slower. I'll try one now and report what I find.
> >>>>>>
> >>>>>> Neil ⊥
> >>>>>
> >>>>>
> >>>>>
> >>>>> I can't figure out why `flsinh' is faster to call from untyped
Racket
> >>>>> than
> >>>>> `sinh'. All my tests with a Typed Racket `magnitude' show calls from
> >>>>> untyped
> >>>>> code are significantly slower, except in the one case that it
computes
> >>>>> Euclidean distance. That case is only twice as slow.
> >>>>>
> >>>>> I've attached the benchmark program. The `magnitude*' function is
more
> >>>>> or
> >>>>> less a direct translation of `magnitude' from "number.c" into Typed
> >>>>> Racket.
> >>>>> Here's a summary of the results I get on my computer, in
milliseconds,
> >>>>> for 5
> >>>>> million calls from untyped Racket, by data type.
> >>>>>
> >>>>>
> >>>>> Function         Flonum  Rational  Fixnum  Integer  Float-Complex
> >>>>> -------------------------------------------------------------------
> >>>>> magnitude*         385      419      378     414         686
> >>>>> magnitude           59       44       40      40         390
> >>>>>
> >>>>>
> >>>>> The only one that's close in relative terms is Float-Complex. The
> >>>>> others
> >>>>> just call `abs'. The decompiled code doesn't show any inlining of
> >>>>> `magnitude', so this comparison should be good.
> >>>>>
> >>>>> I'll bet checking the return value contract (which is unnecessary)
is
> >>>>> the
> >>>>> main slowdown. It has to check for number of values.
> >>>>>
> >>>>> For comparison, here are the timings for running the benchmarks in
TR
> >>>>> with
> >>>>> #:no-optimize:
> >>>>>
> >>>>>
> >>>>> Function         Flonum  Rational  Fixnum  Integer  Float-Complex
> >>>>> -------------------------------------------------------------------
> >>>>> magnitude*          45       70*      37     102*       318
> >>>>> magnitude           61       45       39      91*       394
> >>>>>
> >>>>>                                                * = unexpectedly high
> >>>>>
> >>>>>
> >>>>> Here's what I understand from comparing the numbers:
> >>>>>
> >>>>>   * Except for non-fixnum integers, calling `magnitude' in TR is
just
> >>>>> as
> >>>>> fast as in untyped Racket. I have no idea why it would be slower on
big
> >>>>> integers. That's just weird.
> >>>>>
> >>>>>   * Calling `abs' in Racket is faster than calling `scheme_abs' in
C,
> >>>>> except on rationals and big integers.
> >>>>>
> >>>>>   * Operating on flonums in Typed Racket, using generic numeric
> >>>>> functions,
> >>>>> is faster than doing the same in C.
> >>>>>
> >>>>> Overall, it looks like the TR code is within the same order of
> >>>>> magnitude
> >>>>> (pun not intended) as the C code. I would love to try this benchmark
> >>>>> with
> >>>>> either 1) a `magnitude*' with an `AnyValues' return type; or 2) a
> >>>>> contract
> >>>>> boundary that doesn't check TR's return types for first-order
> >>>>> functions.
> >>>>>
> >>>>> (I managed to make a `magnitude*' with type Number -> AnyValues,
but TR
> >>>>> couldn't make a contract for it.)
> >>>>>
> >>>>> Neil ⊥
> >>>>>
> >>>>>
> >>>>> _________________________
> >>>>>    Racket Developers list:
> >>>>>    http://lists.racket-lang.org/dev
> >>>>>
> >>>>
> >>>> _________________________
> >>>>    Racket Developers list:
> >>>>    http://lists.racket-lang.org/dev
> >>>>
> >>
> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/dev/archive/attachments/20131212/1055d4a6/attachment-0001.html>

Posted on the dev mailing list.