[racket] TR performance bug: removing variable ref slows things down substantially?

From: John Clements (clements at brinckerhoff.org)
Date: Sat Sep 15 16:51:46 EDT 2012

On Sep 14, 2012, at 5:14 PM, John Clements wrote:

> 
> On Sep 14, 2012, at 5:04 PM, Vincent St-Amour wrote:
> 
>> I can't reproduce what you describe.
>> 
>> With `flvector-set!':
>> cpu time: 3404 real time: 3405 gc time: 0
>> 
>> Without:
>> cpu time: 548 real time: 546 gc time: 64
>> 
>> Are you running the two versions in the same way?
> 
> I'm running out the door. Short answer: yes. I repeatedly commented the flvector-set! in and out, and ran it in DrR. I'll try to reproduce it again when I get home (perhaps many hours from now).

At home, on a two-core machine, with a fresh make (I didn't re-run configure), I see somewhat different answers but with one shared element: commenting out the flvector-set! *increases* the gc time.

More specifically, I get 7.9 seconds with no recorded GC. After commenting it out, I get 4.6 seconds… with about 0.9 seconds of GC. I see the same GC increase in your numbers above (though my speedup is much, much smaller than yours). Any idea why GC time should increase? Also, I am quite curious as to why you get a 7x speedup, and I only get 2x.
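
One way to compare the two variants on a more equal footing might be to force a collection before each run and read the GC clock directly, so that garbage left over from an earlier run in the same session is not charged to the loop being measured. The following is only a sketch: `measure` is a hypothetical helper, not something from this thread, and wrapping the loop in a thunk may itself change what the optimizer does with it.

```racket
#lang typed/racket

(require racket/flonum)

;; Hypothetical helper (an assumption, not from the thread): force a major
;; collection, run the thunk, then print elapsed and GC milliseconds.
(: measure (-> String (-> Void) Void))
(define (measure label thunk)
  (collect-garbage)
  (define gc-before (current-gc-milliseconds))
  (define t-before (current-inexact-milliseconds))
  (thunk)
  (printf "~a: real ~a ms, gc ~a ms\n"
          label
          (- (current-inexact-milliseconds) t-before)
          (- (current-gc-milliseconds) gc-before)))

(define f (make-flvector (* 10 44100)))
(define k (* 440.0 1/44100 2 pi))

;; Same loop as in the original program, with the flvector-set! left in.
(measure "with flvector-set!"
         (lambda ()
           (for ([j 100])
             (for ([i (* 10 44100)])
               (define i#i (exact->inexact i))
               (flvector-set! f i (sin (fl* k i#i)))))))
```

Running the same file with the `flvector-set!` line commented out, from the command line rather than from DrRacket, would show whether the GC difference survives outside the IDE.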

John

> 
> Thanks!
> 
> John
> 
>> 
>> Vincent
>> 
>> 
>> At Fri, 14 Sep 2012 16:19:56 -0700,
>> John Clements wrote:
>>> 
>>> Okay, I think I have to start this e-mail with a big wow and thanks. This program runs about 4-5x faster in TR:
>>> 
>>> #lang typed/racket
>>> 
>>> (require racket/flonum)
>>> 
>>> (define f (make-flvector (* 10 44100)))
>>> 
>>> (define k (* 440.0 1/44100 2 pi))
>>> (time
>>>  (for ([j 100])
>>>    (for ([i (* 10 44100)])
>>>      (define i#i (exact->inexact i))
>>>      (void)
>>>      (flvector-set! f i
>>>                     (sin (fl* k i#i))))))
>>> 
>>> This performs about 44 million sine computations in about 4 seconds. So... it's pretty quick.
>>> 
>>> Here's the weird thing, though. When I comment out the whole flvector-set!, including the computation of the sine, it gets… slower. And performs lots of gc. Here's that program:
>>> 
>>> #lang typed/racket
>>> 
>>> (require racket/flonum)
>>> 
>>> (define f (make-flvector (* 10 44100)))
>>> 
>>> (define k (* 440.0 1/44100 2 pi))
>>> (time
>>>  (for ([j 100])
>>>    (for ([i (* 10 44100)])
>>>      (define i#i (exact->inexact i))
>>>      (void)
>>>      #;(flvector-set! f i
>>>                       (sin (fl* k i#i))))))
>>> 
>>> This one runs in about 4.2 seconds, including almost 2 seconds of GC. And it doesn't compute a single sine!
>>> 
>>> I'm guessing that the first one somehow triggers an unboxing optimization that eliminates the allocation, but the size of this effect still flabbergasts me. I suppose at the end of the day I may just be asking for some dead code elimination, but it also seems problematic to me that eliminating uses of a variable would ever make the code run more slowly[*].
>>> 
>>> [*] Okay, I can think of one corner case: if the commented-out use of the variable could be shown to eliminate (through exceptional control flow) some of the possible values taken on by the variable. That can't be happening here, because there's no use of the variable after the commented-out one.
>>> 
>>> John
>>> 
>>> 
>>> 
>>> ____________________
>>> Racket Users list:
>>> http://lists.racket-lang.org/users
> 


Posted on the users mailing list.