<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><br></div><div>[Sam, Vincent: I have no clue how TR can beat R by two orders of magnitude. See end of message.]</div><div><br></div><div>I wrote a little function to test your two implementations of sqrt on lists of random numbers:</div><div><br></div><div><div>(define (time-and-compare LARGE)</div><div> (define n (build-list LARGE (lambda (_) (random 10))))</div><div> (collect-garbage) (collect-garbage) (collect-garbage)</div><div> (define m (time (for-each approved-sqrt n)))</div><div> (collect-garbage) (collect-garbage) (collect-garbage)</div><div> (define l (time (for-each the-sqrt n)))</div><div> (void))</div></div><div><br></div><div>As you can see below, approved-sqrt is consistently faster on 'short' lists (100 numbers) than my-sqrt; on 'long' lists (10,000 numbers) the opposite is true. </div><div><br></div><div><br></div><div><div>On May 9, 2011, at 7:38 PM, Patrick King wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">Question 1: Is my use of the temporary "error" worth it in terms of performance (the approved solution calculates (* x x) twice)? If it's not worth it, how big does the calculation have to become before it does? </blockquote><div><br></div><div>It depends on where in the inner loop your calculations are and how many your program must perform. See below. </div><div><br></div><blockquote type="cite">Can the compiler make such a temporary live only in a register?<br></blockquote><div><br></div><div>In principle yes. I am not sure about our JIT. </div><div><br></div><br><blockquote type="cite">Question 1a: How creeped are you by my changing the meaning of error for these few lines? I find myself doing stuff like this, in the name of my idea of readability, a lot.<br></blockquote><div><br></div><div>Not one bit. I tend to agree with readability, though I catch myself thinking in the back of my head that this is also good for performance. (It shouldn't make a difference.) </div><div><br></div><br><blockquote type="cite">Question 2: The approved solution favored (/ ... (+ x x)) over my (/ ... 2 x). Cost of division versus readability. Wasn't a conscious decision of mine at the time, but now I'm curious. How smart is the compiler when presented with a division by 2? With multiple recalls of the same value?</blockquote><div><br></div><div>The two versions differ in two regards: </div><div><br></div><div>1. Your version flips the +/- calculations on the errors. (I eliminated this, the performance results stay the same.) </div><div><br></div><div>2. You also replace an addition for duplication with a multiplication. A compiler would go the other way around (and I may do so too, and think how silly I am). </div><div><br></div><div>I find myself unable to explain the differences in performance. Someone should probably look at the assembly and check what's going on. </div><div><br></div><div><br></div><div>Here are five measurements for lists of 10,000 numbers: </div><br><blockquote type="cite">> (time-and-compare 10000)</blockquote><blockquote type="cite">cpu time: 1196 real time: 1223 gc time: 26</blockquote><blockquote type="cite">cpu time: 1177 real time: 1229 gc time: 27</blockquote><blockquote type="cite">> (time-and-compare 10000)</blockquote><blockquote type="cite">cpu time: 1192 real time: 1242 gc time: 27</blockquote><blockquote type="cite">cpu time: 1172 real time: 1250 gc time: 27</blockquote><blockquote type="cite">> (time-and-compare 10000)</blockquote><blockquote type="cite">cpu time: 1197 real time: 1308 gc time: 28</blockquote><blockquote type="cite">cpu time: 1172 real time: 1215 gc time: 27</blockquote><blockquote type="cite">> (time-and-compare 10000)</blockquote><blockquote type="cite">cpu time: 1198 real time: 1230 gc time: 29</blockquote><blockquote type="cite">cpu time: 1186 real time: 1286 gc time: 29</blockquote><blockquote type="cite">> (time-and-compare 10000)</blockquote><blockquote type="cite">cpu time: 1208 real time: 1289 gc time: 26</blockquote><blockquote type="cite">cpu time: 1194 real time: 1303 gc time: 28</blockquote><div><br></div><div>Now let's look at small values: </div><div><br></div><br><blockquote type="cite">> (time-and-compare 100)</blockquote><blockquote type="cite">cpu time: 27 real time: 28 gc time: 0</blockquote><blockquote type="cite">cpu time: 33 real time: 40 gc time: 0</blockquote><blockquote type="cite">> (time-and-compare 100)</blockquote><blockquote type="cite">cpu time: 28 real time: 29 gc time: 0</blockquote><blockquote type="cite">cpu time: 33 real time: 34 gc time: 0</blockquote><blockquote type="cite">> (time-and-compare 100)</blockquote><blockquote type="cite">cpu time: 27 real time: 30 gc time: 0</blockquote><blockquote type="cite">cpu time: 32 real time: 38 gc time: 0</blockquote><blockquote type="cite">> (time-and-compare 100)</blockquote><blockquote type="cite">cpu time: 30 real time: 31 gc time: 0</blockquote><blockquote type="cite">cpu time: 32 real time: 33 gc time: 0</blockquote><blockquote type="cite">> (time-and-compare 100)</blockquote><blockquote type="cite">cpu time: 26 real time: 27 gc time: 0</blockquote><blockquote type="cite">cpu time: 34 real time: 33 gc time: 0</blockquote><br></div><div><br></div>Now here is the scary part. Typed Racket is a TON faster and it was nearly no work to get types to work: <div><br></div><div><blockquote type="cite"><div><font class="Apple-style-span" color="#000000">> (time-and-compare 10000)</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 9 real time: 9 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 17 real time: 18 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">- : Boolean</font></div><div><font class="Apple-style-span" color="#000000">#t</font></div><div><font class="Apple-style-span" color="#000000">> (time-and-compare 10000)</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 24 real time: 34 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 17 real time: 18 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">- : Boolean</font></div><div><font class="Apple-style-span" color="#000000">#t</font></div><div><font class="Apple-style-span" color="#000000">> (time-and-compare 10000)</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 13 real time: 14 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 28 real time: 37 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">- : Boolean</font></div><div><font class="Apple-style-span" color="#000000">#t</font></div><div><font class="Apple-style-span" color="#000000">> (time-and-compare 10000)</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 27 real time: 28 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 17 real time: 18 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">- : Boolean</font></div><div><font class="Apple-style-span" color="#000000">#t</font></div><div><font class="Apple-style-span" color="#000000">> (time-and-compare 10000)</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 24 real time: 29 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">cpu time: 17 real time: 18 gc time: 0</font></div><div><font class="Apple-style-span" color="#000000">- : Boolean</font></div></blockquote><div><br></div><br><div><br></div><div><div>;; -----------------------------------------------------------------------------</div><div>#lang typed/racket</div><div><br></div><div>(define epsilon 1e-7)</div><div><br></div><div>(: approved-sqrt : Natural -> Real)</div><div>(define (approved-sqrt y)</div><div> (let loop [(x (/ y 2.0))]</div><div> (if (< (abs (- (* x x) y)) epsilon)</div><div> x</div><div> (loop (- x (/ (- (* x x) y) (+ x x)))))))</div><div><br></div><div>(: my-sqrt : Natural -> Real)</div><div>(define (my-sqrt y)</div><div> (let loop [(x (/ y 2.0))]</div><div> (let [(error (- y (* x x)))]</div><div> (if (< (abs error) epsilon) </div><div> x</div><div> (loop (+ x (/ error 2 x)))))))</div><div><br></div><div>(: time-and-compare : Natural -> Boolean)</div><div>(define (time-and-compare LARGE)</div><div> (define n (build-list LARGE (lambda (_) (random 10))))</div><div> (collect-garbage) (collect-garbage) (collect-garbage)</div><div> (define m (time (map approved-sqrt n)))</div><div> (collect-garbage) (collect-garbage) (collect-garbage)</div><div> (define l (time (map my-sqrt n)))</div><div> (andmap = m l))</div><div><br></div><div>(time-and-compare 1000)</div><div><br><div><br><div><br></div></div></div></div></div></body></html>