<div dir="ltr">Could you formulate some test cases to check for this behavior?<div><br></div><div>Robby</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Dec 23, 2012 at 7:07 PM, Neil Toronto <span dir="ltr">&lt;<a href="mailto:neil.toronto@gmail.com" target="_blank">neil.toronto@gmail.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 12/22/2012 10:24 AM, Michael Filonenko wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Also, long double arithmetic requires setting the "extended mode" flag on<br>
the FPU, which forces the FPU to use 80-bit registers. The side effect of<br>
that flag is that the FPU gives slightly different (more accurate, but<br>
not IEEE-compliant) results for 64-bit operations. That is usually<br>
not a problem on machines that have SSE2 (introduced with the Pentium 4 in<br>
2001). In the presence of SSE2, Racket performs 64-bit operations solely<br>
on 64-bit SSE2 registers (see MZ_USE_JIT_SSE and --mfpmath=sse), so the<br>
results are IEEE-compliant. 80-bit operations are done on the FPU anyway,<br>
as SSE2 cannot do them. Therefore, by setting the "extended mode" on<br>
the FPU, we introduce a subtle difference in ordinary flonums, but only on<br>
old machines that do not have SSE2.<br>
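<br>
A minimal sketch of the double-rounding effect described above, written in Python with exact rationals rather than real x87 hardware. The helper round_to_bits and the operands are illustrative assumptions, not taken from Racket; the model assumes round-to-nearest-even, a 64-bit significand for the extended format, and a 53-bit significand for doubles.<br>
<pre>
```python
from fractions import Fraction

def round_to_bits(x, bits):
    """Round a positive Fraction to `bits` significant bits, ties to even."""
    # Estimate floor(log2(x)) from bit lengths, then correct it by one if needed.
    k = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** k > x:
        k -= 1
    # Scale so that x * scale lies in [2**(bits-1), 2**bits).
    scale = Fraction(2) ** (bits - 1 - k)
    # round() on a Fraction rounds half to even.
    return Fraction(round(x * scale)) / scale

# Illustrative operands: the exact sum lies just above the midpoint
# between the doubles 1.0 and 1.0 + 2**-52.
a = 1.0
b = 2.0 ** -53 + 2.0 ** -78
exact = Fraction(a) + Fraction(b)

direct = round_to_bits(exact, 53)                           # one rounding
via_extended = round_to_bits(round_to_bits(exact, 64), 53)  # two roundings
hardware = a + b                                            # what this machine does

print(float(direct), float(via_extended), hardware)
```
</pre>
In this model the single rounding gives 1.0 + 2**-52, while rounding through the 64-bit significand first gives 1.0. On a machine that does 64-bit arithmetic in SSE2, the hardware sum should agree with the single rounding.<br>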
</blockquote>
<br></div>
Where is a good place to read more about this? In particular, I'm interested in how IEEE-compliant SSE2 implementations tend to be compared to FPUs.<br>
<br>
About half the new flonum functions in the math library rely on perfectly compliant rounding for good accuracy in difficult parts of their domains. A few use double-double arithmetic to extend precision from 53-bit significands to 106 in these subdomains; most use a combination of precisions.<br>
<br>
If rounding isn't IEEE-compliant, these functions could suddenly become much less precise: from just one least significant digit wrong to four or five. That includes computations done *entirely in 80-bit registers*, which seems like it should be more accurate. For example, there's a heavily used macro that computes the result of `fl+' and its rounding error. If it were done entirely in 80 bits and then rounded to 64, the error computation would be totally bogus.<br>
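<br>
One way to turn this into a test case, sketched in Python (whose floats are IEEE doubles on typical SSE2 builds): compute a sum and its rounding error with Knuth's TwoSum, then check against exact rational arithmetic that the two together reproduce the true sum. The names two_sum and is_exact are hypothetical, and this is not necessarily the formulation the math library's macro uses.<br>
<pre>
```python
from fractions import Fraction

def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) where s is the rounded sum a + b
    and e is the rounding error, assuming round-to-nearest doubles."""
    s = a + b
    bb = s - a                      # the part of b that made it into s
    e = (a - (s - bb)) + (b - bb)   # what was lost from a, plus what was lost from b
    return s, e

def is_exact(a, b):
    """Check with exact rationals that s + e equals a + b."""
    s, e = two_sum(a, b)
    return Fraction(s) + Fraction(e) == Fraction(a) + Fraction(b)

# Illustrative inputs, including one whose sum sits just above a rounding midpoint.
cases = [
    (0.1, 0.2),
    (1e16, 1.0),
    (1.0, 1e-30),
    (1.0, 2.0 ** -53 + 2.0 ** -78),
]

for a, b in cases:
    print(a, b, two_sum(a, b), is_exact(a, b))
```
</pre>
With compliant 64-bit rounding, every case should report an exact result. If the intermediate values were instead kept in 80-bit registers and rounded once at the end, the check would be expected to fail for some inputs.<br>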
<br>
I'd like to verify independently that this won't happen on machines with an Intel FPU manufactured within the last 12 years or an AMD FPU manufactured within the last 10 (i.e. those with SSE2). That would keep me happy, since FPUs older than those tended to round incorrectly anyway.<br>
<br>
Thanks!<span class="HOEnZb"><font color="#888888"><br>
Neil ⊥</font></span><div class="HOEnZb"><div class="h5"><br>
<br>
_________________________<br>
Racket Developers list:<br>
<a href="http://lists.racket-lang.org/dev" target="_blank">http://lists.racket-lang.org/dev</a><br>
</div></div></blockquote></div><br></div>