[racket-dev] long double for racket
On 12/22/2012 10:24 AM, Michael Filonenko wrote:
> Also, long double arithmetic requires setting the "extended mode" flag
> on the FPU, which forces the FPU to use 80-bit registers. The side
> effect of that flag is that the FPU gives slightly different (more
> accurate, but not IEEE-compliant) results for 64-bit operations. That
> is usually not a problem on machines that have SSE2 (introduced in
> Pentium 4 in 2001). In the presence of SSE2, Racket performs 64-bit
> operations solely on 64-bit SSE2 registers (see MZ_USE_JIT_SSE and
> -mfpmath=sse), so the results are IEEE-compliant. 80-bit operations
> are done on the FPU anyway, as SSE2 cannot do them. Therefore, by
> setting "extended mode" on the FPU, we introduce a subtle difference in
> ordinary flonums, but only on old machines that do not have SSE2.
Where is a good place to read more about this? In particular, I'm
interested in how IEEE-compliant SSE2 implementations tend to be
compared to FPUs.
About half the new flonum functions in the math library rely on
perfectly compliant rounding for good accuracy in difficult parts of
their domains. A few use double-double arithmetic to extend precision
from 53-bit significands to 106 in these subdomains; most use a
combination of precisions.
If rounding isn't IEEE-compliant, these functions could suddenly become
much less precise: from just one least significant digit wrong to four
or five. That includes computations done *entirely in 80-bit registers*,
which seem like they should be more accurate. For example, there's a
heavily used macro that computes the result of `fl+' and its rounding
error. If it were done entirely in 80 bits and then rounded to 64, the
error computation would be totally bogus.
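To make the concern concrete, here is a minimal sketch of this kind of
error-free addition. It is written in Python rather than Racket, it
assumes IEEE 754 doubles with round-to-nearest and no overflow, and the
name `two_sum` is hypothetical; it is not the math library's macro, just
the classic branch-free TwoSum formulation.

```python
from fractions import Fraction


def two_sum(a, b):
    """Return (s, e) where s is the rounded sum a + b and e is its rounding error.

    Assumes every operation below is rounded to a 64-bit double
    (round-to-nearest) and that nothing overflows. Under those
    assumptions, s + e equals a + b exactly.
    """
    s = a + b                # rounded sum
    b_virtual = s - a        # the part of b that actually ended up in s
    a_virtual = s - b_virtual  # the part of a that actually ended up in s
    b_roundoff = b - b_virtual
    a_roundoff = a - a_virtual
    e = a_roundoff + b_roundoff
    return s, e


if __name__ == "__main__":
    a, b = 0.1, 0.2
    s, e = two_sum(a, b)
    # Exact rational arithmetic confirms that s + e recovers a + b.
    print(Fraction(s) + Fraction(e) == Fraction(a) + Fraction(b))
```

Every subtraction here depends on the one before it having been rounded
to 53 bits. If the intermediates were instead kept in 80-bit registers
and only the final values rounded to 64 bits, `e` need no longer be the
rounding error of the 64-bit `s`.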
I'd like to verify independently that this won't happen on machines with
an Intel FPU manufactured within the last 12 years or an AMD FPU
manufactured within the last 10 (i.e. those with SSE2). That would keep
me happy, since FPUs older than those tended to round incorrectly anyway.
Thanks!
Neil ⊥