[racket-dev] Extflonum type for windows
Sorry for the long delay!
At Wed, 20 Mar 2013 16:14:46 +0400, Michael Filonenko wrote:
> Agreed. But since switching the processor "at last minute" every time
> slows things down a bit, it may be useful to have an option to
> switch to the extended mode on Win32 just once. That will be useful
> for us and other users who have SSE and worry about performance.
I worry that system libraries or other libraries may somehow rely on
double precision, and so setting the mode globally to extended
precision may cause problems --- independent of whether the rest of
Racket uses SSE for double-precision arithmetic.
But many computations are likely to work anyway, so let's set that
concern aside for the purposes of someone who wants to try extended
precision by default...
> There is a caveat that longdouble.dll should be present in two Win32
> versions -- with and without the switching.
I've changed "longdouble.dll" so that it switches the floating-point
mode and then restores it on each call, instead of always setting the
mode back to double precision after each call. That makes the DLL work
consistently with different default modes.
(Although the DLL's overhead could be lower if the context is known to
be in extended-precision mode already, I think the overhead of getting
into the DLL is so large that skipping the no-op mode switching
wouldn't matter.)
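To make the save/switch/restore idea concrete, here is a minimal sketch in C. It is not the actual "longdouble.dll" code: the names `x87_get_cw`, `x87_set_cw`, `ext_add`, and `call_in_extended_mode` are made up for illustration, and it assumes gcc-style inline assembly on x86 (elsewhere it falls back to a simulated control word so that it still compiles). The point is only that each entry saves the caller's control word, forces extended precision, and puts back whatever was there before, instead of unconditionally resetting to double precision.

```c
#include <assert.h>
#include <stdint.h>

/* x87 control word: bits 8-9 are the precision-control (PC) field. */
#define X87_PC_MASK     0x0300u
#define X87_PC_DOUBLE   0x0200u  /* 53-bit significand */
#define X87_PC_EXTENDED 0x0300u  /* 64-bit significand */

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Read the current x87 control word. */
static uint16_t x87_get_cw(void) {
  uint16_t cw;
  __asm__ __volatile__("fnstcw %0" : "=m"(cw) : : "memory");
  return cw;
}
/* Load a new x87 control word. */
static void x87_set_cw(uint16_t cw) {
  __asm__ __volatile__("fldcw %0" : : "m"(cw) : "memory");
}
#else
/* Assumption: no x87 access here, so simulate the control word. */
static uint16_t simulated_cw = 0x027F;
static uint16_t x87_get_cw(void) { return simulated_cw; }
static void x87_set_cw(uint16_t cw) { simulated_cw = cw; }
#endif

typedef long double (*ext_binop)(long double, long double);

/* Stand-in for one of the DLL's extended-precision operations. */
static long double ext_add(long double a, long double b) { return a + b; }

/* Hypothetical DLL entry wrapper: save the caller's mode, switch to
   extended precision, run the operation, then restore the saved mode
   (whatever it was) rather than forcing double precision. */
static long double call_in_extended_mode(ext_binop op,
                                         long double a, long double b) {
  uint16_t saved = x87_get_cw();
  long double r;
  assert(op != 0);
  x87_set_cw((uint16_t)((saved & ~X87_PC_MASK) | X87_PC_EXTENDED));
  r = op(a, b);
  x87_set_cw(saved);
  return r;
}
```

With this shape, a caller that starts in double-precision mode gets double-precision mode back, and a caller that starts in extended mode stays in extended mode; in the second case the two control-word writes are the no-ops mentioned above.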
Now, you can adjust the 32-bit MSVC build so that MZ_NEED_SET_EXTFL_MODE
is not defined in "sconfig.h", and then if you add initialization to
set the x87 control word to extended precision, maybe extflonums would
work right.
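The one-time initialization could look roughly like the sketch below. This is an assumption about the shape of such code, not something that is in the Racket tree: `init_x87_extended_precision` and `x87_precision_is_extended` are hypothetical names, and where the call would go in Racket's startup is left open. The MSVC branch uses `_controlfp` with `_PC_64` and `_MCW_PC` from <float.h>; the gcc branch edits the control word directly.

```c
#include <stdint.h>
#if defined(_MSC_VER) && defined(_M_IX86)
# include <float.h>   /* _controlfp, _PC_64, _MCW_PC */
#endif

#define X87_PC_MASK     0x0300u  /* precision-control bits 8-9 */
#define X87_PC_EXTENDED 0x0300u  /* 64-bit significand */

/* Hypothetical startup hook: set x87 precision control to extended
   once. Returns 1 if the mode was set, 0 if this build has no known
   way to reach the x87 control word. */
static int init_x87_extended_precision(void) {
#if defined(_MSC_VER) && defined(_M_IX86)
  _controlfp(_PC_64, _MCW_PC);
  return 1;
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  uint16_t cw;
  __asm__ __volatile__("fnstcw %0" : "=m"(cw) : : "memory");
  cw = (uint16_t)((cw & ~X87_PC_MASK) | X87_PC_EXTENDED);
  __asm__ __volatile__("fldcw %0" : : "m"(cw) : "memory");
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 if precision control is extended, 0 if not, and -1 if
   the control word cannot be inspected in this build. */
static int x87_precision_is_extended(void) {
#if defined(_MSC_VER) && defined(_M_IX86)
  return (_controlfp(0, 0) & _MCW_PC) == _PC_64;
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  uint16_t cw;
  __asm__ __volatile__("fnstcw %0" : "=m"(cw) : : "memory");
  return (cw & X87_PC_MASK) == X87_PC_EXTENDED;
#else
  return -1;
#endif
}
```

Note that the control word is per-thread state, so a single call at startup only covers the calling thread; any other thread that performs extflonum arithmetic would need the same setup.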
Further, to avoid affecting flonums, you could get the JIT to use SSE
by defining MZ_USE_JIT_SSE. That flag doesn't cover MSVC-generated
`double' arithmetic, so you'd also have to tell MSVC to use SSE for
floating-point math. Then again, MSVC doesn't seem to have a way to say
"use only SSE"; there's a flag to enable SSE, but the compiler reserves
the right to mix SSE and x87, depending on what it thinks will be
faster.
Another possible direction is to use MinGW for a faster-extflonum
build. I've pushed some repairs so that Racket again builds with MinGW
(only extflonum-related updates were needed), and a 32-bit build using
CPPFLAGS="-mpentium4 -mfpmath=sse" provides extflonums without
last-minute switching. A 64-bit MinGW build uses SSE by default, so it
also provides extflonums without switching.
Note that a MinGW build no longer has the same behavior as an MSVC
build when SSE arithmetic is enabled for gcc: in SSE mode, Racket
initializes the default floating-point mode to extended precision to
support extflonums. If we ever want to bring the two builds
back in sync, we can add an option. For now, though, the MinGW build
just provides a path for experimenting with floating-point modes.