[racket] 80-bit precision in Racket

From: Neil Toronto (neil.toronto at gmail.com)
Date: Sat Nov 10 16:41:44 EST 2012

On 11/09/2012 06:56 AM, Dmitry Pavlov wrote:
> 2. Maybe we are cutting blocks with a razor here?
> If there is any simpler way to have native 80-bit
> floating point in Racket (say Racket->ASM compilation,
> Racket->C, LLVM, anything), please do tell me.

Here's another razor/brick-related question: What does 80 bits get you?

I'm not staunchly defending 64-bit floats, and I'm not trying to turn 
down your kind offer to improve to Racket. I just wonder whether there's 
an easier way to get what you need.

Some problems bigger floats would help with are:

  1. Computing large values whose fractional parts are significant. 
Large physics simulations, for example: you don't want orbits to go 
erroneously wobbly far from the origin, where floats are much sparser.

  2. You need exponents greater than 308 and/or less than -308.

  3. You need to compute sort-of-stable solutions or long-running 
simulations accurately.

  4. You have some difficult computations that suffer from overflow, 
underflow, or severe cancellation.

If your problem is #4, I may have a good enough 64-bit solution.

An example of #4 is (x/k)^k * exp(k-x), which shows up embarrassingly 
often in gamma-related functions, and is impossible to compute 
accurately for large x and k without using more bits. (To avoid 
under-/overflow, you must accept cancellation errors.) If just this one 
function were computed accurately, all of these gamma-related functions 
could be computed accurately. As it is, nobody has ever implemented an 
accurate 64-bit incomplete gamma integral.

There's a body of work on floating-point "expansions," which represent 
numbers as the sum of multiple non-overlapping floats. Two 64-bit 
floats, for example, is equivalent to one 117-bit float (1 sign, 11 
exponent, 105 significand). I've just implemented 117-bit arithmetic, 
log1p, expm1, and exp. I'm going to use these judiciously, because 
they're a bit expensive; I think a two-float multiply is 17 flops.

I was planning to keep this part of the math library private on the 
first release, but I could put it in something like math/unstable for 
now if it could help with your problem.

But if what you need 80 bits for is pervasive, or if you're simply 
committed to it, I'll just look forward to having them. :)  I'll be 
happy to consult as well.

Neil ⊥


Posted on the users mailing list.