[racket] ATTN: Neil T. Racket/Math digamma (psi) function implementation
On 05/07/2013 02:58 PM, Ray Racine wrote:
> Well since I gone this far, I took the unusual step of implementing the
> algorithm (more?) correctly this time.
>
> On the Toronto scale, we are mostly accurate for most of the domain x > 0.
I noticed. Cool!
> For x<0 the use of psi(1 - x) - psi (x) = pi * cot(pi * x), not so good.
That's because you use
(* PI (cotan (* z NEG-PI)))
as part of the reflection formula instead of
(* pi (flcotpix x))
IIRC, I wrote `flcotpix' to make `flpsi0' more accurate. (I already had
a similar thing, `flsinpix', for the gamma function.)
> For x > 0, outside the subdomain around the circle x=1.48 where things
> are a bit wobbly it is pretty solid. I don't know if this is a zero of
> the polynomial or what the heck is going on there.
It's a zero of the digamma function. Zeros are always tricky. The
smaller an output is, the more small changes affect the lowest few bits.
This zero is why `flpsi0' uses a separate Taylor series approximation
(centered at the zero, ~1.46163215590743) for inputs between 1.125 and
1.9. I would do something similar for the negative zeros, but there are
so many of them...
> ;; Improved Impl of digamma
>
> #lang typed/racket/base
>
> [...]
I guessed this would be faster and just as accurate in [10,44]. You've
surprised me: it's faster and just as accurate in [8,44]. :D My
measurements have it 2x faster near 8, and 3x faster near 44. (After 44,
`flpsi0' switches to an even faster asymptotic expansion.) Nice.
I managed to get it <= 2 ulps in [4,44], though, by changing the order
of operations. Notice that this:
(+ (+ z-shift (- (fllog z) (/ 0.5 z)))
(digamma-poly (/ (* z z))))
is four additions. Just to check whether changing the order of the
additions could help, I changed it to this:
(flsum (list z-shift
(fllog z)
(/ -0.5 z)
(digamma-poly (/ (* z z))))
This got <= 2 ulps in [4,44], but was dog slow from creating lists. The
most important thing is to add from lowest to highest magnitude (though
`flsum' does something a little better). So I plotted the absolute value
of each term, and discovered that this always adds smallest to largest
for z in [4,10], where adding more accurately makes a difference:
(+ (digamma-poly (/ (* z z)))
(/ -0.5 z)
z-shift
(fllog z))
Anyway, this should make a nice drop-in replacement for the two slowish
Chebyshev approximations. Thanks!
Neil ⊥