On my laptop:<br><br>#lang racket<br><br>(require "fft.rkt")<br><br>(define v (build-vector 16384 (lambda (i) (random))))<br><br>(define v1 (vector-copy v))<br><br>(collect-garbage)<br>(collect-garbage)<br>(collect-garbage)<br>
(time (fft-complex-radix2-forward v1))<br><br>(define v2 (vector-copy v))<br><br>(collect-garbage)<br>(collect-garbage)<br>(collect-garbage)<br>(time (fft-complex-forward v2))<br><br>(for/and ([i (in-range (vector-length v))])<br>
(< (magnitude (- (vector-ref v1 i) (vector-ref v2 i))) 1e-4))<br><br>==><br><br>cpu time: 94 real time: 94 gc time: 0<br>cpu time: 79 real time: 78 gc time: 0<br>#t<br><br>I'll look at the rsound/fft implementation and see if I see the problem.<br>
<br><div class="gmail_quote">On Mon, Oct 18, 2010 at 10:32 PM, John Clements <span dir="ltr"><<a href="mailto:clements@brinckerhoff.org">clements@brinckerhoff.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<div class="im"><br>
On Oct 18, 2010, at 9:25 AM, Doug Williams wrote:<br>
<br>
> When I first ran with vectors of 8192, I got exactly the opposite - the radix-2 version was much slower (although still < 500ms). But, when I looked, the longer time was almost exclusively GC - it just happen to hit that particular call. When I put a (collect-garbage) before each call, they were all < 50ms on my machine. So, the GC penalty is likely what you're seeing.<br>
><br>
> The other thing is to make sure you aren't calling one of the dft routines instead of the fft routines. Those are the only ones that get anywhere near 24s on my machine - they are about 20s on my machine. The dft routines are the 'straight' mathematical discrete Fourier transform and do not use the FFT algorithms at all. [They're really only useful for a sanity check on results.]<br>
<br>
</div>Right, got it. No, the times I'm seeing are not GC. I bumped it up to an 8192-point fft, and now the radix-2 version is about 400x faster.<br>
<br>
Here's my code:<br>
<br>
#lang racket<br>
<br>
(require (planet clements/rsound/fft))<br>
<br>
(define v (build-vector 16384 (lambda (i) (random))))<br>
<br>
(define v1 (vector-copy v))<br>
<br>
(collect-garbage)<br>
(collect-garbage)<br>
(collect-garbage)<br>
(time (fft-complex-radix2-forward v1))<br>
<br>
(define v2 (vector-copy v))<br>
<br>
(collect-garbage)<br>
(collect-garbage)<br>
(collect-garbage)<br>
(time (fft-complex-forward v2))<br>
<br>
(for/and ([i (in-range (vector-length v))])<br>
(< (magnitude (- (vector-ref v1 i) (vector-ref v2 i))) 1e-4))<br>
<br>
=><br>
<br>
cpu time: 208 real time: 211 gc time: 0<br>
cpu time: 81357 real time: 82502 gc time: 8337<br>
#t<br>
<font color="#888888"><br>
John Clements<br>
</font><div><div></div><div class="h5"><br>
> On Thu, Oct 14, 2010 at 4:34 PM, John Clements <<a href="mailto:clements@brinckerhoff.org">clements@brinckerhoff.org</a>> wrote:<br>
> On a vector of length 8192 (a power of 2, natch), fft-complex-radix2-forward takes about 1/8 of a second (lost in the noise, essentially), but fft-complex-forward takes more than 24 seconds, a difference of about 200x. They do produce the same answers, up to differences of 1e-4 in the magnitude (didn't check the phase).<br>
><br>
> 1) Is this expected? I thought the non-radix2 one was still fairly clever about subdividing when the number of points is divisible by 2.<br>
> 2) If so, would it make sense to test for powers of 2?<br>
><br>
> John Clements<br>
><br>
><br>
<br>
</div></div></blockquote></div><br>