I've reimplemented the statistics module from the science collection to use sequences instead of just vectors. I like the generality better - I can use any sequence (e.g., vector or list) - but there is more of a performance hit than I would have liked. I haven't timed it with the new changes that Matthew just put in. The good news is that there isn't much of a hit for using (variance data) as opposed to (variance (in-vector data)) and there isn't a huge hit for using the contract that ensures that the sequence is a sequence of real numbers.<br>
<br>I created a 100000 element vector and timed a loop getting the variance of the elements 10 times. Note that I created an executable that runs compiled code in both cases. [Runs of the sequence code within DrScheme take about twice as long as the compiled code - I assume they run from byte code in that case. Runs of the science collection code take about the same time in DrScheme - I assume they run the compiled code.]<br>
<br>Times using sequences [primarily using 'for/fold' for sequencing and referencing]:<br><br>(variance data) : cpu time: 625 real time: 625 gc time: 32<br>(unchecked-variance data) : cpu time: 531 real time: 531 gc time: 77<br>
<br>(variance (in-vector data)) : cpu time: 609 real time: 609 gc time: 16<br>(unchecked-variance (in-vector data)) : cpu time: 485 real time: 484 gc time: 0<br><br>Times using vectors (current science collection routines) [primarily using 'do' for sequencing with 'vector-ref' for referencing]:<br>
<br>(variance data) : cpu time: 235 real time: 234 gc time: 16<br>(unchecked-variance data) : cpu time: 187 real time: 188 gc time: 46<br><br>All of the normal caveats about timing values apply - just because I'm timing a statistics routine doesn't mean it's statistically relevant :).<br>
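<br>For anyone who wants to try the comparison without the attached files, here is a rough sketch of the two looping styles being timed. It is not the science collection code: the names `variance/seq` and `variance/vec` are made up for illustration, the contract checking is left out, and a plain two-pass sample variance is assumed rather than whatever recurrence the real routines use. It is written for `#lang racket/base`; under PLT Scheme 4.x the equivalent would presumably be `#lang scheme/base`.<br>

```racket
#lang racket/base

;; Hypothetical names throughout; this only illustrates the two looping styles.

;; Sequence version: 'for/fold' accepts any sequence (list, vector, ...).
(define (variance/seq data)
  ;; First pass: count the elements and sum them.
  (define-values (n sum)
    (for/fold ([n 0] [sum 0.0]) ([x data])
      (values (add1 n) (+ sum x))))
  (define mean (/ sum n))
  ;; Second pass: sum of squared deviations from the mean.
  (define ss
    (for/fold ([ss 0.0]) ([x data])
      (let ([d (- x mean)])
        (+ ss (* d d)))))
  ;; Sample variance (divides by n - 1), so data needs at least 2 elements.
  (/ ss (sub1 n)))

;; Vector-only version: 'do' loop with 'vector-ref'.
(define (variance/vec data)
  (define n (vector-length data))
  (define mean
    (do ([i 0 (add1 i)]
         [sum 0.0 (+ sum (vector-ref data i))])
        ((= i n) (/ sum n))))
  (do ([i 0 (add1 i)]
       [ss 0.0 (let ([d (- (vector-ref data i) mean)])
                 (+ ss (* d d)))])
      ((= i n) (/ ss (sub1 n)))))

;; Timing harness: a 100000 element vector, variance computed 10 times.
(define data (build-vector 100000 (lambda (i) (random))))

(time (for ([i (in-range 10)]) (variance/seq data)))
(time (for ([i (in-range 10)]) (variance/seq (in-vector data))))
(time (for ([i (in-range 10)]) (variance/vec data)))
```

<br>Each `time` form prints cpu, real, and gc times in the same format as the numbers above. The absolute values will differ by machine and build, so only the ratios between the three lines are worth comparing.<br>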
<br>I will retime them when a nightly build with Matthew's performance improvements is available (it seems that 4.2.1.7 from Saturday is the latest) - or when I build it on my machine at home. I don't have the development tools on my laptop to build from svn.<br>
<br>I've attached the files in case anyone wants to look them over. If someone could run them against the latest svn, it would be nice.<br><br>Comments from anyone who uses these routines from the science collection would be most welcome.<br>
<br>Doug<br>