[racket] TR req: could 'modulo' be more optimized or specialized?
NB: you may wish to read the last sentences of this e-mail before the rest:
Currently, the "modulo" function seems to perform a lot of allocation. Specifically, here's a program that copies one flvector into another one:
#lang typed/racket
(require racket/flonum)
(define f (make-flvector (* 10 44100)))
(define k (* 440.0 1/44100 2 pi))
(define g (make-flvector 44100))
(for ([i 44100])
  (define i#i (exact->inexact i))
  (flvector-set! g i (sin (* k i#i))))
(time
 (for ([j 100])
   (define ctr 0)
   (for ([i (* 10 44100)])
     (flvector-set! f i
                    (flvector-ref g ctr #;(modulo i 44100)))
     (set! ctr (cond [(< ctr 44099) (add1 ctr)]
                     [else 0])))))
It runs in only two seconds, because I'm doing nasty mutation to avoid the modulo. If you comment out the explicit counter and comment in the call to modulo, you get this:
#lang typed/racket
(require racket/flonum)
(define f (make-flvector (* 10 44100)))
(define k (* 440.0 1/44100 2 pi))
(define g (make-flvector 44100))
(for ([i 44100])
  (define i#i (exact->inexact i))
  (flvector-set! g i (sin (* k i#i))))
(time
 (for ([j 100])
   (define ctr 0)
   (for ([i (* 10 44100)])
     (flvector-set! f i
                    (flvector-ref g #;ctr (modulo i 44100)))
     #;(set! ctr (cond [(< ctr 44099) (add1 ctr)]
                       [else 0])))))
… which takes *five* seconds, including 2 seconds of GC.
It certainly seems to me that modulo could do its work without triggering a lot of GC. I'm guessing that the basic issue here is that modulo isn't usually a high-priority item in inner loops.
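To make "specialized" concrete, here is a minimal sketch of the same copy loop using the fixnum-only `fxmodulo` from `racket/fixnum` in place of the generic `modulo`. It is written in untyped `racket/base` to sidestep any question of what type TR assigns to the loop index, it assumes every index stays in fixnum range, and the name `table-len` is just for illustration. Whether this actually avoids the allocation is something I have not measured.

```racket
#lang racket/base
;; Untyped sketch: the copy loop with the generic `modulo` replaced by
;; the fixnum-only `fxmodulo`. No timing claims are made here.
(require racket/fixnum racket/flonum racket/math)

(define table-len 44100)                     ; illustrative name: one second at 44.1 kHz
(define f (make-flvector (* 10 table-len)))  ; destination buffer
(define g (make-flvector table-len))         ; source table (one second of a 440 Hz sine)
(define k (* 440.0 1/44100 2 pi))

;; Fill the source table.
(for ([i (in-range table-len)])
  (flvector-set! g i (sin (* k (exact->inexact i)))))

;; Copy g into f over and over; both arguments to fxmodulo are
;; assumed to be fixnums, which holds for these sizes.
(for ([i (in-range (flvector-length f))])
  (flvector-set! f i (flvector-ref g (fxmodulo i table-len))))
```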
In fact, it occurs to me that a better way to solve this is probably to write a looping (in-range/modulo …) generator.
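A rough sketch of that idea, in untyped `racket/base`: `in-range/modulo` is a hypothetical name, built here from the existing `in-cycle` and `in-range`, and run in parallel with the main index so that no explicit mutation or call to `modulo` is needed. Because it is an ordinary function returning a generic sequence, `for` will not get its usual fast path for it; a serious version would presumably need `define-sequence-syntax`, which this sketch does not attempt.

```racket
#lang racket/base
;; Untyped sketch of a looping index sequence; not tuned for speed.
(require racket/flonum racket/math)

;; Hypothetical helper: the endless sequence 0, 1, ..., n-1, 0, 1, ...
;; Assumes n is a positive integer.
(define (in-range/modulo n)
  (in-cycle (in-range n)))

(define table-len 44100)                     ; illustrative name
(define f (make-flvector (* 10 table-len)))  ; destination buffer
(define g (make-flvector table-len))         ; source table
(define k (* 440.0 1/44100 2 pi))

(for ([i (in-range table-len)])
  (flvector-set! g i (sin (* k (exact->inexact i)))))

;; Parallel clauses: the loop stops when the finite `i` sequence ends,
;; while `ctr` keeps wrapping around the table.
(for ([i (in-range (flvector-length f))]
      [ctr (in-range/modulo table-len)])
  (flvector-set! f i (flvector-ref g ctr)))
```

For example, `(for/list ([x (in-range/modulo 3)] [_ (in-range 7)]) x)` produces `'(0 1 2 0 1 2 0)`.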
I'm going to send this mail anyhow, just in case there's low-hanging fruit here, or some kind of bug.
John