[racket] why does in-indexed / in-range slow down my program?

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Sun Jan 20 18:21:51 EST 2013

I can confirm a serious speed-up with the explicit vector-ref inside the for loop instead of the for-loop lookup with (in-vector ...). I get from 20s to 12s on my reasonably new (small) macair. 

When you look at the expansion, you see the expected two-argument for-loop function with two arguments instead of one. You will also see an unsafe-vector-ref for the for-loop, while our code contains a normal vector-ref. So the question is whether the compiler generates code for two-argument recursive functions that is so much worse than for a one-argument function with a safe vector ref. Curious. 

-- Matthias

p.s. The code below is a bit more Rackety. 



;; --------------------------------------------------------
;; quick-find.rkt
;; Usage: racket quick-find.rkt < pairs.txt

#lang racket/base

(require racket/list)
(require racket/port)
(require racket/string)
(require profile)

; (module+ main (profile-thunk main))

;; given version, with output to a string to avoid drracket output
;; cpu time: 21699 real time: 21725 gc time: 1633

;; reading numbers directly shaves off a bit of time:
;; cpu time: 20112 real time: 20121 gc time: 105

;; with v replaced by vector-ref 
;; cpu time: 12974 real time: 12985 gc time: 90

(define (main)
  (define n (string->number (read-line)))
  [define set (make-qf-set n)]
  (let loop ()
    ;; MF: change 1 
    ;; the lines come in pairs so this should be okay for well-formed files 
    (define p (read))
    (define q (read))
    (unless (eof-object? p)
      (unless (qf-connected? set p q)
        (qf-connect! set p q)
        (printf "connected ~a to ~a\n" p q))
      (loop)))
  (define components
    ;;  [0 1 2 3 4 5 6 7 8 5196 10  ... 79659 135339 135340 135341]
    (string-join (for/list ([n (in-vector (qf-items set))]) (number->string n))
                 #:before-first "[" #:after-last "]"))
  (printf "Quick-find Set with ~a connected components: ~a\n" (qf-count set) components))

(struct qf-set ([count #:mutable] items) #:transparent)
(define (make-qf-set n) (qf-set n (make-qf-items n)))
;; MF: building the vector directly 
(define (make-qf-items n) (build-vector n values))

(define (qf-connected? set p q)
  (= (qf-find set p)
     (qf-find set q)))

(define (qf-connect! set p q)
  [define p-root (qf-find set p)]
  [define q-root (qf-find set q)]
  [define items (qf-set-items set)]
  [define n (vector-length items)]
  (unless (= p-root q-root)
    (for ([i (in-range n)] ;; it is weird that adding this line slows down the program by about 10s
          #;
          [v (in-vector items)])
      (when (= (vector-ref items i) #;v p-root) (vector-set! items i q-root)))
    (set-qf-set-count! set (- (qf-set-count set) 1))))

(define (qf-find set p)
  [define items (qf-set-items set)]
  (if (p . < . (vector-length items))
      (vector-ref items p)
      #f))

(define qf-count qf-set-count)
(define (qf-items set)
  (vector->immutable-vector (qf-set-items set)))

(define x (time (with-output-to-string (λ () (with-input-from-file "pairs.txt" main)))))



On Jan 20, 2013, at 2:25 AM, Limbo Peng wrote:

> Hi all,
> 
> I've implemented the quick-find algorithm in Racket (you can find it here [0], or see code attached below), and found some weirdness during profiling.
> 
> (*DISCLAIMER*: I know there's a faster algorithm called quick-union, but let's focus on this one.)
> 
> The first version of my program runs extremely slow, taking over 4 minutes to process about 23,000 lines of data (sample data is here: [1]), while the same algorithm implemented in Go (you can find it here: [2]) can finish within 15s. So I added profiling code in Racket to track down the part that slows it down.
> 
> As the profiler shows, the most time-consuming part lies in the for-loop of procedure "qf-connect!", and after a few experiments, I figure it out: it is the "in-indexed" procedure that makes it slow. I replaced "in-indexed" with separate calls to "in-range" and "in-vector" - the running time gets reduced from 4m to 27s. Why?
> 
> And it's not the end of the story: if I removed the call to "in-range" or "in-vector" (which makes the algorithm not working correctly, of course), it can speed up further down to 18s. (Again) Why?
> 
> I'm new to Racket and not clear about the performance implication of the for-loop usage. Any suggestion or hints?
> 
> Thanks in advance.
> 
> [0]: https://gist.github.com/4576816
> [1]: http://static.iwinux.info/pairs.txt
> [2]: http://github.com/iwinux/gomisc/tree/master/algorithms/union-find
> 
> ;; --------------------------------------------------------
> ;; quick-find.rkt
> ;; Usage: racket quick-find.rkt < pairs.txt
> 
> #lang racket/base
> 
> (require racket/list)
> (require racket/string)
> (require profile)
> 
> (module+ main (profile-thunk main))
> 
> (define (main)
>  (let* ([n (string->number (read-line))]
>         [set (make-qf-set n)])
>    (for ([line (in-lines)])
>      (let* ([pair (string-split line " " #:trim? #f)]
>             [p (string->number (car pair))]
>             [q (string->number (cadr pair))])
>        (unless (qf-connected? set p q)
>          (qf-connect! set p q)
>          (printf "connected ~a to ~a\n" p q))))
>    (printf
>     "Quick-find Set with ~a connected components: ~a\n"
>     (qf-count set)
>     (string-join
>      (for/list ([n (in-vector (qf-items set))])
>        (number->string n))
>      #:before-first "[" #:after-last "]"))))
> 
> (struct qf-set ([count #:mutable] items) #:transparent)
> (define (make-qf-set n) (qf-set n (make-qf-items n)))
> (define (make-qf-items n) (list->vector (range n)))
> 
> (define (qf-connected? set p q)
>  (= (qf-find set p)
>     (qf-find set q)))
> 
> ;; VERY SLOW - it takes over 4 minutes to finish
> ;; (define (qf-connect! set p q)
> ;;   (let ([p-root (qf-find set p)]
> ;;         [q-root (qf-find set q)]
> ;;         [items (qf-set-items set)])
> ;;     (unless (= p-root q-root)
> ;;       (for ([(v i) (in-indexed items)])
> ;;         (when (= v p-root)
> ;;           (vector-set! items i q-root)))
> ;;       (set-qf-set-count! set (- (qf-set-count set) 1)))))
> 
> ;; better - it takes about 28s to finish
> ;; but still slower than the Golang version
> (define (qf-connect! set p q)
>  (let* ([p-root (qf-find set p)]
>         [q-root (qf-find set q)]
>         [items (qf-set-items set)]
>         [n (vector-length items)])
>    (unless (= p-root q-root)
>      (for ([i (in-range n)] ;; it is weird that adding this line slows down the program by about 10s
>            [v (in-vector items)])
>        (when (= v p-root) (vector-set! items i q-root)))
>      (set-qf-set-count! set (- (qf-set-count set) 1)))))
> 
> (define (qf-find set p)
>  (let ([items (qf-set-items set)])
>    (if (p . < . (vector-length items))
>      (vector-ref items p)
>      #f)))
> 
> (define qf-count qf-set-count)
> (define (qf-items set)
>  (vector->immutable-vector (qf-set-items set)))
> ____________________
>  Racket Users list:
>  http://lists.racket-lang.org/users



Posted on the users mailing list.