[racket] why does in-indexed / in-range slow down my program?

From: Danny Yoo (dyoo at hashcollision.org)
Date: Sun Jan 20 19:11:02 EST 2013

> As the profiler shows, the most time-consuming part lies in the for-loop of procedure "qf-connect!", and after a few experiments, I figure it out: it is the "in-indexed" procedure that makes it slow. I replaced "in-indexed" with separate calls to "in-range" and "in-vector" - the running time gets reduced from 4m to 27s. Why?

Hmmm.. Interesting question.  It also seems to be that using:

    (in-naturals)

instead of:

    (in-range n)

improves the performance a bit more: on my system, that single change
cuts down the runtime from about 22 seconds to about 15 seconds.
Interesting.  So we'd have to take a closer look at the implementation
of in-indexed and in-range to be able to better isolate what's taking
the time.


... wait... looking more closely at the algorithm... isn't this
supposed to be a straightforward union-find?  If so, why does there
need to be any vector searching going on here?  That is, that step in
qf-connect! that's walking all the elements in the vector.  Why in the
world is it doing that?

... I don't understand what this is computing yet... reading the code
a few more times...

... ok, I think I understand what this is doing now.  But it can be
greatly improved.

The representation that you have chosen for the elements of the set
are bare numbers.  However, I think you want to have more structure:
for example, to be able to immediately jump to the representative.
That way, you avoid hunting through every vector element to find the
representative, which is currently the dominating factor of your
program.

That is, it's certainly useful to optimize in-vector vs. in-indexed
vs. in-naturals, but maybe it's a better idea to avoid iteration
altogether!


I already have an implementation of the standard union-find algorithm
on PLaneT that you can use:

    http://planet.racket-lang.org/package-source/dyoo/union-find.plt/1/0/


Here's what the heart of your program should look like with it:

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
#lang racket/base

(require racket/list)
(require racket/string)
(require profile
         (planet dyoo/union-find/union-find))

(module+ main (profile-thunk main))

(define (main)
  (let* ([n (string->number (read-line))]
         [set (make-qf-set n)])
    (for ([line (in-lines)])
      (let* ([pair (string-split line " " #:trim? #f)]
             [p (string->number (car pair))]
             [q (string->number (cadr pair))])
        (unless (qf-connected? set p q)
          (qf-connect! set p q))))

    ;; TODO: do the output
    (printf "Number of components: ~s\n" (qf-count set))
    ))

(struct qf-set ([count #:mutable] forest) #:transparent)
(define (make-qf-set n) (qf-set n (make-qf-forest n)))
(define (make-qf-forest n)
  (define a-forest (make-forest 'equal))
  (for ([i (in-range n)]) (make-set a-forest i))
  a-forest)


;; qf-connected?: qf-set number number -> boolean
(define (qf-connected? set p q)
  (= (qf-find set p)
     (qf-find set q)))


;; better - it takes about 28s to finish
;; but still slower than the Golang version
(define (qf-connect! set p q)
  (let* ([p-root (qf-find set p)]
         [q-root (qf-find set q)]
         [forest (qf-set-forest set)])
    (unless (= p-root q-root)
      (union-set forest p-root q-root)
      (set-qf-set-count! set (sub1 (qf-set-count set))))))


;; qf-find: qf-set number -> number
(define (qf-find set p)
  (define forest (qf-set-forest set))
  (find-set forest p))

(define qf-count qf-set-count)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



It takes about 200 milliseconds to run to completion on my system.


Posted on the users mailing list.