[racket] place: terrible performance of place-channel-get?

From: Alexey Cherkaev (alexey.cherkaev at gmail.com)
Date: Mon Nov 10 04:58:21 EST 2014


I am looking at parallelising some numerical computation with Racket. I’ve tried future/touch first. However, the data for the computation is passed as vectors, and in my experiments with future/touch it would always hit a "synchronisation task", upon which all the multicore threads collapse into a serialised computation on one core.

Now, I decided to try place. My idea is to make it similar to Common Lisp’s LPARALLEL: create workers <= number of cores and distribute tasks into those workers. The problem I have encountered, however, is that place-channel-get seems to take forever to compute. Here is an example of some simulated computation on a vector using two places and trying to run them in parallel:

#lang racket

(require racket/place)

(provide test-places1 test-places2 long-computation v1 v2 random-vector)

;;; Utilities: 
(define (random-list n)
  (let loop ((i n) (r '()))
    (if (zero? i)
        r                                      ; done: return the accumulated list
        (loop (sub1 i) (cons (random) r)))))

(define (random-vector n)
  (let ((l (random-list n)))
    (list->vector l)))

(define (vector-reduce f init v)
  (let ((n (vector-length v)))
    (let loop ((i 0) (r init))
      (if (= i n)
          r                                    ; done: return the accumulated value
          (loop (add1 i) (f r (vector-ref v i)))))))

;;; This is the computation to be run in each place:
(define (long-computation v)
  (let ((n (vector-length v))
        (v1 (vector-copy v)))  ; v is immutable; to mutate it, we must copy it
    (let loop ((i 0))
      (if (= i n)
          (begin
            (sleep 2)         ; make it work for a bit longer
            (vector-reduce + 0.0 v1)) ; to make result printable
          (begin
            (vector-set! v1 i (* (exp (- (vector-ref v1 i)))
                                 (sin (* pi (vector-ref v1 i)))))   ; flonum computation
            (loop (add1 i)))))))

;;; Two vectors to be sent to long-computation
(define v1 (random-vector 100000))
(define v2 (random-vector 100000))

;;; Test using one place:
(define (test-places1)
  (define p1
    (place ch1
           (define v (place-channel-get ch1))
           (define w (long-computation v))
           (place-channel-put ch1 w)))  
  (place-channel-put p1 v1)
  (time (place-channel-get p1)))

;;; Test using 2 places:
(define (test-places2)
  (define p1
    (place ch1
           (define v (place-channel-get ch1))
           (define w (long-computation v))
           (place-channel-put ch1 w)))
  (define p2
    (place ch2
           (define v (place-channel-get ch2))
           (define w (long-computation v))
           (place-channel-put ch2 w)))
  (place-channel-put p1 v1)
  (place-channel-put p2 v2)
  (sleep 2) ; hypothetically, after this the results should be ready immediately!
  (time (list (place-channel-get p1) (place-channel-get p2))))

Execution from racket on a MacBook Pro with an Intel Core 2 Duo:

-> (time (long-computation v1))
cpu time: 42 real time: 2043 gc time: 0
-> (test-places1)
cpu time: 7593 real time: 7475 gc time: 7001
-> (test-places2)
cpu time: 16591 real time: 12492 gc time: 15485
'(39523.12275516648 39505.415738171105)

So, the time to execute (long-computation v1) and the time to get the result out of the channel in (test-places1) should be more or less the same, but they are not. Furthermore, (test-places2) takes almost twice as long as (test-places1) (note that I put (time …) around just getting the value, so it should not include the time to create the place).
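
A minimal sketch of one way to check that last assumption, by timing place start-up separately from the channel round trip. The function test-startup and the 'ready handshake are hypothetical additions, not part of the program above, and this only measures where the time goes; it does not by itself explain it:

```racket
#lang racket

(require racket/place)

(provide test-startup)

;;; Hypothetical helper: time start-up and the round trip separately.
(define (test-startup)
  (define p
    (place ch
           (place-channel-put ch 'ready)        ; sent once the place is up and running
           (define v (place-channel-get ch))
           (place-channel-put ch (vector-length v))))
  ;; First measurement: how long until the place reports that it has started.
  (time (place-channel-get p))
  ;; Second measurement: send a vector and wait for a trivial reply,
  ;; so the time is mostly message passing rather than computation.
  (define v (make-vector 100000 0.5))
  (place-channel-put p v)
  (time (place-channel-get p)))
```

If the first (time …) accounts for most of the delay, the cost would be in starting the place rather than in place-channel-get itself.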

Am I doing something wrong?

Cheers, Alexey


Posted on the users mailing list.