[racket] place: terrible performance of place-channel-get?

From: Alexey Cherkaev (alexey.cherkaev at gmail.com)
Date: Thu Nov 13 02:15:57 EST 2014

Hi Matthew,

Thanks a lot for the reply!

This works! Thanks a lot.

Regards,
Alexey


On 12 Nov 2014, at 18:52, Matthew Flatt <mflatt at cs.utah.edu> wrote:

> I'll push a repair to the development version.
> 
> 
> The problem isn't so much that message copying/transfer is slow, but
> that the rule to trigger an all-places GC doesn't accommodate a large,
> not-yet-delivered message. I'll repair that rule.
> 
> Most of the process time in your example shows up as GC time, because
> the GC was continuously firing while the message waited for the new
> place to start and receive it (and the constant GCs slowed the place
> start-up).
> 
> 
> If upgrading is not an option, you can work around the problem by
> waiting for a "ready" message from the new place before sending the
> vector as a message. For example, change `test-places1` to
> 
> (define (test-places1)
>   (define p1
>     (place ch1
>            (place-channel-put ch1 'ready)
>            (define v (place-channel-get ch1))
>            (define w (long-computation v))
>            (place-channel-put ch1 w)))  
>   (place-channel-get p1) ; => 'ready
>   (place-channel-put p1 v1)
>   (time (place-channel-get p1)))
> 
> That way, `v1` doesn't sit in the message channel long enough to cause
> a problem.
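
The same handshake can be applied to the two-place test. What follows is only a minimal sketch under stated assumptions: `sum-vector` is a hypothetical stand-in for `long-computation`, and the helper names `start-worker` and `run-two-workers` do not appear in the thread. Each worker announces itself with 'ready before any vector is sent, so no large message sits undelivered while a place is still starting.

```racket
#lang racket
(require racket/place)

;; Hypothetical stand-in for the thread's `long-computation`.
(define (sum-vector v)
  (for/fold ([acc 0.0]) ([x (in-vector v)])
    (+ acc x)))

;; Start a worker place that reports 'ready before it expects any data.
(define (start-worker)
  (place ch
    (place-channel-put ch 'ready)
    (define v (place-channel-get ch))
    (place-channel-put ch (sum-vector v))))

;; Send one vector to each of two workers, but only after both are up.
(define (run-two-workers va vb)
  (define p1 (start-worker))
  (define p2 (start-worker))
  (place-channel-get p1) ; => 'ready
  (place-channel-get p2) ; => 'ready
  (place-channel-put p1 va)
  (place-channel-put p2 vb)
  (list (place-channel-get p1) (place-channel-get p2)))

;; Kept in a `main` submodule: a new place instantiates the enclosing
;; module, so a module-level call here would spawn places recursively.
(module+ main
  (run-two-workers (build-vector 100000 (lambda (i) (random)))
                   (build-vector 100000 (lambda (i) (random)))))
```

Both places are created before either 'ready message is awaited, so their start-up costs overlap rather than add up.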
> 
> At Tue, 11 Nov 2014 17:41:11 -0700, Matthew Flatt wrote:
>> This does seem extremely slow. A place-message send must copy the
>> vector to send it as a message, but the copy shouldn't take so long.
>> I'll investigate further.
>> 
>> Meanwhile, an option in this case might be to create a "shared
>> flvector", which can be passed directly (i.e., without copying) to
>> another place. I've enclosed a variant of your example to illustrate.
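
The enclosed file is not reproduced in this archive. As a rough illustration of the idea only, and not the attachment itself, a shared flvector can be allocated with `make-shared-flvector` from `racket/flonum`, sent to a place, and mutated there; the sender sees the changes because the memory is shared rather than copied. The names `start-transformer` and `transform-shared!` are assumptions made for this sketch.

```racket
#lang racket
(require racket/place racket/flonum)

;; Worker: receives a shared flvector, overwrites each element in place,
;; and sends back only a small 'done message (no bulk data is copied).
(define (start-transformer)
  (place ch
    (define fv (place-channel-get ch))
    (for ([i (in-range (flvector-length fv))])
      (define x (flvector-ref fv i))
      (flvector-set! fv i (fl* (flexp (fl- 0.0 x))
                               (flsin (fl* pi x)))))
    (place-channel-put ch 'done)))

;; Hand the shared flvector to a new place and wait until it is finished.
(define (transform-shared! fv)
  (define p (start-transformer))
  (place-channel-put p fv)
  (place-channel-get p) ; => 'done
  fv)

;; Kept in a `main` submodule so that the new place, which instantiates
;; the enclosing module, does not rerun this code.
(module+ main
  (define fv (make-shared-flvector 100000 0.0))
  (for ([i (in-range (flvector-length fv))])
    (flvector-set! fv i (random)))
  (transform-shared! fv)
  (for/fold ([acc 0.0]) ([x (in-flvector fv)])
    (fl+ acc x)))
```

Because the vector is shared, the sender must not read or write it until the 'done message arrives.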
>> 
>> At Mon, 10 Nov 2014 11:58:21 +0200, Alexey Cherkaev wrote:
>>> Hi,
>>> 
>>> I am looking at parallelising some numerical computation with Racket. I’ve 
>>> tried future/touch first. However, the data for computation is passed as 
>>> vectors, and in my experiments future/touch would always hit a 
>>> "synchronisation task", upon which all the multicore threads collapse into 
>>> a single-core serialised computation.
>>> 
>>> Now, I decided to try place. My idea is to make it similar to Common Lisp’s 
>>> LPARALLEL: create workers <= number of cores and distribute tasks into those 
>>> workers. The problem I have encountered, however, is that place-channel-get 
>>> seems to take forever to compute. Here is an example of some simulated 
>>> computation on a vector using two places and trying to run them in parallel:
>>> 
>>> #lang racket
>>> 
>>> (require racket/place)
>>> 
>>> (provide test-places1 test-places2 long-computation v1 v2 random-vector)
>>> 
>>> ;;; Utilities: 
>>> (define (random-list n)
>>>  (let loop ((i n) (r '()))
>>>    (if (zero? i)
>>>        r
>>>        (loop (sub1 i) (cons (random) r)))))
>>> 
>>> (define (random-vector n)
>>>  (let ((l (random-list n)))
>>>    (list->vector l)))
>>> 
>>> (define (vector-reduce f init v)
>>>  (let ((n (vector-length v)))
>>>    (let loop ((i 0) (r init))
>>>      (if (= i n)
>>>          r
>>>          (loop (add1 i) (f r (vector-ref v i)))))))
>>> 
>>> ;;; This is the computation to be run in each place:
>>> (define (long-computation v)
>>>  (let ((n (vector-length v))
>>>        (v1 (vector-copy v)))  ; v is immutable, if want to mutate, must copy it
>>>    (let loop ((i 0))
>>>      (if (= i n)
>>>          (begin
>>>            (sleep 2)         ; make it work for a bit longer
>>>            (vector-reduce + 0.0 v1)) ; to make result printable
>>>          (begin
>>>            ;; flonum computation
>>>            (vector-set! v1 i (* (exp (- (vector-ref v1 i)))
>>>                                 (sin (* pi (vector-ref v1 i)))))
>>>            (loop (add1 i)))))))
>>> 
>>> ;;; two vectors to be sent to long-computation
>>> (define v1 (random-vector 100000))
>>> (define v2 (random-vector 100000))
>>> 
>>> ;;; Test using one place:
>>> (define (test-places1)
>>>  (define p1
>>>    (place ch1
>>>           (define v (place-channel-get ch1))
>>>           (define w (long-computation v))
>>>           (place-channel-put ch1 w)))  
>>>  (place-channel-put p1 v1)
>>>  (time (place-channel-get p1)))
>>> 
>>> ;;; Test using 2 places:
>>> (define (test-places2)
>>>  (define p1
>>>    (place ch1
>>>           (define v (place-channel-get ch1))
>>>           (define w (long-computation v))
>>>           (place-channel-put ch1 w)))
>>>  (define p2
>>>    (place ch2
>>>           (define v (place-channel-get ch2))
>>>           (define w (long-computation v))
>>>           (place-channel-put ch2 w)))
>>>  (place-channel-put p1 v1)
>>>  (place-channel-put p2 v2)
>>>  (sleep 2) ; hypothetically, after this the results should be ready immediately!
>>>  (time (list (place-channel-get p1) (place-channel-get p2))))
>>> 
>>> Execution from racket on MacBook Pro with Intel Core 2 Duo:
>>> 
>>> -> (time (long-computation v1))
>>> cpu time: 42 real time: 2043 gc time: 0
>>> 39523.12275516648
>>> -> (test-places1)
>>> cpu time: 7593 real time: 7475 gc time: 7001
>>> 39523.12275516648
>>> -> (test-places2)
>>> cpu time: 16591 real time: 12492 gc time: 15485
>>> '(39523.12275516648 39505.415738171105)
>>> 
>>> So, the time of execution of (long-computation v1) and the time of getting 
>>> the result out of the channel in (test-places1) should be more or less the 
>>> same, but they are not. Furthermore, (test-places2) takes almost twice as 
>>> long as (test-places1) (note, I put (time …) around just getting the value, 
>>> so it does not include the time of creating the place).
>>> 
>>> Am I doing something wrong?
>>> 
>>> Cheers, Alexey
>>> 
>>> 
>>> ____________________
>>>  Racket Users list:
>>>  http://lists.racket-lang.org/users
>> ------------------------------------------------------------------------------
>> [application/octet-stream "shared-flvector-example.rkt"]


Posted on the users mailing list.