[plt-scheme] Threading vs. System Threads
On Feb 11, Brent Fulgham wrote:
> [...]
> The conjecture is that the use of OS-level threads for concurrency
> leads to lower performance than Erlang's internal concurrency. I
> like to imagine that MzScheme may share some of these positive
> features (Matthew, please feel free to correct me! :-)
> [...]
In the spirit of hugging threads, I made a quick test. First of all,
a macro that evaluates a function call in parallel:
(define-syntax (parallel stx)
(syntax-case stx ()
((_ x ...)
(with-syntax (((v ...) (generate-temporaries #'(x ...)))
(n (length (syntax->list #'(x ...)))))
#'(let ((sema (make-semaphore 0))
(mutex (make-semaphore 1))
(counter n)
(v #f) ...)
(thread (lambda ()
(set! v x)
(semaphore-wait mutex)
(set! counter (sub1 counter))
(semaphore-post mutex)
(when (zero? counter) (semaphore-post sema))))
...
(semaphore-wait sema)
(v ...))))))
Which immediately popped a question: is there any reason
make-semaphore can't handle negative arguments? In this code, doing
this would mean that this both more elegant and faster version could
be used:
(define-syntax (parallel stx)
(syntax-case stx ()
((_ x ...)
(with-syntax (((v ...) (generate-temporaries #'(x ...)))
(n (length (syntax->list #'(x ...)))))
#'(let ((sema (make-semaphore (- 1 n)))
(v #f) ...)
(thread (lambda ()
(set! v x)
(set! counter (sub1 counter))
(semaphore-post sema)))
...
(semaphore-wait sema)
(v ...))))))
But anyway, continue with a quick test to see that it works:
> (time ((begin (sleep 1) +) (begin (sleep 1) 3) (begin (sleep 1) 4)))
cpu time: 0 real time: 3000 gc time: 0
7
> (time (parallel (begin (sleep 1) +) (begin (sleep 1) 3) (begin (sleep 1) 4)))
cpu time: 0 real time: 1000 gc time: 0
7
And try a lots-of-threads example:
> (define (fib n) (if (<= n 1) n (+ (fib (- n 1)) (fib (- n 2)))))
> (time (fib 24))
cpu time: 143 real time: 144 gc time: 0
46368
> (define (pfib n) (if (<= n 1) n (parallel + (pfib (- n 1)) (pfib (- n 2)))))
> (time (pfib 24))
cpu time: 12953 real time: 13192 gc time: 5468
46368
I'm sure this is much better than a native thread implementation.
To see how many threads are actually running in parallel I modified
this to have a counter that was updated whenever a thread was created
or finished, with another thread that reports that value every
second. And there are problems: during the pfib run, the reporter
thread was pretty much starved -- it would burp every once in a while
but with at most 33 threads. To try and get a little better I added a
(sleep 0) to the parallel macro, but the result was pretty much the
same. Finally I changed it into a (sleep 0.1), and my machine started
crawling (so I just killed the process, I actually don't need a reboot
now). Maybe because with this it was fair enough that all those
processes were actually made? For this kind of stuff, I guess
fairness would be necessary.
Anyay, it's was a cute macro to hack...
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://www.barzilay.org/ Maze is Life!