[plt-scheme] Threading vs. System Threads

From: Eli Barzilay (eli at barzilay.org)
Date: Wed Feb 12 03:10:33 EST 2003

On Feb 11, Brent Fulgham wrote:
> [...]
> The conjecture is that the use of OS-level threads for concurrency
> leads to lower performance than Erlang's internal concurrency.  I
> like to imagine that MzScheme may share some of these positive
> features (Matthew, please feel free to correct me! :-)
> [...]

In the spirit of hugging threads, I made a quick test.  First of all,
a macro that evaluates a function call in parallel:

  (define-syntax (parallel stx)
    (syntax-case stx ()
      ((_ x ...)
       (with-syntax (((v ...) (generate-temporaries #'(x ...)))
                     (n (length (syntax->list #'(x ...)))))
         #'(let ((sema (make-semaphore 0))
                 (mutex (make-semaphore 1))
                 (counter n)
                 (v #f) ...)
             (thread (lambda ()
                       (set! v x)
                       (semaphore-wait mutex)
                       (set! counter (sub1 counter))
                       (semaphore-post mutex)
                       (when (zero? counter) (semaphore-post sema))))
             ...
             (semaphore-wait sema)
             (v ...))))))

Which immediately popped a question: is there any reason
make-semaphore can't handle negative arguments?  In this code, doing
this would mean that this both more elegant and faster version could
be used:

  (define-syntax (parallel stx)
    (syntax-case stx ()
      ((_ x ...)
       (with-syntax (((v ...) (generate-temporaries #'(x ...)))
                     (n (length (syntax->list #'(x ...)))))
         #'(let ((sema (make-semaphore (- 1 n)))
                 (v #f) ...)
             (thread (lambda ()
                       (set! v x)
                       (set! counter (sub1 counter))
                       (semaphore-post sema)))
             ...
             (semaphore-wait sema)
             (v ...))))))

But anyway, continue with a quick test to see that it works:

  > (time ((begin (sleep 1) +) (begin (sleep 1) 3) (begin (sleep 1) 4)))
  cpu time: 0 real time: 3000 gc time: 0
  7
  > (time (parallel (begin (sleep 1) +) (begin (sleep 1) 3) (begin (sleep 1) 4)))
  cpu time: 0 real time: 1000 gc time: 0
  7

And try a lots-of-threads example:

  > (define (fib n) (if (<= n 1) n (+ (fib (- n 1)) (fib (- n 2)))))
  > (time (fib 24))
  cpu time: 143 real time: 144 gc time: 0
  46368
  > (define (pfib n) (if (<= n 1) n (parallel + (pfib (- n 1)) (pfib (- n 2)))))
  > (time (pfib 24))
  cpu time: 12953 real time: 13192 gc time: 5468
  46368

I'm sure this is much better than a native thread implementation.

To see how many threads are actually running in parallel I modified
this to have a counter that was updated whenever a thread was created
or finished, with another thread that reports that value every
second.  And there are problems: during the pfib run, the reporter
thread was pretty much starved -- it would burp every once in a while
but with at most 33 threads.  To try and get a little better I added a
(sleep 0) to the parallel macro, but the result was pretty much the
same.  Finally I changed it into a (sleep 0.1), and my machine started
crawling (so I just killed the process, I actually don't need a reboot
now).  Maybe because with this it was fair enough that all those
processes were actually made?  For this kind of stuff, I guess
fairness would be necessary.

Anyay, it's was a cute macro to hack...

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                  http://www.barzilay.org/                 Maze is Life!


Posted on the users mailing list.