[racket] Racket and concurrency

From: Rüdiger Asche (rac at ruediger-asche.de)
Date: Tue Jul 3 16:07:30 EDT 2012

>
> 2) Is it better to do the concurrency at the highest level possible or
> at the lowest level possible. I.e. should I be processing pages
> concurrently or should I go to a much lower level and only be
> processing letters concurrently.  Does it matter?
>
> 3)  How does hyperthreading affect the number of places or futures I
> can run concurrently? For example if I have an i7 with 4 cores and
> hyperthreading, will that run 4 or 8 places concurrently?
>

Actually, a lot of it depends on two things: a) whether your computations
are I/O bound or CPU bound (in the case of OCR, I assume, they are CPU bound)
and b) whether you benefit at all from a "short" computation being completed
before it would have had its turn in a "linear" (sequential) computation.

If all your threads are CPU bound and your task isn't completed before the
last bit of it is completed - all the concurrency you can get out of it is
the parallelism across several processors (provided, of course, that both the
operating system and the language support using those processors
efficiently). In the "worst case," you only have one processor and there is
no point in doing anything before your entire document is processed. If
that's the case, parallelizing your computations actually is your *worst*
option because the total number of CPU cycles needed to do the whole job
doesn't change regardless of how you split up the computations, but the
parallel version adds context switching overhead! In that scenario, you
might as well (even better) launch one batch file to do all pieces in turn,
go to sleep and come back the next morning to see the task done.

If, however, a particular task has a lot of "gaps" - meaning that an I/O
system or a coprocessor periodically relieves the CPU(s) - concurrent
programming can help you keep the CPU(s) at 100% utilization by filling the
gaps with other computations, thus "squeezing together" the computations.
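
To illustrate the "gaps" case, here is a minimal sketch in Racket. The
function fake-io-task is a made-up stand-in that just sleeps to imitate
waiting on a disk or scanner; the 0.2 second wait and the four tasks are
arbitrary assumptions, not measurements from any real OCR job.

```racket
#lang racket

;; Hypothetical stand-in for an I/O-bound step: sleeping leaves the CPU
;; idle, much like waiting on a disk or a scanner would.
(define (fake-io-task id)
  (sleep 0.2)
  id)

;; Run thunk and return the elapsed wall-clock time in milliseconds.
(define (elapsed-ms thunk)
  (define start (current-inexact-milliseconds))
  (thunk)
  (- (current-inexact-milliseconds) start))

(define ids '(1 2 3 4))

;; One task after another: the waits add up.
(define sequential-ms
  (elapsed-ms (lambda () (for-each fake-io-task ids))))

;; One thread per task: the waits overlap.
(define concurrent-ms
  (elapsed-ms
   (lambda ()
     (define workers
       (for/list ([id ids])
         (thread (lambda () (fake-io-task id)))))
     (for-each thread-wait workers))))

(printf "sequential: ~a ms, concurrent: ~a ms\n"
        (round sequential-ms) (round concurrent-ms))
```

The sequential run should take roughly four times the single wait, while
the threaded run should take roughly one wait, because the threads spend
their time waiting rather than computing.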

Yet another story is when you can benefit from individual computations being
completed as fast as possible. Assume, for example, that some GUI frontend
can offer to display an individual page as soon as it (the page) has been
processed completely, for the user to scrutinize. In that case, needless to
say, you don't want a 100M document to clog up the CPU while a dozen very
small documents would be ready in no time flat, so that the person in front
of the GUI could already start working on the small documents. In that case,
parallelizing would make sense even if every single task took up 100% of the
CPU cycles (thus paying context switch overhead) because the small
computations are perceived to be completed faster. So the choice isn't black
and white but instead depends very strongly on the task set.
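
A rough sketch of that situation in Racket, with one thread per document
and a channel that reports documents as they finish. The names
process-document, documents and done, as well as the document sizes, are
made up for illustration; the loop only burns CPU cycles and does no real
OCR.

```racket
#lang racket

;; Hypothetical CPU-bound stand-in for processing a document of a given size.
(define (process-document size)
  (for/fold ([acc 0]) ([i (in-range size)])
    (+ acc (modulo i 7))))

;; Document names and sizes are invented; the huge one is started first.
(define documents
  '(("huge" . 30000000) ("small-1" . 1000) ("small-2" . 1000)))

;; Each worker reports the name of its finished document on this channel.
(define done (make-channel))

(for ([doc documents])
  (thread (lambda ()
            (process-document (cdr doc))
            (channel-put done (car doc)))))

;; Collect the names in the order in which the documents finish.
(define completion-order
  (for/list ([i (in-range (length documents))])
    (channel-get done)))

(printf "finished in this order: ~a\n" completion-order)
```

Because Racket threads are preempted, the small documents will typically be
reported before the huge one even though the huge one was started first. A
GUI could display each name as soon as it arrives on the channel.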

I hope I didn't state the obvious; all of the above (aside, of course, from
not being specific to Racket at all) describes fairly elementary concurrent
application design. Apologies if I have pitched this at too elementary a
level.

> 4) Are there any "gotcha's" I need to look out for?
>

I don't know if anyone has measured the costs of concurrency in Racket yet -
I assume that wherever possible (e.g. on Windows, where fibers are available)
the concurrency layers have been tailored to suit whatever the underlying
hardware and software offer, but be aware that concurrency never comes free.
So do run some elementary tests on the cost of concurrency - e.g., (time)
both single-threaded and multithreaded versions of similar computations and
see what the overhead is before you dive into concurrent programming. Also
(and this, too, is rather elementary, and again, please don't take it as an
insult if I state what may sound trivial) keep in mind that concurrent
software design is still one of the outer limits of computing - one of the
things that won't work if you don't do it well enough but likewise won't
work if you do it too well. Synchronization problems are among the worst to
debug and trace because they are rarely reproducible and tend to manifest
themselves differently each time they occur.
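
Such an elementary test could look roughly like the sketch below. The
function busy-work is a made-up CPU-bound placeholder, and the chunk count
and loop size are arbitrary assumptions. As far as I know, threads created
with thread do not by default run in parallel on several cores, whereas
futures may, so the three timings can differ quite a bit from machine to
machine.

```racket
#lang racket
(require racket/future)

;; Hypothetical CPU-bound stand-in for one unit of OCR work.
(define (busy-work n)
  (for/fold ([acc 0]) ([i (in-range n)])
    (+ acc (modulo i 7))))

(define chunks 4)      ; arbitrary number of work units
(define n 5000000)     ; arbitrary amount of work per unit

;; Baseline: all chunks one after another in a single thread.
(define (run-sequential)
  (for/list ([c (in-range chunks)])
    (busy-work n)))

;; One Racket thread per chunk; each thread stores its result in a slot.
(define (run-threads)
  (define results (make-vector chunks #f))
  (define workers
    (for/list ([c (in-range chunks)])
      (thread (lambda () (vector-set! results c (busy-work n))))))
  (for-each thread-wait workers)
  (vector->list results))

;; One future per chunk; futures may run in parallel on multiple cores.
(define (run-futures)
  (define fs
    (for/list ([c (in-range chunks)])
      (future (lambda () (busy-work n)))))
  (map touch fs))

(printf "processors: ~a\n" (processor-count))

;; time prints cpu, real and gc times and returns the expression's result.
(define seq-results (time (run-sequential)))
(define thread-results (time (run-threads)))
(define future-results (time (run-futures)))
```

Comparing the "real" figures printed by time for the three runs gives a
first impression of what the overhead (or the gain) is for this kind of
work on a particular machine.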


> Thanks,
> Harry
> ____________________
>  Racket Users list:
>  http://lists.racket-lang.org/users
> 


Posted on the users mailing list.