[racket] Remote execution in Racket

From: Eli Barzilay (eli at barzilay.org)
Date: Wed Aug 25 10:38:28 EDT 2010

Roughly what Noel said, but with an addition.

IIUC, the intention of `execute' is to run stuff in some known
context, where this is sometimes a temporary directory, and sometimes
a batch job run on your cluster.  So it looks easy to generalize it
(for example, add some keyword) to do either kind of execution.  The
plain execute would be a macro that does what it does now, and the
cluster-mode execute would instead take its body and create a racket
file containing just that body, to be batch-executed.  (And the
created body could be in a slightly different language where the
default `execute' is either a no-op or the temporary-directory thing,
depending on what you want in that case.)
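
For concreteness, a rough sketch of that shape -- the macro and the
helpers below are made up for illustration, not bein's actual code:

~~~
#lang racket

;; Plain mode: run the body in a fresh temporary directory.
;; Batch mode (#:batch): instead of running the body here, write it
;; out as a small module to be submitted to the cluster.  Quoting the
;; body like this is a simplification; a real version would have to
;; deal with bindings from the enclosing module.
(define-syntax execute
  (syntax-rules ()
    [(_ #:batch body ...)
     (write-batch-module '(body ...))]
    [(_ body ...)
     (in-temporary-directory (lambda () body ...))]))

(define (in-temporary-directory thunk)
  (define dir (make-temporary-file "execute-~a" 'directory))
  (parameterize ([current-directory dir])
    (thunk)))

(define (write-batch-module forms)
  (define file (make-temporary-file "job-~a.rkt"))
  (with-output-to-file file #:exists 'truncate
    (lambda ()
      (displayln "#lang s-exp bein")
      (for ([form (in-list forms)])
        (write form)
        (newline))))
  file)
~~~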

On Aug 25, Frederick Ross wrote:
> On Tue, Aug 24, 2010 at 7:08 PM, Richard Cleis <rcleis at mac.com> wrote:
> > Can you provide an example? (At this point, one of the profs around here
> > asserts, "Code, please." :)
> 
> You're right, code will clarify this.  I should have started out by
> asking what people would recommend for doing this.  My users end up
> writing things that look like:
> 
> ~~~
> #lang s-exp bein
> 
> (execute
>   (run "/bin/touch" "boris")
>   (import "boris")
>   (run-binding ((a 'stdout)) "/bin/echo" "This is a test")
>   (import "This is a test"))
> ~~~
> 
> execute sets up a temporary directory, and each command inside works
> therein.  run is a wrapper around subprocess that does a bunch of
> behind-the-scenes work.  import pulls a file from the temporary
> directory into a managed repository (stored in an SQLite database in
> the user's home directory).
> 
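A guess at what such a wrapper might look like, using the stock
system* convenience rather than raw subprocess (illustration only, not
bein's implementation):

~~~
#lang racket

;; Run a program in the current directory (which `execute' will have
;; parameterized to the temporary directory) and signal an error on a
;; non-zero exit status.
(define (run program . args)
  (unless (apply system* program args)
    (error 'run "~a exited with a non-zero status" program)))

;; e.g. (run "/bin/touch" "boris")
~~~
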
> execute is nestable as well.  For instance,
> 
> ~~~
> (define (f x) (execute (run "/bin/touch" x)))
> (execute
>   (f "boris")
>   (import "boris"))
> ~~~
> 
> only sets up a single temporary directory.  The execute in the
> function f is flattened into the outermost enclosing execute.
> 
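That flattening could presumably be done with a parameter that records
the temporary directory of the outermost execute, so nested calls just
reuse it.  A minimal sketch of that part alone, with made-up names:

~~~
#lang racket

;; #f outside any execute; otherwise the outermost temporary directory.
(define current-execute-dir (make-parameter #f))

(define (call-with-execute-context thunk)
  (if (current-execute-dir)
      (thunk)                        ; nested: flatten into the outer one
      (let ([dir (make-temporary-file "execute-~a" 'directory)])
        (parameterize ([current-execute-dir dir]
                       [current-directory dir])
          (thunk)))))

(define-syntax-rule (execute body ...)
  (call-with-execute-context (lambda () body ...)))
~~~
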
> This has to run on a cluster.  The cluster consists of a couple of
> frontend machines users log into, plus fifty or sixty nodes that do
> nothing but run batch jobs.  The batch jobs are submitted via the
> command line or via a C API, and generally look like
> 
> $ bsub /path/to/program arg1 arg2 ...
> 
> Basically, it takes shell commands.  The C API does as well, but takes
> an argument of type char**.  I want to write a version of execute that
> submits its body as a batch job via LSF.  Once the submission happens
> it need not maintain any communication with the original process,
> since the disk (and thus the SQLite database that maintains state) is
> shared across the cluster.
> 
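Given a file-producing execute along the lines sketched earlier, the
submission step could be little more than a call to bsub; the bsub and
racket paths below are assumptions:

~~~
#lang racket

;; Submit a generated job module to LSF.  After bsub returns, no
;; further communication with this process is needed, since state
;; lives in the shared SQLite database.
(define (submit-job job-file)
  (unless (system* "/usr/bin/bsub"            ; assumed path to bsub
                   "/usr/local/bin/racket"    ; assumed path to racket
                   (path->string job-file))
    (error 'submit-job "bsub submission failed for ~a" job-file)))
~~~
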
> I would also like to be able to write things like
> 
> ~~~
> (define (f x) (execute-lsf ...stuff with x...))
> (map f '(list of things to apply f to))
> ~~~
> 
> Someone pointed me to the serializable continuations in the stateless
> servlet code, which look like just about the perfect solution, but
> they seem to be tied to that framework.
> 
> Any suggestions would be very welcome.

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

