[plt-scheme] writing to subprocess's stdin and then...
On Jun 17, YC wrote:
> On Mon, Jun 16, 2008 at 10:40 PM, Eli Barzilay <eli at barzilay.org> wrote:
> >
> > * finally, the biggest problem is a conceptual one: you read stuff
> > from the output only after the process has finished -- but what
> > if it spits out a lot of data? In that case, it will not
> > finish, and instead just sit there waiting for you to read that
> > data, and you'll be getting into a very common race condition
> > with subprocesses.
> >
> > What you really need is to do the reading in a thread, so the
> > process can continue running. It might seem strange at first,
> > but when there's a lot of data then *someone* needs to hold it,
> > and the OS will hold only a very small amount (and for good
> > reasons). Your thread will need to do just that accumulation
> > (or it can just do the processing, whatever it is).
>
> After re-reading your example, I think I started to grok what you were doing
> on http://www.cs.brown.edu/pipermail/plt-scheme/2006-February/011953.html:
>
> ...
> (define-values (in out) (make-pipe)) ...
> ...
> (thread (lambda ()
> (copy-port pout out)
> (close-output-port out)
> (subprocess-wait p)))
>
> You first created a pipe for holding the accumulation, and then you
> started a thread to read the data from pout into pipe's out, and
> when out is closed the data gets piped to in (perhaps this is
> happening in the background without you having to close it too?), and
> finally the process exits... correct?
Actually that extra pipe and thread are not strictly needed. Same for
the use of /dev/null -- it can just close the subprocess's input right
after it fires it up. Below is a more compact and 4.0-ized example.
> But shouldn't the ports be closed after subprocess-wait?
You usually want to close the subprocess's input port so it will finish,
since many useful processes (at least on unix) work until their stdin
runs out.
Here's the revised example -- with no use of threads.
| #lang scheme
|
| (require scheme/port)
| (define (with-input-from-subprocess exe thunk)
|   (define-values (p pout pin perr)
|     (subprocess #f #f (current-error-port)
|                 (find-executable-path exe)))
|   (close-output-port pin)
|   (parameterize ([current-input-port pout])
|     (begin0 (thunk)
|       (subprocess-wait p))))
|
| (with-input-from-subprocess "du"
|   (lambda ()
|     (for ([line (in-lines)])
|       (printf ">> ~s\n" (regexp-split #rx"[/\t]" line)))))
But it still relies on using (current-error-port) for the subprocess's
stderr, which works only if that port is a file-stream port; that might
not be true if this function is called from a `parameterize'. The
mzlib/process code takes care of such cases -- you can just run
  (parameterize ([current-output-port (open-output-bytes)])
    (system "du")
    (get-output-bytes (current-output-port)))
If you look at that file, you'll see that in this case the code will
make the necessary pipes and a thread to transfer their contents.
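To give a rough idea of what "pipes and a thread" means here, the following is a minimal sketch of the technique and not the actual mzlib/process code. The name `run-to-current-ports' is made up for illustration. It assumes a unix-like system where the executable can be found on the PATH. It lets `subprocess' create its own OS-level pipes and starts one thread per stream to copy the data into whatever the current output and error ports happen to be.

```scheme
#lang scheme

(require scheme/port)

;; Hypothetical helper, not the mzlib/process implementation: run EXE with
;; ARGS and copy its stdout/stderr to the current output/error ports,
;; whatever kind of ports those are.
(define (run-to-current-ports exe . args)
  ;; capture the destination ports before any thread is created
  (define out (current-output-port))
  (define err (current-error-port))
  (define-values (p pout pin perr)
    ;; #f three times: `subprocess' creates the OS-level pipes itself
    (apply subprocess #f #f #f (find-executable-path exe) args))
  (close-output-port pin) ; the process sees end-of-file on its stdin
  ;; one copying thread per stream, so neither pipe fills up and blocks
  ;; the process while we wait for it
  (let ([ts (list (thread (lambda () (copy-port pout out)))
                  (thread (lambda () (copy-port perr err))))])
    (subprocess-wait p)
    (for-each thread-wait ts))
  (close-input-port pout)
  (close-input-port perr)
  (subprocess-status p)) ; the exit code, since the process has finished

;; Possible use, mirroring the `system' call above:
;;   (parameterize ([current-output-port (open-output-bytes)])
;;     (run-to-current-ports "du")
;;     (get-output-bytes (current-output-port)))
```

Since the copying threads write to ordinary Scheme ports, this should work even when the current ports are string or byte ports, which is the case that passing (current-error-port) straight to `subprocess' cannot handle.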
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://www.barzilay.org/ Maze is Life!