[plt-scheme] interleaving the I and O in I/O
At Sun, 27 Jan 2008 09:36:40 -0500, Prabhakar Ragde wrote:
> Colleagues of mine discovered this issue when automarking simple
> programs written by second-year students. Consider the following program
> for reading numbers from a file and printing the sequence out twice,
> intended to be invoked on the command line with file redirection.
>
> (define (print-list l)
>   (cond
>     [(null? l) (void)]
>     [else (printf "~a\n" (car l))
>           (print-list (cdr l))]))
>
> (define (get-input acc)
>   (let ([i (read)])
>     (cond
>       [(eof-object? i) (reverse acc)]
>       [else (printf "~a\n" i)
>             (get-input (cons i acc))])))
>
> (define l (get-input '()))
> ;(print-list l)
> (print-list l)
>
> When I save this in prog.ss on my MacBookPro running v372 and type
>
> % time mzscheme -r prog.ss <bignums.txt >progout.txt
>
> where bignums has 200,000 numbers, I get the following timing information:
>
> real 0m3.775s
> user 0m2.221s
> sys 0m1.550s
>
> If I comment out the (printf ...) in get-input and uncomment the
> (print-list l), the times become:
>
> real 0m0.979s
> user 0m0.915s
> sys 0m0.054s
>
> I'm curious as to what is going on

When you read from the original input port, the original output port is
automatically flushed. So the first version writes to the file one line
at a time, while the second writes the list in blocks. I think that
accounts for the time difference; most of it shows up as system time
(1.550s versus 0.054s), which is what you'd expect from many small
writes.

I note that there's another layer of automatic behavior here: block
buffering for output is enabled because output is directed to a file,
as opposed to a tty.

For 4.0, we should reconsider auto-flush of the current output port
when reading from the current input port. Maybe it should only happen
when the output is a tty. It's also awkward that the behavior is tied
to the original ports, but that's another part of the trade-off between
performance and expected behavior, and it will probably stay.

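If the per-read flush is the cost, one way to keep the echo-while-reading
structure might be to send the echo to a string port and write it out
once at the end. The following is only a sketch under that assumption:
`echo-and-collect` is a made-up helper name, the input here is a string
port so the example runs without redirection, and I have not timed it
against the original program.

```scheme
;; Sketch only: echo each datum while reading, but into a string port
;; instead of stdout. `echo-and-collect` is a hypothetical helper.
(define (echo-and-collect in echo acc)
  (let ([i (read in)])
    (cond
      [(eof-object? i) (reverse acc)]
      [else (fprintf echo "~a\n" i)   ; accumulates in memory
            (echo-and-collect in echo (cons i acc))])))

;; Demo on a string port; in prog.ss, pass (current-input-port) instead.
(define echo (open-output-string))
(define l (echo-and-collect (open-input-string "10 20 30") echo '()))

(display (get-output-string echo))            ; first copy, written in one go
(for-each (lambda (x) (printf "~a\n" x)) l)   ; second copy
```

The trade-off is that the whole first copy sits in memory until the end
of input, which is fine for 200,000 numbers but not a general answer.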
> and what this means for program
> design in general.

The buffering and flushing rules try to balance performance with
expected behavior. In this case, an example of "expected behavior" is
that when you print text to stdout to ask users a question (maybe
without a newline), they see the question and can type an answer.

Maybe the general conclusion is that the balance between performance
and simplicity is rarely easy to find.

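To make the prompt case concrete, here is a small sketch of a program
that asks a question without a trailing newline. The helper name `ask`
and its port arguments are assumptions for illustration. The explicit
`flush-output` is what a program would need if it could not rely on the
automatic flush on read.

```scheme
;; Sketch only: `ask` is a hypothetical helper, not a library function.
;; It prints a question with no trailing newline, makes sure the text
;; has actually reached the output port, then reads one line as the answer.
(define (ask question in out)
  (display question out)
  (flush-output out)   ; explicit flush, in case nothing flushes for us
  (read-line in))

;; Interactive use would look like:
;;   (ask "How many numbers? " (current-input-port) (current-output-port))
```

With the current rules, the flush happens automatically when reading
from the original input port, so programs that omit it still show the
question before waiting for input.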
Matthew