[racket-dev] regexp.c and lookahead
At Sat, 14 Jun 2014 18:18:05 -0400, Tony Garnock-Jones wrote:
> At the moment, when regexp.c runs out of buffered lookahead during a
> regexp-try-match, it peeks a few bytes. However, it looks like it will
> never peek *fewer* than 16 bytes (unless eof occurs before then).
I don't think that's right:
(define-values (i o) (make-pipe))
(write-bytes #"abcd" o) ; note: `o` is not closed
(regexp-try-match #rx"^a" i)
; => '(#"a")
Internally the regexp-matching functions call
scheme_get_byte_string_unless() with a 6th argument of 1, which
corresponds to `peek-bytes-avail!`.
The call will request at least 16 bytes on each peek, but the matcher
will accept a single byte to try to make progress.
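To see the short-peek behavior outside the matcher, here is a minimal sketch that calls `peek-bytes-avail!` directly on a pipe. The 16-byte buffer only mirrors the request size mentioned above; the names `buf` and `n` are just for illustration.

```racket
#lang racket/base

(define-values (i o) (make-pipe))
(write-bytes #"abcd" o)            ; `o` stays open, so no eof is pending

;; Ask for up to 16 bytes; the call returns as soon as at least one
;; byte is available, so it may deliver fewer than requested.
(define buf (make-bytes 16))
(define n (peek-bytes-avail! buf 0 #f i))

(printf "peeked ~a byte(s): ~s\n" n (subbytes buf 0 n))
```

Only 4 bytes have been written, so the peek cannot return more than that, and it does not wait for the remaining 12.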
> I have written the package "incremental-input" which lets a blocking
> read (e.g. read-json) be fed input as it becomes available, event-style.
>
> When testing using read-json from the "json" collect, I find that it
> blocks unnecessarily even though a complete input is available.
I think the problem is in your port implementation. Your
`incremental-read-bytes!` tries to block (and emit a message) instead
of returning a result to indicate that no more input is ready, and that
doesn't work in larger combinations. Since you don't supply a "peek"
function for the port, the immediate combination is that your port's
"read" function is used to implement peeks. A port's read function
really needs to be non-blocking.
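As a rough sketch of what a non-blocking read function can look like: the port below is built with `make-input-port` and no "peek" function, and its read function returns 0 when nothing is queued. The `pending` list and `feed!` helper are hypothetical stand-ins for your package's queue, not its actual code. As I read the `make-input-port` documentation, returning an event instead of 0 is preferable, since it avoids busy-waiting.

```racket
#lang racket/base

;; Hypothetical stand-in for the package's queue: a list of byte strings.
(define pending '())
(define (feed! bs) (set! pending (append pending (list bs))))

;; Read function: never blocks. Returns 0 when no input is queued.
(define (read-in buf)
  (cond
    [(null? pending) 0]
    [else
     (define chunk (car pending))
     (define n (min (bytes-length buf) (bytes-length chunk)))
     (bytes-copy! buf 0 chunk 0 n)
     ;; Keep any unread tail of the chunk at the front of the queue.
     (set! pending
           (if (= n (bytes-length chunk))
               (cdr pending)
               (cons (subbytes chunk n) (cdr pending))))
     n]))

;; No "peek" function is supplied, so peeks go through `read-in`.
(define in (make-input-port 'sketch read-in #f void))

(feed! #"abcd")
(define m (regexp-try-match #rx"^a" in))
```

Here the matcher only needs the first byte, so it can finish with the 4 bytes that were fed.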
You can make your tests pass most of the time(!) by changing
    [(queue-empty? ports)
     (suspend)
     (retry)]
to
    [(queue-empty? ports)
     (cond
       [(zero? (random 100))
        (suspend)
        (retry)]
       [else 0])]
and that's obviously a hack, but it should illustrate that regexp
matching can be happy to work with the bytes that it has been given ---
if a port properly reports that no more bytes are available.
I'm not sure I understand your overall goal, but it seems like you're
trying to implement `read-json-evt` in terms of `read-json`, or more
generally implement R`-evt` in terms of R. Is there a reason you can't
just call R in a separate thread and wrap that attempt up as an event?
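A minimal sketch of that thread-based approach, with R = `read-json`. The function `read-json-evt` here is hypothetical (it is not part of the "json" collection), and the sketch does not propagate exceptions raised in the helper thread.

```racket
#lang racket/base
(require json)

;; Hypothetical helper: run `read-json` in its own thread and deliver
;; the result through a channel. A channel is itself a synchronizable
;; event, so the caller can `sync` on it.
(define (read-json-evt in)
  (define ch (make-channel))
  (thread (lambda () (channel-put ch (read-json in))))
  ch)

(define-values (i o) (make-pipe))
(define evt (read-json-evt i))

;; Feed the input in two pieces; the reader thread blocks in between,
;; but the main thread does not.
(write-bytes #"[1, " o)
(write-bytes #"2]" o)

(define result (sync/timeout 5 evt))
```

The blocking happens only inside the helper thread, so the caller can combine `evt` with other events in a single `sync`.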
I also suspect that you want a poll operation that reliably fails if
progress is not possible until something more is done externally.
Racket's concurrency system supports that concept; see
`poll-guard-evt`.
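A small sketch of how `poll-guard-evt` might be used for this, assuming "progress is possible" simply means that the port has a byte ready. The helper `input-ready-evt` and the `'ready` result are made up for illustration.

```racket
#lang racket/base

;; Hypothetical helper: an event that, when merely polled, fails right
;; away unless input is already available on `in`.
(define (input-ready-evt in)
  (poll-guard-evt
   (lambda (poll?)
     (cond
       ;; Polling: decide now, without relying on other threads.
       [poll? (if (byte-ready? in)
                  (wrap-evt always-evt (lambda (_) 'ready))
                  never-evt)]
       ;; Blocking sync: an input port is ready as an event once
       ;; input is available.
       [else (wrap-evt in (lambda (_) 'ready))]))))

(define-values (i o) (make-pipe))
(define before (sync/timeout 0 (input-ready-evt i)))  ; nothing written yet
(write-bytes #"x" o)
(define after (sync/timeout 0 (input-ready-evt i)))   ; one byte available
```

With a timeout of 0 the first poll finds nothing and the second finds the byte that was written.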