[plt-scheme] intermittent "Connection reset by peer" with web server on mac
At Wed, 09 Jul 2008 10:19:11 -0700, Simon Michael wrote:
> I worked through the systems programming tutorial again last night, with
> mzscheme 4.0.2 on mac osx leopard, and observed intermittent connection
> failures when making concurrent connections. eg this fails pretty often:
>
> ab -n 1000 -c 100 http://localhost:8080/
I can reproduce this on Leopard: I fairly regularly see

  apr_socket_recv: Connection reset by peer (54)

from `ab'.

Occasionally, I see the error that you reported on the server side:
> === context ===
> /Users/simon/src/serve.ss:36:0: handle
> /Users/simon/src/serve.ss:23:12
>
> regexp-match: expects type <string, byte string, or input port> as 2nd
> argument, given: #<eof>; other arguments were: #rx"^GET (.+)
> HTTP/[0-9]+\\.[0-9]+"
but that seems to be a result of `ab' terminating (due to the other error).

As far as I can tell, the source of the "Connection reset by peer"
problem is actually in the OS:

I reduced the server to just `tcp-accept' (don't read, don't close,
etc.), and I even hacked `tcp-accept' to immediately return after the
accept() call. With those changes, I could get the "Connection reset by
peer" error with `ab -n 40 -c 20'.
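
For reference, the reduced server would look roughly like the sketch
below. The name `serve-accept-only' is made up for illustration, the #t
(address reuse) argument is an assumption carried over from the
tutorial, and the hack inside `tcp-accept' itself is not shown.

```scheme
#lang scheme

;; Hypothetical reduction of the tutorial server: accept connections
;; and do nothing else. The accepted ports are deliberately never read
;; from or closed.
(define (serve-accept-only port-no)
  ;; Backlog of 5, as in the original server.
  (define listener (tcp-listen port-no 5 #t))
  (let loop ()
    (define-values (in out) (tcp-accept listener))
    (loop)))
```

Calling (serve-accept-only 8080) blocks forever, so it is easiest to
run it in its own thread while `ab' runs against it.
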
But if I replace the 5 passed to `tcp-listen' with 100 or more ---
making the TCP listener "backlog" larger than the number of attempted
concurrent connections --- then I'm unable to trigger the "Connection
reset by peer" error in the original server and `ab' configuration. If
I then raise the `ab' concurrency to 200, the errors come back.

(According to system headers, the maximum backlog value in Leopard is
128 (SOMAXCONN in <sys/socket.h>), so it doesn't help to pass a larger
value to `tcp-listen'.)
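
A minimal sketch of that change, assuming the listener is created the
way the tutorial does it. The helper name `make-listener' is
hypothetical, and #t (address reuse) is an assumption; only the backlog
argument matters here.

```scheme
#lang scheme

;; Hypothetical helper: open a listener whose backlog exceeds the
;; expected number of concurrent connection attempts.
;; 128 is the Leopard maximum mentioned above; larger values don't help.
(define (make-listener port-no)
  (tcp-listen port-no 128 #t))   ; the original server passed 5 here
```

With this backlog, `ab -n 1000 -c 100' stays below the limit, while a
concurrency of 200 would still exceed it.
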
I rarely hit OS bugs at this level, and I haven't yet tried to create
small C programs to demonstrate the problem. Still, as far as I can
tell, listen()/accept() is not working correctly on Mac OS X (I can't
reproduce the problem on Linux), and a workaround is to raise the
backlog value.

Matthew