[racket] web server stopped responding to TCP connections

From: Tony Garnock-Jones (tonyg at ccs.neu.edu)
Date: Mon Dec 29 18:21:40 EST 2014

On 12/29/2014 04:09 PM, George Neuner wrote:
> Over the weekend my webserver application suddenly stopped responding to
> TCP connections.  It was following a large spate of broken connections
> caused by debugging an issue on the browser side.

Great! I, too, have spotted this problem in at least three different
web-server applications. The server continues to run fine other than
accept()-wise, and can make outbound connections and do whatever else it
needs to, but incoming connections just never appear.

Do you have a reliable way of provoking the fault? I don't.

Are you using Racket's SSL support, or is it http-only?

Are you on Linux, a different Unixalike, or Windows?

I have only seen this so far using the web-server, and sadly also only
when SSL is enabled. For all I know, though, it could be something to do
with TCP sockets in general in Racket.

Watching the process with strace suggests that, once the server is in
the faulty state, accept() is never called on the server socket when
new connections arrive. However, netstat shows those connections all in
CLOSE_WAIT state, with the input from the client buffered and unread by
the server, which suggests the connections *are* being accepted
*somewhere* by the process. Supporting this view is the fact that there
are more such connections than allowed by the `backlog` parameter given
to listen(), so *something* must be accepting them.
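
The backlog comparison matters because the kernel completes TCP
handshakes on a listening socket by itself and buffers client input
for roughly `backlog` pending connections before any accept() call is
made. A few CLOSE_WAIT entries alone would therefore prove little.
Below is a minimal sketch of that behaviour, in Python rather than
Racket. The function name, payload and backlog value are made up for
illustration, and it says nothing about what the Racket web-server
does internally.

```python
import socket


def connect_without_accept(backlog=5, payload=b"GET / HTTP/1.0\r\n\r\n"):
    """Show that a client can connect, send and close before accept() runs.

    Hypothetical helper for illustration only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(backlog)
    port = listener.getsockname()[1]

    # No accept() has been called yet. The kernel still completes the
    # handshake and places the connection on the listener's queue.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(payload)  # buffered by the kernel on the server side
    client.close()           # the queued server-side socket now sees a FIN

    # Only now does the "server" accept. The buffered input is still there.
    listener.settimeout(5)
    conn, _addr = listener.accept()
    conn.settimeout(5)
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:  # empty read: the client already closed its side
            break
        chunks.append(chunk)
    conn.close()
    listener.close()
    return b"".join(chunks)


if __name__ == "__main__":
    print(connect_without_accept())
```

Running netstat between the client's close() and the accept() call
should show the queued connection with unread input, much like the
ones described above. What this sketch cannot explain is a count of
such connections well beyond the backlog.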

> log output to the console so
> I could monitor it, and, stupidly!, the output wasn't tee'd into a
> file.

(Aside: I heartily recommend running services using daemontools. The
"multilog" and "tai64nlocal" programs are great for recording and
viewing log output.)

> Nor does the problem seem to be easily repeatable - I've been trying
> various scenarios but so far I haven't been able to duplicate it.

Oh, bummer.

> I don't even know where to look for the problem.  Does anyone (Jay?)
> know what this is or have a suggestion as to how to diagnose it?

Hopefully others who've had this or related problems will speak up too!

Tony


Posted on the users mailing list.