[racket-dev] Suggestions for monitoring unresponsive web server connection?

From: Jay McCarthy (jay.mccarthy at gmail.com)
Date: Sat Sep 10 18:18:05 EDT 2011

I agree with the other things that have been recommended.

As far as the file descriptors go, IIRC the Racket VM uses the "old"
select()-based I/O multiplexing on Linux, which is limited to FD_SETSIZE
file descriptors (a fixed compile-time constant, typically 1024 with
glibc). It is unlikely you are hitting that.
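
To check whether descriptors really are piling up, the process can report its own count. This is a minimal sketch, assuming Linux (it reads /proc/self/fd); the names open-fd-count and start-fd-monitor are made up for illustration, and the count is approximate because listing the directory briefly opens a descriptor itself.

```racket
#lang racket/base

;; Count the file descriptors currently open in this Racket process.
;; Linux-only assumption: relies on the /proc/self/fd directory.
(define (open-fd-count)
  (length (directory-list "/proc/self/fd")))

;; Hypothetical helper: log the count periodically from a background
;; thread so that growth over time shows up in the server's output.
(define (start-fd-monitor [interval-seconds 60]
                          [out (current-error-port)])
  (thread
   (lambda ()
     (let loop ()
       (fprintf out "~a: ~a open file descriptors\n"
                (current-seconds) (open-fd-count))
       (flush-output out)
       (sleep interval-seconds)
       (loop)))))
```

Calling (start-fd-monitor) once at server startup would be enough; a count that climbs steadily toward the FD_SETSIZE limit mentioned above would point at a leak.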

As far as the SSL errors go, I see those errors a lot on my HTTPS
servers. Nothing ever seems to be broken. My guess is that it is a
strange interaction among HTTP 1.1 keepalives, SSL, and strange browsers.
I've never noticed performance problems or access problems because of
it, though.

As far as logging in general goes, the default Web server logs mimic
Apache's. My attitude is that if you want more extensive diagnostic
information, you can create a dispatcher that collects that
information. For example, you could do something like:

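(require net/url web-server/http)

;; the-log can be any output port; stderr is only a placeholder.
(define the-log (current-error-port))

(define (make-logging-wrap inner-dispatch)
  (lambda (conn req)
    (define uniq (gensym))
    ;; Log the request URI; substitute whatever request fields you need.
    (define info (url->string (request-uri req)))
    (fprintf the-log "~a: ~a: Starting to handle ~a\n"
             (current-seconds) uniq info)
    (inner-dispatch conn req)
    (fprintf the-log "~a: ~a: Done handling ~a\n"
             (current-seconds) uniq info)))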

and then use this around your normal dispatcher chain.
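
As a rough sketch of what wrapping a chain might look like, assuming a current Racket with the standard web-server collection: the variant below uses dynamic-wind so that the "Done" line is also written when the inner dispatcher escapes with an exception. The hello servlet, the one-entry chain, the start-server helper, and port 8080 are placeholders, not part of the original application.

```racket
#lang racket/base
(require net/url
         web-server/http
         web-server/web-server
         (prefix-in lift: web-server/dispatchers/dispatch-lift)
         (prefix-in seq: web-server/dispatchers/dispatch-sequencer))

;; Hypothetical log destination; any output port works.
(define the-log (current-error-port))

;; Variant of the logging wrapper: dynamic-wind runs the "Done" step
;; even if inner-dispatch raises an exception.
(define (make-logging-wrap inner-dispatch)
  (lambda (conn req)
    (define uniq (gensym))
    (define uri (url->string (request-uri req)))
    (dynamic-wind
     (lambda ()
       (fprintf the-log "~a: ~a: Starting to handle ~a\n"
                (current-seconds) uniq uri))
     (lambda () (inner-dispatch conn req))
     (lambda ()
       (fprintf the-log "~a: ~a: Done handling ~a\n"
                (current-seconds) uniq uri)
       (flush-output the-log)))))

;; Placeholder application: answers every request with a fixed page.
(define (hello req)
  (response/xexpr '(html (body "hello"))))

;; Stand-in for the "normal dispatcher chain": a sequencer with one entry.
(define chain (seq:make (lift:make hello)))

;; Call (start-server) to begin listening; the result of serve is a
;; thunk that shuts the server down.
(define (start-server [port 8080])
  (serve #:dispatch (make-logging-wrap chain) #:port port))
```

Wrapping the whole chain, rather than each dispatcher in it, gives one Starting/Done pair per request; a Starting line with no matching Done line then marks a request that never finished.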

The main thing that you wouldn't be able to get at is whether the
connections are being dropped by the OS or if something in
mzlib/thread's run-server isn't working.

Jay

On Wed, Sep 7, 2011 at 4:38 AM, Kathi Fisler <kfisler at gmail.com> wrote:
> Following up on this -- what's the max number of open file descriptors that
> Racket allows? We're seeing some lingering ones (trying to trace the source,
> but wondering if this is the problem).
>
> We are also getting a lot of ssl error reports of the forms
>
> Connection error: ssl-accept/enable-break: accept failed (error:1407609C:SSL
> routines:SSL23_GET_CLIENT_HELLO:http request)
>
> Connection error: ssl-accept/enable-break: accept failed (input terminated
> prematurely)
>  === context ===
> /home/turnin/plt-5.1.1/lib/racket/collects/openssl/mzssl.rkt:919:10: loop
> /home/turnin/plt-5.1.1/lib/racket/collects/mzlib/thread.rkt:69:12: loop
>
> We're not doing any manual SSL operations (just the ssl settings to the web
> server startup).  Still, are these traceable to something in our code?
>
> thanks,
> Kathi
>
> On Tue, Sep 6, 2011 at 8:13 PM, Kathi Fisler <kfisler at gmail.com> wrote:
>> Our homework submission application, currently running under 5.1.1
>> with Jay's stateless infrastructure, has been going unresponsive on us
>> a lot lately.  In some cases, the process is still alive but the web
>> server does not respond to requests.  In some cases, the process has
>> died.   While this sometimes happens concurrently with high load, load
>> is neither necessary nor sufficient.
>>
>> Are there commands we can use when we startup racket or the server
>> that might give diagnostics to help trace the problem?
>>
>> We are not getting core dumps (even when the process dies).  So far,
>> monitoring top output hasn't revealed anything unusual or telling.
>>
>> Here's the uname -a listing, in case that is helpful:
>>
>> Linux turnin.cs.wpi.edu 2.6.18-238.12.1.el5 #1 SMP Tue May 31 13:22:04
>> EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
>>
>> thanks,
>> Kathi
>>



-- 
Jay McCarthy <jay at cs.byu.edu>
Assistant Professor / Brigham Young University
http://faculty.cs.byu.edu/~jay

"The glory of God is Intelligence" - D&C 93



Posted on the dev mailing list.