[racket] can custodian-shutdown-all interrupt an #:atomic callback?

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Sat Sep 10 15:44:35 EDT 2011

At Sat, 10 Sep 2011 12:00:19 -0700, John Clements wrote:
> 
> On Sep 9, 2011, at 5:59 PM, Matthew Flatt wrote:
> 
> > At Fri, 9 Sep 2011 17:39:38 -0700, John Clements wrote:
> >> It looks to me as though when a custodian-shutdown-all occurs, the
> >> custodian actions associated using scheme_add_managed() can be
> >> triggered while another thread is in the middle of a callback to an
> >> FFI-generated callback that includes the #:atomic declaration. Is
> >> this true? If true, is it only true for some certain set of
> >> (avoidable) circumstances?
> > 
> > The possibility that comes to mind is a custodian shutdown triggered by
> > a memory-use limit. It looks easy and sensible to disallow that.
> > 
> > Could that be what you saw, or should I look for other possibilities?
> 
> I believe it was a custodian-shutdown-all on a user thread triggered by 
> clicking "run" in DrRacket.
> 
> However, looking more carefully at the stack trace below, I think I've misread 
> it.
> 
> This dump came from a DrRacket hang (hence the SIGQUIT), and based on my 
> reading of Apple CoreAudio documentation, it looks to me like the call in 
> frame 11 of thread 0 to > is waiting on a mutex which I'm guessing 
> is held by thread 10, which I believe is the main audio processing thread.  
> The action in thread 0 is triggered by a custodian action, as shown by frame 
> 27 of that thread.
> 
> Thread 10 is blocked, though. I initially believed it was in the middle of a 
> callback, but looking more carefully, it now seems to me that it was blocked 
> on a queue_callback, which surprises me, because I didn't think that a 
> queue-callback could block. I suppose there has to be *some* synchronization 
> in there, though.
>
> If I'm reading this correctly, then the custodian-ness could be a red
> herring; it could be that any racket thread calling CloseStream or
> another audio function could cause this problem. If queue_callback
> can block, I'm not sure how to resolve this.

To double-check my understanding:

 * A callback uses `#:async-apply' and is invoked on OS thread 10,
   which is waiting to send the callback to OS thread 0 where it can
   run.

 * OS thread 0 is busy trying to shut down the audio process that is
   locked by OS thread 10 during the callback (so it's stuck and not
   checking for any queued callbacks to run).

Without knowing more about the API, my only idea is that maybe the
close function could run in its own OS-level thread, although that
sounds difficult to set up.
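
To make the two bullets and the separate-thread idea concrete, here is a
minimal model in Python. It is only a sketch of the shape of the problem:
it does not use Racket, the FFI, or CoreAudio, and every name in it
(`simulate`, `close_stream`, `stream_lock`) is hypothetical. A
`threading.Lock` stands in for whatever mutex the audio library holds
during a callback, a `queue.Queue` stands in for the `#:async-apply'
queue, and a timeout stands in for "hangs forever" so the model can
report the result instead of actually hanging.

```python
import queue
import threading

TIMEOUT = 0.5  # seconds; stands in for "blocked indefinitely" in this model


def simulate(close_on_helper_thread):
    """Return True if the simulated close completes, False if it gets stuck."""
    callbacks = queue.Queue()        # models the queue of async callbacks
    stream_lock = threading.Lock()   # models the audio library's mutex
    in_callback = threading.Event()  # set once the audio thread holds the lock
    closed = threading.Event()       # set when the close call succeeds

    def audio_thread():
        # Models "OS thread 10": holds the lock for the whole callback and
        # waits until the main thread has run the queued callback.
        with stream_lock:
            done = threading.Event()
            callbacks.put(done.set)
            in_callback.set()
            done.wait()

    def close_stream():
        # Hypothetical blocking close: needs the lock the audio thread holds.
        if stream_lock.acquire(timeout=TIMEOUT):
            stream_lock.release()
            closed.set()

    audio = threading.Thread(target=audio_thread)
    audio.start()
    in_callback.wait()

    # From here on, this thread models "OS thread 0", the only thread that
    # is allowed to run queued callbacks.
    helper = None
    if close_on_helper_thread:
        # The close call blocks on its own thread; this thread stays free.
        helper = threading.Thread(target=close_stream)
        helper.start()
    else:
        # The close call blocks here, so the queue is not serviced until
        # the call gives up.
        close_stream()

    # Service the queued callback, which lets the audio thread finish.
    callbacks.get(timeout=TIMEOUT)()
    audio.join()
    if helper is not None:
        helper.join()
    return closed.is_set()


if __name__ == "__main__":
    print("close on main thread completes:  ", simulate(False))
    print("close on helper thread completes:", simulate(True))
```

In the model, closing from the thread that services the queue never
succeeds, while closing from a helper thread does, because the queue
keeps being serviced in the meantime. Whether the real close function
can safely be called from another OS-level thread is a separate
question that depends on the audio API.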



Posted on the users mailing list.