[racket-dev] posting to semaphore from C causes seg fault

From: John Clements (clements at brinckerhoff.org)
Date: Wed Sep 14 03:14:33 EDT 2011

I'm unable to pass a semaphore to C and post to it from there: the post causes a seg fault. I'm testing the Scheme_Object * with SCHEME_SEMAP, so I'm pretty sure it's a semaphore. I can watch this happen in gdb, but the code is optimized, so it's hard to see exactly where it's failing. The crash and the semaphore object look like this in gdb:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000008
[Switching to process 1825]
scheme_post_sema (o=0x104a14668) at sema.c:284
284	
(gdb) l    
279	
280	void scheme_post_sema(Scheme_Object *o)
281	{
282	  Scheme_Sema *t = (Scheme_Sema *)o;
283	  int v, consumed;
284	
285	  if (t->value < 0) return;
286	
287	  v = t->value + 1;
288	  if (v > t->value) {
(gdb) p t
$1 = (Scheme_Sema *) 0x104a14668
(gdb) p t->value
$2 = 0
(gdb) p v
Unable to access variable "v"
$5 = <variable optimized away by compiler>
(gdb) p *t
$6 = {
  so = {
    type = 78, 
    keyex = 0
  }, 
  first = 0x0, 
  last = 0x0, 
  value = 0
}

The strange thing here is that the C code for scheme_post_sema suggests that when t->first is 0x0, it should just silently return. Okay, so I dug into the assembly a bit more, and it turns out that the compiled version of this code looks like this:

Dump of assembler code for function scheme_post_sema:
0x000000010020e0d0 <scheme_post_sema+0>:	push   %rbp
0x000000010020e0d1 <scheme_post_sema+1>:	mov    %rsp,%rbp
0x000000010020e0d4 <scheme_post_sema+4>:	push   %r14
0x000000010020e0d6 <scheme_post_sema+6>:	push   %r13
0x000000010020e0d8 <scheme_post_sema+8>:	push   %r12
0x000000010020e0da <scheme_post_sema+10>:	push   %rbx
0x000000010020e0db <scheme_post_sema+11>:	sub    $0x30,%rsp
0x000000010020e0df <scheme_post_sema+15>:	mov    %rdi,-0x28(%rbp)
0x000000010020e0e3 <scheme_get_thread_local_variables+0>:	lea    0x104cce(%rip),%r13        # 0x100312db8 <scheme_thread_local_offset>
0x000000010020e0ea <scheme_get_thread_local_variables+7>:	mov    0x0(%r13),%edx
0x000000010020e0ee <scheme_get_thread_local_variables+11>:	lea    0x12434b(%rip),%r14        # 0x100332440 <scheme_thread_local_key>
0x000000010020e0f5 <scheme_get_thread_local_variables+18>:	mov    (%r14),%eax
0x000000010020e0f8 <scheme_get_thread_local_variables+21>:	addr32 mov %gs:(%edx,%eax,8),%rdx
-- IT CRASHES ON THIS NEXT INSTRUCTION: --
0x000000010020e0fe <scheme_post_sema+46>:	mov    0x8(%rdx),%rax
0x000000010020e102 <scheme_post_sema+50>:	mov    %rax,-0x50(%rbp)
0x000000010020e106 <scheme_post_sema+54>:	lea    -0x50(%rbp),%rax
0x000000010020e10a <scheme_post_sema+58>:	mov    %rax,0x8(%rdx)
0x000000010020e10e <scheme_post_sema+62>:	lea    -0x28(%rbp),%rax
0x000000010020e112 <scheme_post_sema+66>:	mov    %rax,-0x40(%rbp)
0x000000010020e116 <scheme_post_sema+70>:	mov    0x18(%rdi),%rdx
0x000000010020e11a <scheme_post_sema+74>:	test   %rdx,%rdx

The problem with the marked instruction is that %rdx is 0, so the load from offset 8 reads address 0x8 and seg faults, which matches the KERN_INVALID_ADDRESS address in the crash report above.
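
To spell out the arithmetic, here is a minimal sketch in C. The struct is hypothetical: it only stands in for whatever per-thread record %rdx was supposed to point at, and is not Racket's actual layout. The point is just that a field at offset 8 read through a null base lands on address 0x8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT Racket's real structure: two pointer-sized
   slots, so the second one sits at offset 8 on an LP64 target such as
   x86-64. */
struct thread_locals_sketch {
    void *slot0;
    void *slot1;
};

/* Address that a load like `mov 0x8(%rdx),%rax` touches for a given
   base value in %rdx. Computed with integers, so nothing is actually
   dereferenced. */
static uintptr_t faulting_address(uintptr_t base)
{
    return base + offsetof(struct thread_locals_sketch, slot1);
}

/* Sanity check: with a null base, the touched address equals the
   field offset itself. */
static void check_null_base(void)
{
    assert(faulting_address(0) ==
           offsetof(struct thread_locals_sketch, slot1));
}
```

Calling faulting_address(0) on a 64-bit build gives the same small address that the kernel reported, which is why a fault at 0x8 usually points to a null base pointer rather than a corrupt one.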

The gdb info makes it look as though this is an inlining of a function called scheme_get_thread_local_variables, though I can't see why it would be called here; the C code looks like it should just increment the counter and return.
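
One way %rdx could end up 0 after a thread-local lookup, offered only as a guess: the lookup runs on a thread that never had the value set. The sketch below shows that behavior with plain POSIX thread-specific data. It assumes nothing about Racket internals; locals_key is a hypothetical stand-in for the key the disassembly names, and fresh_thread_sees_null is an illustrative helper, not an existing function.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a runtime's thread-local key. */
static pthread_key_t locals_key;

/* Thread body: report what this thread sees in the slot. */
static void *read_slot(void *unused)
{
    (void)unused;
    return pthread_getspecific(locals_key);
}

/* Returns 1 if the calling thread sees the value it stored while a
   newly created thread sees NULL, 0 if not, and -1 on a pthread
   error. */
static int fresh_thread_sees_null(void)
{
    static int main_locals;      /* dummy per-thread record */
    void *seen = &main_locals;   /* overwritten by pthread_join */
    pthread_t t;

    if (pthread_key_create(&locals_key, NULL) != 0) return -1;
    if (pthread_setspecific(locals_key, &main_locals) != 0) return -1;
    if (pthread_create(&t, NULL, read_slot, NULL) != 0) return -1;
    if (pthread_join(t, &seen) != 0) return -1;

    return pthread_getspecific(locals_key) == &main_locals
           && seen == NULL;
}
```

If something like this is what is going on, code that takes the lookup result and immediately reads a field at offset 8 would fault exactly as shown above. Whether the post is actually happening on a different thread is something I have not verified.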

This is completely reproducible, so I'm happy to carry out any experiments; at this point, I'm at the throwing-up-my-hands, "compiler bug?" stage.

Many thanks for any suggestions,

John

Posted on the dev mailing list.