[racket] segmentation fault while compiling mzscheme 4.2.5 on Xen environment

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Fri Jun 25 13:10:01 EDT 2010

At Fri, 18 Jun 2010 15:10:46 -0400, Danny Yoo wrote:
> I'm seeing the following error when trying to install mzscheme on a
> Xen-hosted virtual environment.

As far as I can tell, the problem lies in how Xen and the OS handle
thread-local variables.

The crash happens on the last instruction in the following sequence:

 0x080c3080 <scheme_jit_longjmp+0>:	push   %ebp
 0x080c3081 <scheme_jit_longjmp+1>:	mov    %esp,%ebp
 0x080c3083 <scheme_jit_longjmp+3>:	push   %edi
 0x080c3084 <scheme_jit_longjmp+4>:	push   %esi
 0x080c3085 <scheme_jit_longjmp+5>:	push   %ebx
 0x080c3086 <scheme_jit_longjmp+6>:	sub    $0xc,%esp
 0x080c3089 <scheme_jit_longjmp+9>:	mov    0x8(%ebp),%eax
 0x080c308c <scheme_jit_longjmp+12>:	mov    0x81ec174,%ebx
 0x080c3092 <scheme_jit_longjmp+18>:	mov    0x9c(%eax),%edi
 0x080c3098 <scheme_jit_longjmp+24>:	mov    %gs:0x2d8(%ebx),%eax
 0x080c309f <scheme_jit_longjmp+31>:	test   %eax,%eax
 0x080c30a1 <scheme_jit_longjmp+33>:	je     0x80c30f8
 0x080c30a3 <scheme_jit_longjmp+35>:	lea    0xd(%eax),%edx
 0x080c30a6 <scheme_jit_longjmp+38>:	mov    %gs:0x0,%esi
 0x080c30ad <scheme_jit_longjmp+45>:	mov    %edx,%eax
 0x080c30af <scheme_jit_longjmp+47>:	shl    $0x4,%eax
 0x080c30b2 <scheme_jit_longjmp+50>:	mov    %gs:0xc(%eax,%ebx,1),%ecx

That instruction is the access

 stack_cache_stack[stack_cache_stack_pos].stack_frame

where both `stack_cache_stack' and `stack_cache_stack_pos' are
thread-local variables. Accessing the latter works fine (the
`%gs:0x2d8(%ebx)' load earlier in the sequence above), and it produces
the right value. So it makes no sense that indexing into the
statically allocated thread-local array would fail.
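
For anyone who wants to poke at this outside of mzscheme, here is a
minimal sketch of the same access pattern in plain C. It is not the
actual run-time source: the element layout (four pointer-sized fields
with `stack_frame' last) is only an assumption that matches the
`shl $0x4' and the 0xc offset in the disassembly on 32-bit x86, and
`CACHE_SIZE', `cache_elem', `push_frame', and `pop_frames_below' are
made-up names. `__thread' is the GCC spelling for thread-local storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SIZE 32  /* hypothetical capacity, not taken from mzscheme */

/* Hypothetical element layout: four pointer-sized fields, so 16 bytes
   on 32-bit x86, with stack_frame at offset 0xc. */
typedef struct {
  void *orig_return_address;
  void *orig_result;
  void *cache;
  void *stack_frame;
} cache_elem;

/* Both the array and the index are thread-local (GCC/Clang extension). */
static __thread cache_elem stack_cache_stack[CACHE_SIZE];
static __thread long stack_cache_stack_pos = 0;

/* Record a frame address in the next slot; slot 0 is never used.
   Returns 0 on success, -1 if the array is full. */
static int push_frame(void *frame) {
  if (stack_cache_stack_pos + 1 >= CACHE_SIZE) return -1;
  stack_cache_stack_pos++;
  stack_cache_stack[stack_cache_stack_pos].stack_frame = frame;
  return 0;
}

/* Drop entries from the top while their frame address is below limit.
   The loop condition reads the thread-local index first and then
   indexes the thread-local array, which is the access that crashes
   in the sequence above. Returns the number of entries dropped. */
static int pop_frames_below(uintptr_t limit) {
  int popped = 0;
  assert(stack_cache_stack_pos < CACHE_SIZE);
  while (stack_cache_stack_pos
         && (uintptr_t)stack_cache_stack[stack_cache_stack_pos].stack_frame < limit) {
    --stack_cache_stack_pos;
    popped++;
  }
  return popped;
}
```

Calling `push_frame' a few times and then `pop_frames_below' from a
small `main' exercises both kinds of thread-local access. If that also
crashes in the Xen guest, the problem would seem to be independent of
mzscheme.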

Google turns up Xen-related advice about changing "/lib/tls" to
"/lib/tls.disabled", other pages about how you shouldn't have to do
that anymore, and so on. For now, I don't have any explanation for the
behavior that you're seeing other than the possibility that something
about thread-local storage doesn't work right at the Xen/OS level.
Advice from anyone who knows more would be welcome.

Meanwhile, configure with `--disable-futures' to disable the use of
thread-local variables in the run-time system; with that setting, the
build seems to work OK.



Posted on the users mailing list.