[racket] debugging core dump - comments appreciated

From: Neil Van Dyke (neil at neilvandyke.org)
Date: Mon May 23 14:51:19 EDT 2011

We just got a native stack trace from core dump of a large-ish but 
short-lived PLT 4.2.5 CGC program that crashes near startup time in rare 
situations.

Before I start really debugging this, I wanted to send out a request 
that someone familiar with PLT internals take a look at the stack trace 
when they get a chance, and let me know if they had any comments.

The process is  Web CGI program that is given almost identical input 
each time.  I think the only differences in input would be in things 
that vary naturally, like timestamps and TCP port numbers.  We have no 
evidence of exhausted memory or other resource.  We were running as CGI 
to rule out possible causes from SCGI and long-running processes.

My first two guesses are that it could be that the application's code is 
doing something unusual[*] that triggers an intermittent bug in PLT, or 
that somehow our C extension is messing up memory this early in the 
process lifetime.  My first suspicion as to unusual is the large 
spaghetti of "require"s, and an "eval" that does its own overlapping 
"require"s, possibly at module load time.  (If this does indeed sound 
like a likely problem, I'd be happy to change it.)

Stack trace below.  My edits are in double square brackets and blank 
lines.  The stack itself looks uncorrupted to me.  What's apparent to 
someone who doesn't know PLT internals is that we get to an 
"integer_to_char" in "scheme_do_eval", which attempts an allocation and 
then barfs in a recursive "GC_clear_stack_inner".

I'm going to start looking in the Scheme and C extension code now for 
everywhere "integer_to_char" might be happening early on in application 
code, but will probably have to come back to PLT internals, so comments 
appreciated.

The first PLT internals question I expect to look into is whether, in 
the stack tace, "require"s are still being processed (by which I mean 
evaluation has not yet proceeded into the non-"require" body of the main 
Scheme file).


---- BEGIN STACK TRACE ----
Core was generated by `/usr/local/plt-4.2.5/bin/mzschemecgc -jqr 
[[PATH_TO_SCHEME_FILE]]'.
Program terminated with signal 11, Segmentation fault.
[[...]]

#0  0x00000000005655b6 in GC_clear_stack_inner (arg=0x0, 
limit=0x7fff2dd5ce30 <Address 0x7fff2dd5ce30 out of bounds>) at ./misc.c:243
#1  0x00000000005655cb in GC_clear_stack_inner (arg=0x0, 
limit=0x7fff2dd5ce30 <Address 0x7fff2dd5ce30 out of bounds>) at ./misc.c:245
#2  0x00000000005655cb in GC_clear_stack_inner (arg=0x0, 
limit=0x7fff2dd5ce30 <Address 0x7fff2dd5ce30 out of bounds>) at ./misc.c:245
[[...SAME...]]
#7  0x00000000005655cb in GC_clear_stack_inner (arg=0x0, 
limit=0x7fff2dd5ce30 <Address 0x7fff2dd5ce30 out of bounds>) at ./misc.c:245
#8  0x00000000005655cb in GC_clear_stack_inner (arg=0x0, 
limit=0x7fff2dd5ce30 <Address 0x7fff2dd5ce30 out of bounds>) at ./misc.c:245
#9  0x00000000005655cb in GC_clear_stack_inner (arg=0x0, 
limit=0x7fff2dd5ce30 <Address 0x7fff2dd5ce30 out of bounds>) at ./misc.c:245

#10 0x000000000056564b in GC_clear_stack (arg=0x0) at ./misc.c:291
#11 0x000000000056db58 in GC_generic_malloc_many (lb=2, k=0, 
result=0x7e7780) at ./mallocx.c:421
#12 0x000000000056e3f5 in GC_malloc_atomic (bytes=16) at 
./thread_local_alloc.c:219

#13 0x000000000055ee7a in scheme_make_char (ch=12985) at ./char.c:146
#14 0x00000000005606a4 in integer_to_char (argc=1, argv=0x7fff2c5b7028) 
at ./char.c:248

#15 0x00000000004452b8 in scheme_do_eval (obj=0x909110, num_rands=1, 
rands=0x7fff2c5b7028, get_value=1) at ./eval.c:8825
#16 0x0000000000444bac in scheme_do_eval (obj=0x908ea0, num_rands=<value 
optimized out>, rands=0x7fff2c5b7030, get_value=1) at ./eval.c:9416
#17 0x00000000004449ac in scheme_do_eval (obj=0x7fff2c5a6ba0, 
num_rands=4, rands=0x7fff2c5b7010, get_value=1) at ./eval.c:9538

#18 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, num_rands=<value 
optimized out>, rands=0x7fff2c5b7058, get_value=1) at ./eval.c:9509
#19 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, num_rands=<value 
optimized out>, rands=0x7fff2c5b7098, get_value=1) at ./eval.c:9509
#20 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, num_rands=<value 
optimized out>, rands=0x7fff2c5b70d8, get_value=1) at ./eval.c:9509
[[...SAME...]]
#173 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, 
num_rands=<value optimized out>, rands=0x7fff2c5b8dc8, get_value=1) at 
./eval.c:9509
#174 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, 
num_rands=<value optimized out>, rands=0x7fff2c5b8e00, get_value=1) at 
./eval.c:9509
#175 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, 
num_rands=<value optimized out>, rands=0x7fff2c5b8e40, get_value=-1) at 
./eval.c:9509

#176 0x000000000044218a in scheme_enlarge_runstack (size=<value 
optimized out>, k=0x4464d0 <do_eval_k>) at ./eval.c:660
#177 0x000000000044546f in scheme_do_eval (obj=0x7fff2c52a7e0, 
num_rands=<value optimized out>, rands=0x7fff2f8985a0, get_value=1) at 
./eval.c:8852

#178 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, 
num_rands=<value optimized out>, rands=0x7fff3211b0b8, get_value=1) at 
./eval.c:9509
#179 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, 
num_rands=<value optimized out>, rands=0x7fff3211b0f8, get_value=1) at 
./eval.c:9509
#180 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, 
num_rands=<value optimized out>, rands=0x7fff3211b118, get_value=1) at 
./eval.c:9509
[[...SAME...]]
#338 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, 
num_rands=<value optimized out>, rands=0x7fff3211ce20, get_value=1) at 
./eval.c:9509
#339 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, 
num_rands=<value optimized out>, rands=0x7fff3211ce58, get_value=1) at 
./eval.c:9509
#340 0x0000000000444af7 in scheme_do_eval (obj=0x900dd0, 
num_rands=<value optimized out>, rands=0x7fff3211ce98, get_value=1) at 
./eval.c:9509

#341 0x0000000000444bac in scheme_do_eval (obj=0x7fff2c5910a0, 
num_rands=<value optimized out>, rands=0x7fff3211ced8, get_value=1) at 
./eval.c:9416
#342 0x0000000000444782 in scheme_do_eval (obj=0x7fff2c5a49c0, 
num_rands=-1, rands=0x0, get_value=1) at ./eval.c:9724
#343 0x0000000000444aac in scheme_do_eval (obj=0x7fff2c593820, 
num_rands=<value optimized out>, rands=0x7fff3211cf00, get_value=1) at 
./eval.c:9484
#344 0x0000000000444bac in scheme_do_eval (obj=0x7fff2c58a990, 
num_rands=<value optimized out>, rands=0x7fff3211cf10, get_value=-1) at 
./eval.c:9416
#345 0x00000000005436b3 in define_execute_with_dynamic_state 
(vec=0x7fff2f827560, delta=1, defmacro=0, rp=0x0, dm_env=0x0, 
dyn_state=0x0) at ./syntax.c:710
#346 0x0000000000444d3d in scheme_do_eval (obj=0x7fff2f827300, 
num_rands=-1, rands=0x0, get_value=-1) at ./eval.c:9236
#347 0x00000000004452cd in scheme_do_eval (obj=0x7fff2c5a5fc0, 
num_rands=0, rands=0x0, get_value=-1) at ./eval.c:9171
#348 0x000000000045e4c7 in scheme_finish_apply_for_prompt (prompt=<value 
optimized out>, _prompt_tag=<value optimized out>, proc=0x7fff2c5a5fc0, 
argc=0, argv=0x0) at ./fun.c:6039
#349 0x000000000045e5a1 in scheme_apply_for_prompt 
(prompt=0x7fff2dd5ca20, prompt_tag=0x0, proc=0x6a8, argc=768989080, 
argv=0x0) at ./fun.c:6119
#350 0x0000000000467a10 in call_with_prompt (in_argc=<value optimized 
out>, in_argv=<value optimized out>) at ./fun.c:6444
#351 0x00000000004452b8 in scheme_do_eval (obj=0x900350, num_rands=3, 
rands=0x7fff2dd6fef0, get_value=-1) at ./eval.c:8825
#352 0x0000000000464f09 in do_call_with_prompt (f=<value optimized out>, 
data=<value optimized out>, multi=1, top_level=0) at ./fun.c:6631

#353 0x00000000004af298 in scheme_module_run_finish 
(menv=0x7fff2f756b40, env=0x7fff3210bb40) at ./module.c:4576
#354 0x000000000047ac45 in scheme_module_run_start (menv=0x7fff2dd5ca20, 
env=0x0, name=0x7fff2c58ba80) at ./jit.c:12868
#355 0x00000000004b361f in start_module (m=0x7fff2f7b7880, 
env=0x7fff3210bb40, restart=0, syntax_idx=0x7fff2fc3e780, eval_exp=-1, 
eval_run=1, base_phase=0, cycle_list=0x7fff2c572a40)
    at ./module.c:4381
#356 0x00000000004b55a6 in scheme_module_start_finish (a=<value 
optimized out>) at ./module.c:3968
#357 0x000000000047abe5 in scheme_module_start_start (a=0x7fff2dd5ca20, 
name=0x7fff2c572a60) at ./jit.c:12886
#358 0x00000000004b33ea in start_module (m=0x7fff2fc3b770, 
env=0x7fff3210bb40, restart=0, syntax_idx=0x7fff30066960, eval_exp=-1, 
eval_run=1, base_phase=<value optimized out>,
    cycle_list=0x7fff2c572f00) at ./module.c:4069
#359 0x00000000004b55a6 in scheme_module_start_finish (a=<value 
optimized out>) at ./module.c:3968
#360 0x000000000047abe5 in scheme_module_start_start (a=0x7fff2dd5ca20, 
name=0x7fff2c572f20) at ./jit.c:12886
#361 0x00000000004b33ea in start_module (m=0x7fff3060dee0, 
env=0x7fff3210bb40, restart=0, syntax_idx=0x7fff2f73bba0, eval_exp=-1, 
eval_run=1, base_phase=<value optimized out>,
    cycle_list=0x7fff2c556f40) at ./module.c:4069
#362 0x00000000004b55a6 in scheme_module_start_finish (a=<value 
optimized out>) at ./module.c:3968
#363 0x000000000047abe5 in scheme_module_start_start (a=0x7fff2dd5ca20, 
name=0x7fff2c556f60) at ./jit.c:12886
#364 0x00000000004b33ea in start_module (m=0x7fff309fabb0, 
env=0x7fff3210bb40, restart=0, syntax_idx=0x7fff2fca0e40, eval_exp=-1, 
eval_run=1, base_phase=<value optimized out>,
    cycle_list=0x7fff2c556fc0) at ./module.c:4069
#365 0x00000000004b55a6 in scheme_module_start_finish (a=<value 
optimized out>) at ./module.c:3968
#366 0x000000000047abe5 in scheme_module_start_start (a=0x7fff2dd5ca20, 
name=0x7fff2c556fe0) at ./jit.c:12886
#367 0x00000000004b33ea in start_module (m=0x7fff2fb0b110, 
env=0x7fff3210bb40, restart=0, syntax_idx=0x7fff2fb1bb70, eval_exp=-1, 
eval_run=1, base_phase=<value optimized out>,
    cycle_list=0x7fff2c54ae80) at ./module.c:4069
#368 0x00000000004b55a6 in scheme_module_start_finish (a=<value 
optimized out>) at ./module.c:3968
#369 0x000000000047abe5 in scheme_module_start_start (a=0x7fff2dd5ca20, 
name=0x7fff2c54aea0) at ./jit.c:12886
#370 0x00000000004b33ea in start_module (m=0x7fff2ffcd330, 
env=0x7fff3210bb40, restart=0, syntax_idx=0x7fff2c5316f0, eval_exp=-1, 
eval_run=1, base_phase=<value optimized out>, cycle_list=0x7e7ad8)
    at ./module.c:4069

#371 0x00000000004b3a4e in parse_requires (form=0x7fff3017a380, 
base_modidx=0x7e79f0, main_env=0x7fff3210bb40, for_m=<value optimized 
out>, rn_set=0x7fff2c531720, post_ex_rn_set=0x7fff2c531720,
    ck=0x4b0f90 <check_dup_require>, data=0x0, redef_modname=0x0, 
unpack_kern=1, copy_vars=0, can_save_marshal=0, eval_exp=0, eval_run=0, 
all_simple=0x0) at ./module.c:9526
#372 0x00000000004b4e1e in do_require_execute (env=0x7fff3210bb40, 
form=0x7fff3017a380) at ./module.c:9668
#373 0x00000000004c00f9 in top_level_require_execute (data=<value 
optimized out>) at ./module.c:9684

#374 0x0000000000444d3d in scheme_do_eval (obj=0x7fff2c548480, 
num_rands=-1, rands=0x0, get_value=-1) at ./eval.c:9236
#375 0x0000000000445ec9 in eval_k () at ./eval.c:9942
#376 0x00000000004452cd in scheme_do_eval (obj=0x7fff2c5317b0, 
num_rands=0, rands=0x0, get_value=-1) at ./eval.c:9171
#377 0x000000000045e4c7 in scheme_finish_apply_for_prompt (prompt=<value 
optimized out>, _prompt_tag=<value optimized out>, proc=0x7fff2c5317b0, 
argc=0, argv=0x0) at ./fun.c:6039
#378 0x000000000045e5a1 in scheme_apply_for_prompt 
(prompt=0x7fff2dd5ca20, prompt_tag=0x0, proc=0x6a8, argc=768989080, 
argv=0x0) at ./fun.c:6119
#379 0x0000000000467a10 in call_with_prompt (in_argc=<value optimized 
out>, in_argv=<value optimized out>) at ./fun.c:6444
#380 0x00000000004452b8 in scheme_do_eval (obj=0x900350, num_rands=3, 
rands=0x7fff2dd70d50, get_value=-1) at ./eval.c:8825
#381 0x0000000000464f09 in do_call_with_prompt (f=<value optimized out>, 
data=<value optimized out>, multi=1, top_level=0) at ./fun.c:6631
#382 0x000000000044c646 in compile_k () at ./eval.c:9999
#383 0x00000000004621a4 in scheme_top_level_do_worker (k=0x44c1b0 
<compile_k>, eb=0, new_thread=0, dyn_state=0x0) at ./fun.c:2222
#384 0x0000000000446094 in scheme_default_eval_handler (argc=<value 
optimized out>, argv=0x7fff30042be0) at ./eval.c:10286
#385 0x00000000004452b8 in scheme_do_eval (obj=0x90e9d0, num_rands=1, 
rands=0x7fff30042be0, get_value=-1) at ./eval.c:8825
#386 0x00000000004452cd in scheme_do_eval (obj=0x7fff2ffd6c00, 
num_rands=0, rands=0x0, get_value=-1) at ./eval.c:9171
#387 0x000000000045e4c7 in scheme_finish_apply_for_prompt (prompt=<value 
optimized out>, _prompt_tag=<value optimized out>, proc=0x7fff2ffd6c00, 
argc=0, argv=0x0) at ./fun.c:6039
#388 0x000000000045e5a1 in scheme_apply_for_prompt 
(prompt=0x7fff2dd5ca20, prompt_tag=0x0, proc=0x6a8, argc=768989080, 
argv=0x0) at ./fun.c:6119
#389 0x0000000000467a10 in call_with_prompt (in_argc=<value optimized 
out>, in_argv=<value optimized out>) at ./fun.c:6444
#390 0x00000000004452b8 in scheme_do_eval (obj=0x900350, num_rands=3, 
rands=0x7fff2dd71430, get_value=-1) at ./eval.c:8825
#391 0x0000000000464f09 in do_call_with_prompt (f=<value optimized out>, 
data=<value optimized out>, multi=1, top_level=0) at ./fun.c:6631
#392 0x00000000004e5df7 in do_load_handler (data=<value optimized out>) 
at ./portfun.c:4156
#393 0x000000000045c9f1 in scheme_dynamic_wind (pre=0, act=0x4e5cc0 
<do_load_handler>, post=0x4e5b70 <post_load_handler>, jmp_handler=0, 
data=0x7fff2ffd6d20) at ./fun.c:7937
#394 0x00000000004e5a96 in default_load (argc=<value optimized out>, 
argv=0x7fff2dd717a0) at ./portfun.c:4271
#395 0x00000000004452b8 in scheme_do_eval (obj=0x90eb40, num_rands=2, 
rands=0x7fff2dd717a0, get_value=-1) at ./eval.c:8825
#396 0x00000000004e2a42 in scheme_load_with_clrd (argc=<value optimized 
out>, argv=<value optimized out>, who=<value optimized out>, 
handler_param=10) at ./portfun.c:4311
#397 0x00000000004452b8 in scheme_do_eval (obj=0x7fff2ffcb030, 
num_rands=1, rands=0x7fff2dd71b00, get_value=-1) at ./eval.c:8825
#398 0x00000000004621a4 in scheme_top_level_do_worker (k=0x45a4d0 
<apply_k>, eb=1, new_thread=0, dyn_state=0x0) at ./fun.c:2222
#399 0x00000000004e274f in scheme_load (file=<value optimized out>) at 
./portfun.c:4424
#400 0x00000000004284cc in finish_cmd_line_run (fa=0x7fff320d6810, 
repl=0x428a30 <do_scheme_rep>) at ./cmdline.inc:201
#401 0x0000000000427625 in main_after_stack (data=<value optimized out>) 
at ./cmdline.inc:1137
#402 0x000000000050d69e in scheme_main_stack_setup 
(no_auto_statics=<value optimized out>, _main=0x427010 
<main_after_stack>, data=0x7fff2dd71dc0) at ./salloc.c:195
#403 0x0000000000426e3e in main (argc=<value optimized out>, argv=<value 
optimized out>) at ./main.c:268
---- END STACK TRACE ----

-- 
http://www.neilvandyke.org/



Posted on the users mailing list.