[racket-dev] Optional equality predicate for assoc and member

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Sun Apr 24 14:53:32 EDT 2011

Why is the turtle on the stack in the gcc version?

Robby

On Sun, Apr 24, 2011 at 9:44 AM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> I've pushed the change that makes `assoc', `assv', and `assq'
> implemented in Racket, including the new optional argument for `assoc'.
>
> At Fri, 22 Apr 2011 19:52:34 -0400, Eli Barzilay wrote:
>> 7 minutes ago, Matthew Flatt wrote:
>> >
>> > Switching to a Racket implementation costs time, but (with the JIT)
>> > it's not a huge difference --- similar to the cost of switching the C
>> > code from 32-bit to 64-bit mode on my machine. The JIT even manages to
>> > make `assq' slightly faster than C in 64-bit mode.
>>
>> Is there some obvious reason for the huge difference in improvement
>> between the 32 and the 64 bits (Almost twice slower and roughly the
>> same resp.)?
>
> Like Noel, I thought at first it might be memory-bound. Now I think
> it's just that the compiler on my machine (or the way the compiler is
> configured) handles x86_64 not as well as x86.
>
>> > The cost for a non-JIT platform is why I hadn't changed `assq',
>> > `assv', and `assoc' before. But a plain-Racket implementation is
>> > surely better in the long run, and we can keep the C code and bind
>> > functions conditionally when the JIT is disabled. Maybe we can
>> > eventually close the gap between the JIT-generated code and C. Also,
>> > I think the JIT could be smarter about `equal?', which could tilt
>> > the balance in favor of the JIT for `assoc'.
>>
>> (Maybe it'll be feasible to dump jit-generated machine code at some
>> point...?)
>
> Just so you can see the generated code? I think Sam's tool is on the
> right track. Otherwise, I'm not sure what you're suggesting.
>
>
>> > I'm happy that the compiler is doing its job, but inlining
>> > optimizations are fragile, so I'm a little reluctant to rely on
>> > them.  Sad but true: the macro approach feels more comfortable for
>> > now.
>>
>> If by "macro approach" you refer to what I was talking about, then
>> isn't that needed to have versions with inlined comparators?
>
> Every time I look into changing `map' or other functions into macros, I
> conclude that it would be better to implement cross-module inlining.
>
>
> At Fri, 22 Apr 2011 19:27:49 -0500, Robby Findler wrote:
>> What are the non-JIT platforms nowadays?
>
> Sparc and ARM, probably. I expect Sparc to fade away, and I imagine
> adding ARM support to the JIT, eventually.
>
> Currently, the Racket implementation of `assq', etc., is used always.
> I've thought about ways to fall back to the C implementation when the
> JIT is unavailable, but I haven't settled on anything yet.
>
>
>
> Below are revised results for the performance test. The old
> C-implemented `assq' is now `kernel:assq', and the Racket-implemented
> `assq2' is now `assq'.
>
> The improvements are from adjusting the Racket code slightly and
> improving the JIT in various small ways. There's more room for
> improvement, obviously, and I stopped only because I've had enough for
> now.
>
> For those who are *really* interested, I've included annotated,
> disassembled JIT output at the end of this message. The JIT-generated
> core loop and the gcc-generated core loop are very similar; there's
> four times as much JIT-generated code because it unrolls the loop to
> four copies. (Both versions test two elements of the list in each
> iteration, because the "turtle" to detect cycles is incremented only with
> the second test.)
>
> ----------------------------------------
> 32-bit run:
>
> 'kern:assq
> cpu time: 23 real time: 23 gc time: 0
> cpu time: 38 real time: 39 gc time: 0
> cpu time: 39 real time: 38 gc time: 0
> 'assq
> cpu time: 38 real time: 39 gc time: 0
> cpu time: 60 real time: 61 gc time: 0
> cpu time: 61 real time: 61 gc time: 0
>
> 'kern:assoc
> cpu time: 60 real time: 60 gc time: 0
> cpu time: 563 real time: 567 gc time: 0
> cpu time: 554 real time: 555 gc time: 0
> 'assoc
> cpu time: 74 real time: 74 gc time: 0
> cpu time: 589 real time: 592 gc time: 0
> cpu time: 582 real time: 584 gc time: 0
>
> ----------------------------------------
> 64-bit run:
>
> 'kern:assq
> cpu time: 36 real time: 36 gc time: 0
> cpu time: 117 real time: 117 gc time: 0
> cpu time: 118 real time: 118 gc time: 0
> 'assq
> cpu time: 39 real time: 39 gc time: 0
> cpu time: 66 real time: 65 gc time: 0
> cpu time: 65 real time: 65 gc time: 0
>
> 'kern:assoc
> cpu time: 81 real time: 81 gc time: 0
> cpu time: 860 real time: 860 gc time: 0
> cpu time: 798 real time: 798 gc time: 0
> 'assoc
> cpu time: 83 real time: 83 gc time: 0
> cpu time: 782 real time: 789 gc time: 0
> cpu time: 738 real time: 739 gc time: 0
>
>
> ----------------------------------------
>
> The main `assq' loop.
> Arguments: list, turtle; environment: value to find, original list.
>
> 0x0676707a:     mov    0xc(%ebx),%eax [get list off stack]
> 0x0676707d:     test   $0x1,%al
> 0x06767080:     jne    0x676756b      [to fail: fixnum instead of list]
> 0x06767086:     movswl (%eax),%ecx
> 0x06767089:     cmp    $0x34,%ecx
> 0x0676708f:     jne    0x676756b      [to fail: not a pair]
> 0x06767095:     mov    0x4(%eax),%eax [car => item]
> 0x06767098:     mov    %eax,-0x4(%ebx) [save item on stack]
> 0x0676709b:     add    $0xfffffffc,%ebx [adjust stack pointer]
> 0x0676709e:     test   $0x1,%al
> 0x067670a1:     jne    0x676752a      [to fail: item is fixnum]
> 0x067670a7:     movswl (%eax),%ecx
> 0x067670aa:     cmp    $0x34,%ecx
> 0x067670b0:     jne    0x676752a      [to fail: item is not a pair]
> 0x067670b6:     mov    0x4(%eax),%eax [car of item]
> 0x067670b9:     mov    %eax,%ecx
> 0x067670bb:     mov    0x8(%ebx),%eax [get key to find off stack]
> 0x067670be:     cmp    %ecx,%eax      [found key?]
> 0x067670c0:     jne    0x67670cd        [jump if not...]
> 0x067670c6:     mov    (%ebx),%eax    [yes... get item from stack]
> 0x067670c8:     jmp    0x6767525        [and return item]
> 0x067670cd:     mov    0x10(%ebx),%eax [no... get list from stack]
> 0x067670d0:     mov    0x8(%eax),%eax  [cdr of list]
> 0x067670d3:     mov    %eax,-0x4(%ebx) [save new list to stack]
> 0x067670d6:     add    $0xfffffffc,%ebx [adjust stack pointer]
> 0x067670d9:     mov    0x18(%ebx),%ecx [get turtle from stack]
> 0x067670dc:     cmp    %ecx,%eax       [(eq? turtle list)]
> 0x067670de:     jne    0x6767111         [normally jumps...]
> 0x067670e4:     mov    $0x67675c0,%eax [fail: found cycle...]
> 0x067670e9:     mov    (%eax),%eax       [... call error function ...]
> 0x067670eb:     mov    %eax,-0xc(%ebx)
> 0x067670ee:     mov    $0x67675c4,%eax
> 0x067670f3:     mov    (%eax),%eax
> 0x067670f5:     mov    %eax,-0x8(%ebx)
> 0x067670f8:     mov    0x10(%ebx),%eax
> 0x067670fb:     mov    %eax,-0x4(%ebx)
> 0x067670fe:     mov    $0x34540,%esi
> 0x06767103:     add    $0xfffffff4,%ebx
> 0x06767106:     mov    %ebx,0x44c(%edi)
> 0x0676710c:     jmp    0x376c90
> 0x06767111:     mov    (%ebx),%eax      [get list from stack... (redundant!)]
> 0x06767113:     test   $0x1,%al          [... starts another iteration of above]
> 0x06767116:     jne    0x67674e0         [... and there are 4 copies total]
> 0x0676711c:     movswl (%eax),%ecx
> 0x0676711f:     cmp    $0x34,%ecx
> 0x06767125:     jne    0x67674e0
> 0x0676712b:     mov    0x4(%eax),%eax
> 0x0676712e:     mov    %eax,-0x4(%ebx)
> 0x06767131:     add    $0xfffffffc,%ebx
> 0x06767134:     test   $0x1,%al
> 0x06767137:     jne    0x676749f
> 0x0676713d:     movswl (%eax),%ecx
> 0x06767140:     cmp    $0x34,%ecx
> 0x06767146:     jne    0x676749f
> 0x0676714c:     mov    0x4(%eax),%eax
> 0x0676714f:     mov    %eax,%ecx
> 0x06767151:     mov    0x10(%ebx),%eax
> 0x06767154:     cmp    %ecx,%eax
> 0x06767156:     jne    0x6767163
> 0x0676715c:     mov    (%ebx),%eax
> 0x0676715e:     jmp    0x676749a
> 0x06767163:     mov    0x1c(%ebx),%eax
> 0x06767166:     mov    0x8(%eax),%eax
> 0x06767169:     mov    %eax,-0x4(%ebx)
> 0x0676716c:     mov    0x4(%ebx),%eax
> 0x0676716f:     mov    0x8(%eax),%eax
> 0x06767172:     mov    %eax,-0x8(%ebx)
> 0x06767175:     add    $0xfffffff8,%ebx
> 0x06767178:     mov    0x4(%ebx),%ecx
> 0x0676717b:     cmp    %ecx,%eax
> 0x0676717d:     jne    0x67671b0
> 0x06767183:     mov    $0x67675c8,%eax
> 0x06767188:     mov    (%eax),%eax
> 0x0676718a:     mov    %eax,-0xc(%ebx)
> 0x0676718d:     mov    $0x67675cc,%eax
> 0x06767192:     mov    (%eax),%eax
> 0x06767194:     mov    %eax,-0x8(%ebx)
> 0x06767197:     mov    0x1c(%ebx),%eax
> 0x0676719a:     mov    %eax,-0x4(%ebx)
> 0x0676719d:     mov    $0x34540,%esi
> 0x067671a2:     add    $0xfffffff4,%ebx
> 0x067671a5:     mov    %ebx,0x44c(%edi)
> 0x067671ab:     jmp    0x376c90
> 0x067671b0:     mov    (%ebx),%eax       [get list from stack... (redundant!)]
> 0x067671b2:     test   $0x1,%al
> 0x067671b5:     jne    0x6767455
> 0x067671bb:     movswl (%eax),%ecx
> 0x067671be:     cmp    $0x34,%ecx
> 0x067671c4:     jne    0x6767455
> 0x067671ca:     mov    0x4(%eax),%eax
> 0x067671cd:     mov    %eax,-0x4(%ebx)
> 0x067671d0:     add    $0xfffffffc,%ebx
> 0x067671d3:     test   $0x1,%al
> 0x067671d6:     jne    0x6767414
> 0x067671dc:     movswl (%eax),%ecx
> 0x067671df:     cmp    $0x34,%ecx
> 0x067671e5:     jne    0x6767414
> 0x067671eb:     mov    0x4(%eax),%eax
> 0x067671ee:     mov    %eax,%ecx
> 0x067671f0:     mov    0x1c(%ebx),%eax
> 0x067671f3:     cmp    %ecx,%eax
> 0x067671f5:     jne    0x6767202
> 0x067671fb:     mov    (%ebx),%eax
> 0x067671fd:     jmp    0x676740f
> 0x06767202:     mov    0x4(%ebx),%eax
> 0x06767205:     mov    0x8(%eax),%eax
> 0x06767208:     mov    %eax,-0x4(%ebx)
> 0x0676720b:     add    $0xfffffffc,%ebx
> 0x0676720e:     mov    0xc(%ebx),%ecx
> 0x06767211:     cmp    %ecx,%eax
> 0x06767213:     jne    0x6767246
> 0x06767219:     mov    $0x67675d0,%eax
> 0x0676721e:     mov    (%eax),%eax
> 0x06767220:     mov    %eax,-0xc(%ebx)
> 0x06767223:     mov    $0x67675d4,%eax
> 0x06767228:     mov    (%eax),%eax
> 0x0676722a:     mov    %eax,-0x8(%ebx)
> 0x0676722d:     mov    0x24(%ebx),%eax
> 0x06767230:     mov    %eax,-0x4(%ebx)
> 0x06767233:     mov    $0x34540,%esi
> 0x06767238:     add    $0xfffffff4,%ebx
> 0x0676723b:     mov    %ebx,0x44c(%edi)
> 0x06767241:     jmp    0x376c90
> 0x06767246:     mov    (%ebx),%eax      [get list from stack... (redundant!)]
> 0x06767248:     test   $0x1,%al
> 0x0676724b:     jne    0x67673ca
> 0x06767251:     movswl (%eax),%ecx
> 0x06767254:     cmp    $0x34,%ecx
> 0x0676725a:     jne    0x67673ca
> 0x06767260:     mov    0x4(%eax),%eax
> 0x06767263:     mov    %eax,-0x4(%ebx)
> 0x06767266:     add    $0xfffffffc,%ebx
> 0x06767269:     test   $0x1,%al
> 0x0676726c:     jne    0x6767389
> 0x06767272:     movswl (%eax),%ecx
> 0x06767275:     cmp    $0x34,%ecx
> 0x0676727b:     jne    0x6767389
> 0x06767281:     mov    0x4(%eax),%eax
> 0x06767284:     mov    %eax,%ecx
> 0x06767286:     mov    0x24(%ebx),%eax
> 0x06767289:     cmp    %ecx,%eax
> 0x0676728b:     jne    0x6767298
> 0x06767291:     mov    (%ebx),%eax
> 0x06767293:     jmp    0x6767384
> 0x06767298:     mov    0x10(%ebx),%eax
> 0x0676729b:     mov    0x8(%eax),%eax
> 0x0676729e:     mov    %eax,-0x4(%ebx)
> 0x067672a1:     mov    0x4(%ebx),%eax
> 0x067672a4:     mov    0x8(%eax),%eax
> 0x067672a7:     mov    %eax,-0x8(%ebx)
> 0x067672aa:     add    $0xfffffff8,%ebx
> 0x067672ad:     mov    0x4(%ebx),%ecx
> 0x067672b0:     cmp    %ecx,%eax
> 0x067672b2:     jne    0x67672e5
> 0x067672b8:     mov    $0x67675d8,%eax
> 0x067672bd:     mov    (%eax),%eax
> 0x067672bf:     mov    %eax,-0xc(%ebx)
> 0x067672c2:     mov    $0x67675dc,%eax
> 0x067672c7:     mov    (%eax),%eax
> 0x067672c9:     mov    %eax,-0x8(%ebx)
> 0x067672cc:     mov    0x30(%ebx),%eax
> 0x067672cf:     mov    %eax,-0x4(%ebx)
> 0x067672d2:     mov    $0x34540,%esi
> 0x067672d7:     add    $0xfffffff4,%ebx
> 0x067672da:     mov    %ebx,0x44c(%edi)
> 0x067672e0:     jmp    0x376c90
> 0x067672e5:     mov    (%ebx),%eax      [get list from stack... (redundant!)]
> 0x067672e7:     mov    %eax,-0x8(%ebx)  [... save it(!) to prep recursive call]
> 0x067672ea:     mov    0x4(%ebx),%eax   [get turtle from stack]
> 0x067672ed:     add    $0xfffffff8,%ebx [adjust stack pointer]
> 0x067672f0:     lea    0x64(%edi),%edx  [check whether other threads needs to run...]
> 0x067672f3:     mov    (%edx),%edx
> 0x067672f5:     cmp    $0x0,%edx
> 0x067672fb:     jle    0x6767312        [to slow: for thread swap]
> 0x067672fd:     mov    -0x1c(%ebp),%edx [argv = end of orig argv (tail call)]
> 0x06767300:     add    $0xffffffec,%edx  [... minus orig argc for args in-place]
> 0x06767303:     mov    %eax,0x10(%edx)  [set turtle argument for recur]
> 0x06767306:     mov    (%ebx),%eax      [load list back]
> 0x06767308:     mov    %eax,0xc(%edx)   [... and set list argument for recur]
> 0x0676730b:     mov    %edx,%ebx        [set Racket stack to argv]
> 0x0676730d:     jmp    0x676707a        [recur]
>
> ----------------------------------------
> For comparsion, the gcc-compiled `assq' function's loop:
>
> 0x008c6480 <assq+112>:  mov    -0x20(%ebp),%edx   [get l off stack]
> 0x008c6483 <assq+115>:  test   $0x1,%dl
> 0x008c6486 <assq+118>:  jne    0x8c6523 <assq+275> [to fail: l is fixnum]
> 0x008c648c <assq+124>:  cmpw   $0x34,(%edx)
> 0x008c6490 <assq+128>:  jne    0x8c6523 <assq+275> [to fail: not a pair]
> 0x008c6496 <assq+134>:  mov    0x4(%edx),%ecx      [item = car of l]
> 0x008c6499 <assq+137>:  mov    %ecx,-0x1c(%ebp)    [save item on stack]
> 0x008c649c <assq+140>:  test   $0x1,%cl
> 0x008c649f <assq+143>:  jne    0x8c6588 <assq+376> [to fail: item is fixnum]
> 0x008c64a5 <assq+149>:  cmpw   $0x34,(%ecx)
> 0x008c64a9 <assq+153>:  jne    0x8c6588 <assq+376> [to fail: item is not pair]
> 0x008c64af <assq+159>:  mov    0xc(%ebp),%esi      [get x arg off stack]
> 0x008c64b2 <assq+162>:  mov    (%esi),%eax         [found x?]
> 0x008c64b4 <assq+164>:  cmp    0x4(%ecx),%eax
> 0x008c64b7 <assq+167>:  je     0x8c6648 <scheme_get_thread_local_variables>
> 0x008c64bd <assq+173>:  mov    0x8(%edx),%edx      [no, so get cdr of l]
> 0x008c64c0 <assq+176>:  mov    %edx,-0x20(%ebp)
> 0x008c64c3 <assq+179>:  test   $0x1,%dl
> 0x008c64c6 <assq+182>:  jne    0x8c6480 <assq+112>  [to fail: cdr is fixnum]
> 0x008c64c8 <assq+184>:  cmpw   $0x34,(%edx)
> 0x008c64cc <assq+188>:  jne    0x8c6480 <assq+112> [to fail: cdr is not a pair]
> 0x008c64ce <assq+190>:  mov    0x4(%edx),%ecx      [item = car l]
> 0x008c64d1 <assq+193>:  mov    %ecx,-0x1c(%ebp)
> 0x008c64d4 <assq+196>:  test   $0x1,%cl
> 0x008c64d7 <assq+199>:  jne    0x8c6480 <assq+112> [fail: item is a fixnum]
> 0x008c64d9 <assq+201>:  cmpw   $0x34,(%ecx)
> 0x008c64dd <assq+205>:  jne    0x8c6480 <assq+112> [fail: item is not a pair]
> 0x008c64df <assq+207>:  mov    (%esi),%eax         [get x off stack]
> 0x008c64e1 <assq+209>:  cmp    0x4(%ecx),%eax      [car of item == x ?]
> 0x008c64e4 <assq+212>:  je     0x8c6648 <scheme_get_thread_local_variables>
> 0x008c64ea <assq+218>:  mov    0x8(%edx),%eax
> 0x008c64ed <assq+221>:  mov    %eax,-0x20(%ebp)
> 0x008c64f0 <assq+224>:  mov    -0x24(%ebp),%edx  [get turtle off stack]
> 0x008c64f3 <assq+227>:  cmp    %edx,%eax         [is l == turtle?]
> 0x008c64f5 <assq+229>:  je     0x8c6523 <assq+275> [to fail: cycle]
> 0x008c64f7 <assq+231>:  mov    0x8(%edx),%eax    [cdr of turtle]
> 0x008c64fa <assq+234>:  mov    %eax,-0x24(%ebp)  [save new turtle to stack]
> 0x008c64fd <scheme_get_thread_local_variables+0>:       mov    (%edi),%eax
> 0x008c64ff <scheme_get_thread_local_variables+2>:       mov    %gs:0x48(,%eax,4),%eax
> 0x008c6507 <assq+247>:  mov    0x68(%eax),%eax
> 0x008c650a <assq+250>:  test   %eax,%eax           [other threads need to run?]
> 0x008c650c <assq+252>:  jg     0x8c6480 <assq+112> [to slow: other threads run]
> 0x008c6512 <assq+258>:  movl   $0x4,-0x50(%ebp)
> 0x008c6519 <assq+265>:  call   0x9e44a0 <scheme_out_of_fuel>
> 0x008c651e <assq+270>:  jmp    0x8c6480 <assq+112>  [recur]
>
> ----------------------------------------
>
> The `assq' wrapper that jumps to the above loop. The argument
> juggling here --- turning the arg sequence `x l' to `l l l x'
> --- may be part of the slowdown.
>
> 0x0037e41c:     cmp    $0x2,%ecx        [arity check]
> 0x0037e422:     jne    0x372e80
> 0x0037e428:     mov    %eax,-0x4(%ebx)  [save closure (but no GC possible!)]
> 0x0037e42b:     add    $0xfffffffc,%ebx
> 0x0037e42e:     mov    0x4(%ebx),%eax   [get x arg]
> 0x0037e431:     mov    %eax,-0x10(%ebx) [prep x arg to `loop']
> 0x0037e434:     mov    0x8(%ebx),%eax   [get l arg]
> 0x0037e437:     mov    %eax,-0xc(%ebx)  [prep l arg to `loop']
> 0x0037e43a:     mov    %eax,-0x8(%ebx)  [prep orig-l arg to `loop']
> 0x0037e43d:     mov    %eax,-0x4(%ebx)  [prep turtle arg to `loop']
> 0x0037e440:     mov    $0x37e47c,%esi
> 0x0037e445:     mov    (%esi),%esi      [esi now has `loop' closure pointer]
> 0x0037e447:     add    $0xfffffff0,%ebx [update Racket stack register]
> 0x0037e44a:     mov    -0x1c(%ebp),%edx [get Racket stack base for tail calls]
> 0x0037e44d:     add    $0xfffffff0,%edx [4 argument => subtract 16]
> 0x0037e450:     mov    0xc(%ebx),%ecx   [move arg1 into place...]
> 0x0037e453:     mov    %ecx,0xc(%edx)
> 0x0037e456:     mov    0x8(%ebx),%ecx
> 0x0037e459:     mov    %ecx,0x8(%edx)
> 0x0037e45c:     mov    0x4(%ebx),%ecx
> 0x0037e45f:     mov    %ecx,0x4(%edx)
> 0x0037e462:     mov    (%ebx),%ecx
> 0x0037e464:     mov    %ecx,(%edx)
> 0x0037e466:     mov    %edx,%ebx        [runstack = argv]
> 0x0037e468:     mov    %esi,%eax
> 0x0037e46a:     mov    $0x4,%ecx        [argc = 4]
> 0x0037e46f:     mov    %ebx,%edx        [argv = runstack (redundant!)]
> 0x0037e471:     jmp    0x0676707a
>
> ----------------------------------------
> Then again, here's a comparable section of the gcc-compiled `assq'
> function's preamble that cooperates with the GC and sets up the loop.
>
> 0x008c641f <scheme_get_thread_local_variables+0>:       mov    0x176c06(%ebx),%edi
> 0x008c6425 <scheme_get_thread_local_variables+6>:       mov    (%edi),%eax
> 0x008c6427 <scheme_get_thread_local_variables+8>:       mov    %gs:0x48(,%eax,4),%edx
> 0x008c642f <assq+31>:   mov    0x4(%edx),%edx
> 0x008c6432 <assq+34>:   mov    %edx,-0x54(%ebp)
> 0x008c6435 <scheme_get_thread_local_variables+0>:       mov    %gs:0x48(,%eax,4),%eax
> 0x008c643d <assq+45>:   lea    -0x54(%ebp),%edx
> 0x008c6440 <assq+48>:   mov    %edx,0x4(%eax)
> 0x008c6443 <assq+51>:   lea    0xc(%ebp),%eax
> 0x008c6446 <assq+54>:   mov    %eax,-0x4c(%ebp)
> 0x008c6449 <assq+57>:   lea    -0x1c(%ebp),%eax
> 0x008c644c <assq+60>:   mov    %eax,-0x48(%ebp)
> 0x008c644f <assq+63>:   lea    -0x24(%ebp),%eax
> 0x008c6452 <assq+66>:   mov    %eax,-0x44(%ebp)
> 0x008c6455 <assq+69>:   lea    -0x20(%ebp),%eax
> 0x008c6458 <assq+72>:   mov    %eax,-0x40(%ebp)
> 0x008c645b <assq+75>:   movl   $0x0,-0x1c(%ebp)
> 0x008c6462 <assq+82>:   movl   $0x0,-0x20(%ebp)
> 0x008c6469 <assq+89>:   movl   $0x0,-0x24(%ebp)
> 0x008c6470 <assq+96>:   mov    0xc(%ebp),%eax
> 0x008c6473 <assq+99>:   mov    0x4(%eax),%eax
> 0x008c6476 <assq+102>:  mov    %eax,-0x24(%ebp)
> 0x008c6479 <assq+105>:  mov    %eax,-0x20(%ebp)
> 0x008c647c <assq+108>:  nopl   0x0(%eax)
>
> Note the NOP, which aligns the gcc-generated loop start. The JIT
> doesn't try to align loop starts, because past experiments suggested
> that it doesn't matter. That's worth another look.
>
> _________________________________________________
>  For list-related administrative tasks:
>  http://lists.racket-lang.org/listinfo/dev
>



Posted on the dev mailing list.