<div class="gmail_quote">On Fri, Apr 23, 2010 at 17:29, Matthew Flatt <<a href="mailto:mflatt@cs.utah.edu">mflatt@cs.utah.edu</a>> wrote: <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"> <div class="im">At Thu, 22 Apr 2010 20:28:55 -0400, "Skeptic ." wrote: > I'm curious to know how in PLT what kind of performance difference > (in the broad sense) would be found when comparing a tree-recursive > procedure to a tail-recursive one that use an explicit stack and a > non-trivial nested list, e.g. flatten, count-atoms and the like. </div>It's generally better to write in direct, tree-recursive style. For example, many systems implement `map' in a tail-recursive way to avoid deep recursion: (define (map1/loop f l) (let loop ([l l][accum null]) (if (null? l) (reverse accum) (loop (cdr l) (cons (f (car l)) accum))))) The single-list `map' in PLT Scheme, however, is implemented in direct style: (define (map1 f l) (if (null? l) null (cons (f (car l)) (map1 f (cdr l))))) The direct style usually performs better in PLT Scheme, and we don't have to worry about a special limit on continuation sizes (i.e., no stack overflow). The direct style performs better in the `map' case because the GC can process continuation frames more efficiently than `cons' chains. That isn't typical across implementations; the GC in many Scheme systems is especially highly tuned for pairs, and less so for continuation frames. Of course, if you want to just count the elements of a list, then a loop version (define (count/tail l) (let loop ([l l][accum 0]) (if (null? l) accum (loop (cdr l) (add1 accum))))) is 10 times faster than a direct version (define (count l) (if (null? l) 0 (+ 1 (count (cdr l))))) The difference is that the number result can be accumulated left-to-right and doesn't have to be "reversed" at the end. For performance, not having a continuation at all is better than having one. Those examples were not tree recursion, but they point to a general answer: If your computation has a continuation, let the runtime system represent it. Your program will be easier to write and read in direct style, and it will probably perform as well (if not better) as with an explicit representation. <div><div></div> </div></blockquote></div> While programming a cellular automaton, I first used `foldl' to do some computation on the list of neighbors. That was awfully slow, so I tried with a hand-written loop, and it became much faster -- fast enough for fluid visual effect. I'm not sure exactly why `foldl' was so slow though...