[plt-scheme] tree-recursion vs explicit stack tail-recursive

From: Laurent (laurent.orseau at gmail.com)
Date: Fri Apr 23 13:48:29 EDT 2010

On Fri, Apr 23, 2010 at 17:29, Matthew Flatt <mflatt at cs.utah.edu> wrote:

> At Thu, 22 Apr 2010 20:28:55 -0400, "Skeptic ." wrote:
> > I'm curious to know how in PLT what kind of performance difference
> > (in the broad sense) would be found when comparing a tree-recursive
> > procedure to a tail-recursive one that use an explicit stack and a
> > non-trivial nested list, e.g. flatten, count-atoms and the like.
> It's generally better to write in direct, tree-recursive style.
> For example, many systems implement `map' in a tail-recursive way to
> avoid deep recursion:
>  (define (map1/loop f l)
>   (let loop ([l l][accum null])
>     (if (null? l)
>         (reverse accum)
>         (loop (cdr l) (cons (f (car l)) accum)))))
> The single-list `map' in PLT Scheme, however, is implemented in direct
> style:
>  (define (map1 f l)
>   (if (null? l)
>       null
>       (cons (f (car l)) (map1 f (cdr l)))))
> The direct style usually performs better in PLT Scheme, and we don't
> have to worry about a special limit on continuation sizes (i.e., no
> stack overflow).
> The direct style performs better in the `map' case because the GC can
> process continuation frames more efficiently than `cons' chains. That
> isn't typical across implementations; the GC in many Scheme systems is
> especially highly tuned for pairs, and less so for continuation frames.
> Of course, if you want to just count the elements of a list, then a
> loop version
>  (define (count/tail l)
>    (let loop ([l l][accum 0])
>      (if (null? l)
>          accum
>          (loop (cdr l) (add1 accum)))))
> is 10 times faster than a direct version
>  (define (count l)
>    (if (null? l)
>        0
>        (+ 1 (count (cdr l)))))
> The difference is that the number result can be accumulated
> left-to-right and doesn't have to be "reversed" at the end. For
> performance, not having a continuation at all is better than having
> one.
> Those examples were not tree recursion, but they point to a general
> answer: If your computation has a continuation, let the runtime system
> represent it. Your program will be easier to write and read in direct
> style, and it will probably perform as well (if not better) as with an
> explicit representation.
While programming a cellular automaton, I first used `foldl' to do some
computation on the list of neighbors.
That was awfully slow, so I tried with a hand-written loop, and it became
much faster -- fast enough for fluid visual effect.
I'm not sure exactly why `foldl' was so slow though...
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.racket-lang.org/users/archive/attachments/20100423/644f3dbb/attachment.html>

Posted on the users mailing list.