[plt-scheme] tree-recursion vs explicit stack tail-recursive

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Fri Apr 23 11:29:00 EDT 2010

At Thu, 22 Apr 2010 20:28:55 -0400, "Skeptic ." wrote:
> I'm curious to know how in PLT what kind of performance difference
> (in the broad sense) would be found when comparing a tree-recursive
> procedure to a tail-recursive one that use an explicit stack and a
> non-trivial nested list, e.g. flatten, count-atoms and the like.

It's generally better to write in direct, tree-recursive style.

For example, many systems implement `map' in a tail-recursive way to
avoid deep recursion:

 (define (map1/loop f l)
   (let loop ([l l][accum null])
     (if (null? l)
         (reverse accum)
         (loop (cdr l) (cons (f (car l)) accum)))))

The single-list `map' in PLT Scheme, however, is implemented in direct
style:

 (define (map1 f l)
   (if (null? l)
       null
       (cons (f (car l)) (map1 f (cdr l)))))

The direct style usually performs better in PLT Scheme, and we don't
have to worry about a special limit on continuation sizes (i.e., no
stack overflow).

The direct style performs better in the `map' case because the GC can
process continuation frames more efficiently than `cons' chains. That
isn't typical across implementations; the GC in many Scheme systems is
especially highly tuned for pairs, and less so for continuation frames.


Of course, if you want to just count the elements of a list, then a
loop version

  (define (count/tail l)
    (let loop ([l l][accum 0])
      (if (null? l)
          accum
          (loop (cdr l) (add1 accum)))))

is 10 times faster than a direct version

  (define (count l)
    (if (null? l)
        0
        (+ 1 (count (cdr l)))))

The difference is that the number result can be accumulated
left-to-right and doesn't have to be "reversed" at the end. For
performance, not having a continuation at all is better than having
one.


Those examples were not tree recursion, but they point to a general
answer: If your computation has a continuation, let the runtime system
represent it. Your program will be easier to write and read in direct
style, and it will probably perform as well (if not better) as with an
explicit representation.



Posted on the users mailing list.