[plt-dev] coding

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Fri May 14 09:43:18 EDT 2010

As I continued my contracting of racket/list last night, I encountered  
a strange function:

(define (split-at-right.v1 list n)
   (let loop ([list list]
              [lead (or (drop* list n) (err2large 'split-at-right list  
n))]
              [pfx '()])
     ;; could throw an error for non-lists, but be more like `split-at'
     (if (pair? lead)
         (loop (cdr list) (cdr lead) (cons (car list) pfx))
         (values (reverse pfx) list))))

;; internal use below
(define (drop* list n) ; no error checking, returns #f if index is too  
large
   (if (zero? n) list (and (pair? list) (drop* (cdr list) (sub1 n)))))

Try to figure out what it does before you read on. Don't look up the  
docs.
(err2large just raises an error.)

It took me 5 minutes to figure out why this worked. I consider this  
too much.

;; ---

So I rewrote in what I consider the obvious style:

(define (split-at-right l n)
   (let down ([l l][seen '()])
     (if (pair? l)
         (down (cdr l) (cons (car l) seen))
         (let up ([l seen][n n][seen l])
           (if (> n 0)
               (if (pair? l)
                   (up (cdr l) (- n 1) (cons (car l) seen))
                   (err2large 'split-at-right list n))
               (values (reverse l) seen))))))

Yes, it's an extra line of code but I think it's truly straightforward:
recur down and reverse, recur backwards.

;; ---

Then I had doubts. Perhaps the original designer who baked this strange
invariant into the two loops knew something about the internal  
performance
of racket that I don't know. So here is a microbenchmark stress test:

(define (stress-test split)
   (define n 20000)
   (define l (build-list n add1))
   (define _ (begin (collect-garbage) (collect-garbage)))
   (define x
     (time
      (for/list ((i (in-range n)))
        (define-values (x y) (split-at-right l i))
        (length y))))
   (void))

(stress-test split-at-right)
(stress-test split-at-right.v1)

Here are the times:

> cpu time: 19681 real time: 19672 gc time: 6942
> cpu time: 19716 real time: 19792 gc time: 6966

No difference; if any, my version is a tad faster (on smaller
lists too).

;; ---

Structural design with straightforward accumulators is fine, just fine.
No need to encode invariants -- Matthias




Posted on the dev mailing list.