[racket-dev] Speeding up `in-directory`

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Wed Sep 4 15:51:27 EDT 2013

At Wed, 4 Sep 2013 15:44:41 -0400, Sam Tobin-Hochstadt wrote:
> On Wed, Sep 4, 2013 at 3:23 PM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> > At Wed, 4 Sep 2013 15:13:31 -0400, Sam Tobin-Hochstadt wrote:
> >> On Wed, Sep 4, 2013 at 12:29 PM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> >> >
> >> >                                  (directory-list
> >> >                                   (path->complete-path d init-dir)))])
> >>
> >>
> >> I'm pretty sure this is wrong, but I'm not sure how to fix it.  In
> >> particular, there's no reason that `init-dir` should have any relation
> >> to any of the paths being generated, and so using it here is wrong.
> >> You can break the code with
> >>
> >> (for ([i (in-directory6 d)])
> >>   (current-directory "/")
> >>   (displayln i))
> >>
> >> when run with a non-#f value of d.
> >
> > I think that if `d` is a relative path then it should be treated as
> > relative to the current directory at the time that `in-directory6` is
> > called:
> >
> >  * When I use "/home/mflatt/tmp" for `d`, I get a listing of files
> >    under "/home/mflatt/tmp".
> >
> >  * When I use "tmp" and my initial directory is "/home/mflatt", I still
> >    get a listing of files under "/home/mflatt/tmp".
> >
> > Both of those are as they should be, I think.
> >
> > When I use `in-directory` instead of `in-directory6`, then it behaves
> > in a way that I think is less useful and should count as a bug in
> > `in-directory`.
> 
> I was able to provoke the bad behavior as follows:
> 
> Create a racket program with `main` like this in ~/tmp/find.rkt:
> 
> (define d (if (= 0 (vector-length (current-command-line-arguments)))
>               #f
>               (vector-ref (current-command-line-arguments) 0)))
> 
> (for ([i (in-directory6 d)])
>   (current-directory "/")
>   (displayln i))
> 
> Then I did this:
> 
> % cd ~/tmp/foo
> % racket ~/tmp/find.rkt ~/tmp/bar
> .... correct output ....
> % racket ~/tmp/find.rkt ../bar
> .... much less output ...
> 
> Removing the `(current-directory "/")` line in the middle makes the
> two calls produce identical output.

Ah, I see. When I add a subdirectory in "bar" with files in the
subdirectory, then I see the problem.

I think this is more correct:

(define (in-directory6 [orig-dir #f])
  (define init-dir (current-directory))
  ;; current state of the sequence is a list of paths to produce; when
  ;; incrementing past a directory, add the directory's immediate
  ;; content to the front of the list:
  (define (next l)
    (define d (path->complete-path (car l) init-dir))
    (if (directory-exists? d)
        (append (for/list ([f (in-list (directory-list d))])
                  (build-path d f))
                (cdr l))
        (cdr l)))
  (make-do-sequence
   (lambda ()
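     (values
      car                          ; pos->element: the next path to produce
      next                         ; next-pos: drop it, expanding a directory
      (if orig-dir                 ; initial position
          (next (list orig-dir))
          (directory-list init-dir))
      pair?                        ; continue while paths remain
      #f
      #f))))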
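
As a minimal sketch of why resolving against the captured `init-dir` should make the sequence insensitive to a later `(current-directory "/")`: the relative path `bar/sub` below is hypothetical and used only for illustration, and nothing here touches the filesystem.

```racket
#lang racket/base

;; `rel` is a hypothetical relative path, used only for illustration.
(define rel (build-path "bar" "sub"))

;; Capture the base once, the way `init-dir` is captured in `in-directory6`.
(define init-dir (current-directory))
(define before (path->complete-path rel init-dir))

;; Resolve the same relative path again while the current directory is "/".
(define-values (explicit implicit)
  (parameterize ([current-directory (build-path "/")])
    (values (path->complete-path rel init-dir)   ; explicit, captured base
            (path->complete-path rel))))         ; base defaults to (current-directory)

(displayln (equal? before explicit)) ; the explicit base ignores the change
(displayln implicit)                 ; resolved against "/" instead
```

With an explicit base the result is the same before and after the change, whereas the one-argument call follows whatever `current-directory` happens to be at that moment. Unless the program starts in the root directory, the two results differ.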


Posted on the dev mailing list.