[racket] simple scripting with Racket

From: Greg Hendershott (greghendershott at gmail.com)
Date: Sat Jul 20 13:29:08 EDT 2013

I started to write something, got side-tracked, finished it and came
to post ... but I see Matthias already replied.

FWIW, here's what I came up with. The main point is to use `fold-files`,
which addresses your concern about telling directories and files apart:

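;; Assumes #lang racket, which provides match*, thunk, copy-port and fold-files.
(define (on-fold-file path type cum)
  (match* (type (path->string path))
    ;; A regular file ending in ".csv" that is not already a "-ms.csv" output
    [('file (and (not (pregexp "-ms\\.csv"))
                 (pregexp "\\.csv$")))
     (with-input-from-file path
       (thunk (with-output-to-file (path-replace-suffix path "-ms.csv")
                (thunk (copy-port (current-input-port) ;; Example: just copy
                                  (current-output-port)))
                #:exists 'replace)))
     (void)]
    ;; Directories, links and all other files: nothing to do
    [(_ _) (void)]))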

(fold-files on-fold-file '() "/tmp/test")

[My regexp-fu failed me there. I don't think what you had --
"[^-ms]\\.csv" -- is quite right for excluding "foo-ms.csv": it is a
character class, so it would also exclude e.g. "s.csv". But
"(?!-ms)\\.csv" did not work either, so I resorted to two pregexps.]
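
A possible single-regexp alternative, offered as a sketch rather than as what the code above uses: Racket regexps support negative lookbehind, `(?<!...)`, which checks the text before the match position. The lookahead form fails because, at the ".", the upcoming text is ".csv" and so never starts with "-ms". The name `plain-csv?` below is made up for illustration.

```racket
#lang racket

;; Hypothetical helper: #t for names ending in ".csv" unless the ".csv"
;; is directly preceded by "-ms".
(define (plain-csv? name)
  (regexp-match? #px"(?<!-ms)\\.csv$" name))

(plain-csv? "foo.csv")     ; matches
(plain-csv? "s.csv")       ; matches: only "-ms" right before ".csv" is rejected
(plain-csv? "foo-ms.csv")  ; does not match

;; For comparison, the two attempts discussed above:
(regexp-match? #rx"[^-ms]\\.csv" "s.csv")        ; no match: "s" is in the class
(regexp-match? #px"(?!-ms)\\.csv" "foo-ms.csv")  ; matches, so nothing is excluded
```

With that, the two `pregexp` patterns in the match clause could presumably be collapsed into one.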

If you wanted to generalize the above just a little, so that the work
per file can be specified independently rather than hardcoded into
on-fold-file, you could do something like this:

(define (on-fold-file xform path type cum)
  (match* (type (path->string path))
    [('file (and (not (pregexp "-ms\\.csv"))
                 (pregexp "\\.csv$")))
     (with-input-from-file path
       (thunk (with-output-to-file (path-replace-suffix path "-ms.csv")
                xform
                #:exists 'replace)))
     (void)]
    [(_ _) (void)]))

(define (my-xform)
  ;; Example: just copy
  (copy-port (current-input-port)
             (current-output-port)))

(fold-files (curry on-fold-file my-xform) '() "/tmp/test")
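
As a sketch of a non-trivial transform for the original task, the thunk below does the same dot-to-comma replacement as the scripts quoted further down. The name `dot->comma-xform` is made up here; it reads from `(current-input-port)` and writes to `(current-output-port)`, so it should be usable wherever `my-xform` is. The string ports are only there to try it without touching any files.

```racket
#lang racket

;; Hypothetical stand-in for my-xform: replace the first "." on each
;; line with "," (use regexp-replace* to replace every occurrence).
(define (dot->comma-xform)
  (for ([l (in-lines)])
    (printf "~a\n" (regexp-replace #rx"\\." l ","))))

;; Try it on a string instead of a file.
(define result
  (with-output-to-string
    (thunk (with-input-from-string "1, \"hello . world\", 10\n"
                                   dot->comma-xform))))

(display result) ; prints: 1, "hello , world", 10
```

It would then be passed as `(curry on-fold-file dot->comma-xform)` in the `fold-files` call.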

To be clear, I'm not saying what I came up with is better than your
original or Matthias'. Just another way to do it.
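
On the directories vs. files point, here is a small sketch of what `fold-files` hands to the callback. It builds a throwaway directory (the file names are invented for the demo) containing a directory whose name ends in ".csv", and uses the accumulator to collect only the entries whose type is 'file.

```racket
#lang racket

;; Keep regular files ending in ".csv"; pass the accumulator through otherwise.
(define (collect-csv path type acc)
  (match* (type (path->string path))
    [('file (pregexp "\\.csv$")) (cons path acc)]
    [(_ _) acc]))

;; Throwaway tree for the demo.
(define root (make-temporary-file "fold-demo~a" 'directory))
(make-directory (build-path root "archive.csv"))            ; a directory, not a file
(display-to-file "1, 2\n" (build-path root "data.csv"))     ; the only real CSV file
(display-to-file "notes\n" (build-path root "readme.txt"))

(define found (fold-files collect-csv '() root))
(for-each displayln found)   ; only the path to data.csv is listed

(delete-directory/files root)
```

The "archive.csv" directory arrives with type 'dir, so it falls through to the catch-all clause.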


On Sat, Jul 20, 2013 at 11:14 AM, Matthias Felleisen
<matthias at ccs.neu.edu> wrote:
>
> Here is a slightly more compact version, still not scsh syntax:
>
> #! /bin/sh
> #|
> exec racket -tm "$0" ${1+"$@"}
> |#
> #lang racket
>
> ;; -----------------------------------------------------------------
> ;; this script replaces "." in "stem.csv" files with "," and places
> ;; the result in "stem-ms.csv"
> ;; current directory only
>
> (provide main)
>
> (define (main)
>   ;; use (in-directory), if you would like recursion in file system
>   (for ((f (directory-list)) #:when (regexp-match? #rx"[^-ms]\\.csv" f))
>     (define in (open-input-file f))
>     (define ff (regexp-replace #rx"\\." (path->string f) "-ms."))
>     (define out (open-output-file ff #:exists 'replace #:mode 'text))
>     (parameterize ((current-input-port in) (current-output-port out))
>       (for ((l (in-lines)))
>         (printf "~a\n" (regexp-replace #rx"\\." l ","))))
>     ;; skip, if you don't expect too many files:
>     (close-input-port in)
>     (close-output-port out)))
>
> ;; a test run:
>
> s% !!
> rm -rf foo-ms.csv ; cat foo.csv ; ./replace.rkt ; cat foo-ms.csv
> 1, "hello . world", 10
> 1, "hello , world", 10
>
> Does this help a bit? -- Matthias
>
> On Jul 20, 2013, at 7:04 AM, Vlad Kozin wrote:
>
> Hi. I thought I'd use Racket to solve a silly little scripting problem.
>
> The script doesn't take arguments; instead it looks for .csv files in its
> directory, reads them line by line, does some trivial replacement and writes
> the result to a new .csv file, adding a "-ms" suffix to its name. You can see
> my solution at http://pastebin.com/CPb5gxEu and below. I suppose you bearded
> hackers would roll out a quick awk + sh script in a minute - it seems like
> exactly the task suited for awk. Unfortunately I find I'm allergic to sh -
> entirely too much of a challenge for me. Plus the .csv files I'm dealing with
> are on the order of >200Mb.
>
> Writing it felt somewhat awkward, or rather more so than I anticipated, hence
> this post. Can you see a better, more straightforward, canonical way of
> doing this sort of thing? It feels like the general-purpose nature of the
> language shows, and a DSL like scsh would allow a more straightforward style.
>
> #! /usr/bin/env racket
> #lang racket
>
> ;; doesn't tell directories and files apart, can be dangerous
> (define files
>   (filter
>    (lambda (e) (regexp-match? #rx"[^-ms]\\.csv" e))
>    (directory-list (current-directory))))
>
> ;; this really should be a straightforward pipe with processing in the middle
> ;; this wrapping of with--file procedures feels a little like writing cps
> ;; by hand
> ;; (stdin stdout) now point to (in out)
> (define (with-in-out-files proc in out)
>   (with-input-from-file in
>     (lambda ()
>       (with-output-to-file out proc  #:mode 'text #:exists 'replace))))
>
> (define (replace-dot-in-file in)
>   (define (dot->comma)
>     (for ([l (in-lines)])
>       (printf "~a\n"   (regexp-replace #rx"\\." l ","))))
>   (define (out-filename file suffix)
>     (define file-ext  (regexp-split #rx"\\." file))
>     (string-append (first file-ext) suffix "." (second file-ext)))
>   (with-in-out-files dot->comma in (out-filename in "-ms")))
>
> ;; good candidate to exploit that multi-core bazooka
> ;; should I be using futures / places / threads and
> ;; some kind of parallel map here?
> (time (for-each replace-dot-in-file files))
>
> Thanks
>
>
>
> ____________________
>  Racket Users list:
>  http://lists.racket-lang.org/users
>

Posted on the users mailing list.