[racket] simple scripting with Racket
I started to write something, got side-tracked, finished it and came
to post ... but I see Matthias already replied.
FWIW, here's what I came up with. The main point is to use `fold-files`,
which addresses your concern about telling directories and files apart
(it passes each entry's type to the callback):

(define (on-fold-file path type cum)
  (match* (type (path->string path))
    [('file (and (not (pregexp "-ms\\.csv"))
                 (pregexp "\\.csv$")))
     (with-input-from-file path
       (thunk (with-output-to-file (path-replace-suffix path "-ms.csv")
                (thunk (copy-port (current-input-port) ;; Example: just copy
                                  (current-output-port)))
                #:exists 'replace)))
     (void)]
    [(_ _) (void)]))

(fold-files on-fold-file '() "/tmp/test")

[My regexp-fu failed me there. I don't think what you had --
"[^-ms]\\.csv" -- is quite right for excluding "foo-ms.csv": [^-ms] is a
character class, so it would also exclude e.g. "s.csv". But
"(?!-ms)\\.csv" did not work either, so I resorted to two pregexps.]
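
A single pattern may be possible with a negative lookbehind instead of a lookahead. The lookahead in "(?!-ms)\\.csv" inspects the text starting at the dot, which is always ".csv" and never "-ms", so it cannot reject anything. A lookbehind, written `(?<!...)` in Racket's regexp syntax, inspects the text just before the dot. The sketch below assumes the goal is "ends in .csv but not in -ms.csv"; the helper name `csv-but-not-ms?` is made up for illustration and is not from the thread.

```racket
#lang racket

;; Hypothetical helper: true for names ending in ".csv", unless the
;; three characters right before the "." are "-ms".
;; (?<!-ms) is a negative lookbehind anchored at the position of the dot.
(define (csv-but-not-ms? name)
  (regexp-match? #px"(?<!-ms)\\.csv$" name))

(csv-but-not-ms? "foo.csv")     ; plain csv file
(csv-but-not-ms? "s.csv")       ; short name that [^-ms] would wrongly reject
(csv-but-not-ms? "foo-ms.csv")  ; already-processed output file
(csv-but-not-ms? "foo.txt")     ; not a csv file at all
```

With such a predicate the two pregexp patterns in the match clause could be collapsed into one, though the two-pattern form is arguably easier to read.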
If you wanted to generalize the above just a little, so that the work
per file can be specified independently rather than hardcoded into
on-fold-file, you could do something like this:

(define (on-fold-file xform path type cum)
  (match* (type (path->string path))
    [('file (and (not (pregexp "-ms\\.csv"))
                 (pregexp "\\.csv$")))
     (with-input-from-file path
       (thunk (with-output-to-file (path-replace-suffix path "-ms.csv")
                xform
                #:exists 'replace)))
     (void)]
    [(_ _) (void)]))

(define (my-xform)
  ;; Example: just copy
  (copy-port (current-input-port)
             (current-output-port)))

(fold-files (curry on-fold-file my-xform) '() "/tmp/test")

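
To connect this back to the original task, here is a rough sketch of an xform that does the dot-to-comma replacement line by line rather than a plain copy. It borrows the name `dot->comma` from the original script. Using `regexp-replace*` (all dots on a line, not just the first) is an assumption about what is wanted, and `demo` is a made-up helper that runs the xform on a string so it can be tried without touching any files.

```racket
#lang racket

;; Reads lines from (current-input-port) and writes each one to
;; (current-output-port) with every "." replaced by ",".
;; Assumption: all dots should change; plain regexp-replace would
;; only change the first dot on each line.
(define (dot->comma)
  (for ([l (in-lines)])
    (displayln (regexp-replace* #rx"\\." l ","))))

;; Hypothetical helper for trying the xform on an in-memory string.
(define (demo str)
  (with-output-to-string
    (thunk (with-input-from-string str dot->comma))))

(demo "1.5, \"a . b\"\n2.25, x\n")
```

Since it reads and writes the current ports just like my-xform, it should drop into the generalized version as (fold-files (curry on-fold-file dot->comma) '() "/tmp/test").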
To be clear, I'm not saying what I came up with is better than your
original or Matthias'. Just another way to do it.
On Sat, Jul 20, 2013 at 11:14 AM, Matthias Felleisen
<matthias at ccs.neu.edu> wrote:
>
> Here is a slightly more compact version, still not scsh syntax:
>
> #! /bin/sh
> #|
> exec racket -tm "$0" ${1+"$@"}
> |#
> #lang racket
>
> ;; ---------------------------------------------------------------------------------------------------
> ;; this script replaces "." in "stem.csv" files with "," and places the
> ;; result in "stem-ms.csv"
> ;; current directory only
>
> (provide main)
>
> (define (main)
>   ;; use (in-directory), if you would like recursion in file system
>   (for ((f (directory-list)) #:when (regexp-match? #rx"[^-ms]\\.csv" f))
>     (define in (open-input-file f))
>     (define ff (regexp-replace #rx"\\." (path->string f) "-ms."))
>     (define out (open-output-file ff #:exists 'replace #:mode 'text))
>     (parameterize ((current-input-port in) (current-output-port out))
>       (for ((l (in-lines)))
>         (printf "~a\n" (regexp-replace #rx"\\." l ","))))
>     ;; skip, if you don't expect too many files:
>     (close-input-port in)
>     (close-output-port out)))
>
> ;; a test run:
>
> s% !!
> rm -rf foo-ms.csv ; cat foo.csv ; ./replace.rkt ; cat foo-ms.csv
> 1, "hello . world", 10
> 1, "hello , world", 10
>
> Does this help a bit? -- Matthias
>
> On Jul 20, 2013, at 7:04 AM, Vlad Kozin wrote:
>
> Hi. I thought I'd use Racket to solve a silly little scripting problem.
>
> The script doesn't take arguments; instead it looks for .csv files in its
> directory, reads them line by line, does some trivial replacement and writes
> the result to a new .csv file, adding a "-ms" suffix to its name. You can see
> my solution at http://pastebin.com/CPb5gxEu and below. I suppose you bearded
> hackers would roll out a quick awk + sh script in a minute - seems like
> exactly the task suited for awk. Unfortunately I find I'm allergic to sh -
> entirely too much of a challenge for me. Plus the .csv files I'm dealing with
> are on the order of >200Mb.
>
> Writing it felt somewhat awkward, more so than I anticipated, hence
> this post. Can you see a better, more straightforward, canonical way of
> doing this sort of thing? It feels like the general-purpose nature of the
> language shows, and a DSL like scsh would allow a more straightforward style.
>
> #! /usr/bin/env racket
> #lang racket
>
> ;; doesn't tell directories and files apart, can be dangerous
> (define files
>   (filter
>    (lambda (e) (regexp-match? #rx"[^-ms]\\.csv" e))
>    (directory-list (current-directory))))
>
> ;; this really should be a straightforward pipe with processing in the middle
> ;; this wrapping of with--file procedures feels a little like writing cps by hand
> ;; (stdin stdout) now point to (in out)
> (define (with-in-out-files proc in out)
>   (with-input-from-file in
>     (lambda ()
>       (with-output-to-file out proc #:mode 'text #:exists 'replace))))
>
> (define (replace-dot-in-file in)
>   (define (dot->comma)
>     (for ([l (in-lines)])
>       (printf "~a\n" (regexp-replace #rx"\\." l ","))))
>   (define (out-filename file suffix)
>     (define file-ext (regexp-split #rx"\\." file))
>     (string-append (first file-ext) suffix "." (second file-ext)))
>   (with-in-out-files dot->comma in (out-filename in "-ms")))
>
> ;; good candidate to exploit that multi-core bazooka
> ;; should I be using futures / places / threads and
> ;; some kind of parallel map here?
> (time (for-each replace-dot-in-file files))
>
> Thanks
>
>
>
> ____________________
> Racket Users list:
> http://lists.racket-lang.org/users