[racket] programmatic file editing

From: Neil Van Dyke (neil at neilvandyke.org)
Date: Wed Jun 6 18:26:02 EDT 2012

If anyone is interested in programmatic file editing, please feel free 
to comment on this library interface I've just sketched out.

My immediate need is that I want McFly to automatically edit an existing 
"info.rkt" file in some circumstances, such as to add necessary things 
that the programmer missed, and to update the release notes for PLaneT.

The idea is that we (the programmer) write code to parse the file to 
syntax objects, determine what we want to change about the file, and 
then call "progedit-file" with a list of insertions, deletions, and 
replacements.

"progedit-file" would then use this information to make only the 
edits to the file we specified, preserving the rest of the file 
verbatim.  (This is different from, say, reading a file as a list of 
syntax objects, modifying the syntax objects, and writing them out, 
since in that approach we might lose whitespace, comments, and 
peculiarities of syntax that are normalized by the reader.)
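
To illustrate the round-trip problem, here is a minimal sketch (the 
sample string is made up for illustration) in which reading a form and 
writing it back drops the comment and collapses the extra whitespace:

```racket
#lang racket/base

;; Made-up sample text: odd spacing plus a trailing comment.
(define src "(define version   1) ; bump me\n")

;; Read the form as a plain datum, then write it back out.
(define datum (read (open-input-string src)))
(define out (open-output-string))
(write datum out)

;; What survives the round trip: the comment and extra spaces are gone.
(define round-tripped (get-output-string out))
```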

For convenience, when specifying an insertion, deletion, or replacement, 
we can use the syntax objects from the original parse to say, for 
example, "replace this old expression in the code with this new 
expression", since the syntax object already has the positional 
information needed.
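
As a rough sketch of where that positional information comes from, the 
following reads two forms with "read-syntax" and recovers their exact 
source text from "syntax-position" and "syntax-span".  The sample text 
and the helper names (stx-start, stx-end, stx-source-text) are mine, for 
illustration only, and not part of the proposed interface:

```racket
#lang racket/base

;; Made-up sample text standing in for the contents of an "info.rkt".
(define src "(define name \"foo\") ; keep me\n(define version   1)\n")

(define in (open-input-string src))
(port-count-lines! in) ; count positions in characters, track lines too
(define first-form  (read-syntax 'example in))
(define second-form (read-syntax 'example in))

;; syntax-position is 1-based; convert to a 0-based character offset.
(define (stx-start stx) (sub1 (syntax-position stx)))
;; syntax-span is the length of the form's source text in characters.
(define (stx-end stx) (+ (stx-start stx) (syntax-span stx)))

;; Recover the original text of a form, spacing intact.
(define (stx-source-text stx)
  (substring src (stx-start stx) (stx-end stx)))

;; The value sub-expression of the second form, as a syntax object.
(define version-value (caddr (syntax->list second-form)))
```

Here (stx-source-text second-form) gives back the form exactly as it 
was typed, and version-value carries the span that a replacement of just 
the version number would need.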

So, here's a toy example that shows much of the language:

(progedit-file
  "info.rkt"

  #:insert
  `(
    ;; Add a missing #lang line to the top of the file:
    (0 "#lang setup/infotab\n")

    ;; Add something to the very end of the file:
    (end "\n(define foo 'bar)\n")

    ;; Add a warning comment after an expression:
    ((after ,stx-for-something-we-parsed)
     "\n;; Warning: The above is certainly wrong.\n"))

  #:replace
  (list

   ;; Replace part of an expression:
   (list stx-for-old-value-we-parsed "'baz")

   ;; Replace an expression with a bunch of other stuff.
   (list stx-for-another-old-value-we-parsed
         #"(+ 1 2)\n"
         some-str
         some-stx
         some-list-of-nested-lists-of-stuff
         input-port-for-some-other-file
         "\n(+ 3 4)")

   ;; Get rid of an offensive block of code.
   `(((before ,stx-1) . (after ,stx-2))
     "\n\n;; Some nonsense used to be here.\n\n")))

And the main public interface...

(define (progedit-port in-port
                        out-port
                        #:char?   (char?    #t)
                        #:offset  (offset   0)
                        #:insert  (inserts  '())
                        #:delete  (deletes  '())
                        #:replace (replaces '()))
   ...)

(define (progedit-file
          path
          #:encoding (encoding    #f)
          #:backup   (backup-proc default-progedit-file-backup)
          #:char?    (char?       #t)
          #:offset   (offset      0)
          #:insert   (inserts     '())
          #:delete   (deletes     '())
          #:replace  (replaces    '()))
   ...)

;; INSERTS ::= (INSERT ...)
;;
;; DELETES ::= (DELETE ...)
;;
;; REPLACES ::= (REPLACE ...)
;;
;; INSERT ::= (POSITION . INSERT-CONTENT)
;;
;; INSERT-CONTENT ::= STRING
;;                 |  BYTE-STRING
;;                 |  SYNTAX
;;                 |  INPUT-PORT
;;                 |  PROC-ACCEPTING-OUTPUT-PORT
;;                 |  (INSERT-CONTENT . INSERT-CONTENT)
;;                 |  ()
;;
;; POSITION ::= NATURAL-NUMBER
;;           |  end
;;           |  (before SYNTAX)
;;           |  (after  SYNTAX)
;;
;; DELETE ::= (POSITION . POSITION)
;;         |  SYNTAX
;;
;; REPLACE ::= (DELETE . INSERT-CONTENT)
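
One way the POSITION and DELETE forms above might be resolved to plain 
character offsets is sketched below.  The helper names 
(resolve-position, resolve-delete) are hypothetical, and the sketch 
assumes that natural-number positions are 0-based character offsets and 
ignores the #:offset and #:char? arguments:

```racket
#lang racket/base

;; Hypothetical helper: turn a POSITION into a 0-based character offset.
;; `len` is the length of the file's text in characters.
(define (resolve-position pos len)
  (cond
    [(exact-nonnegative-integer? pos) pos]
    [(eq? pos 'end) len]
    [(and (pair? pos) (eq? (car pos) 'before))
     ;; syntax-position is 1-based, so subtract one.
     (sub1 (syntax-position (cadr pos)))]
    [(and (pair? pos) (eq? (car pos) 'after))
     (define stx (cadr pos))
     (+ (sub1 (syntax-position stx)) (syntax-span stx))]
    [else (raise-argument-error 'resolve-position "POSITION" pos)]))

;; Hypothetical helper: turn a DELETE into a (start . end) pair of
;; offsets.  A bare syntax object means "delete exactly this form".
(define (resolve-delete del len)
  (if (syntax? del)
      (cons (resolve-position `(before ,del) len)
            (resolve-position `(after ,del) len))
      (cons (resolve-position (car del) len)
            (resolve-position (cdr del) len))))
```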

The implementation is easy.  The insert/delete/replace language would be 
compiled to a sequence of instructions, each of which either copies N 
characters/bytes from input to output or writes X content to output.
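
A minimal sketch of that compilation step, under several assumptions of 
my own: edits are already resolved to (start end content) triples with 
0-based character offsets, content is a plain string, and edits do not 
overlap.  Besides "copy" and "write", the sketch uses a "skip" 
instruction to drop deleted input and a "copy-rest" instruction for the 
tail of the file; those two, and all the function names, are 
illustrative additions rather than part of the design above:

```racket
#lang racket/base
(require racket/port)

;; An edit is (list start end content): replace characters [start, end)
;; with content.  Insertion: start = end.  Deletion: content is "".
(define (compile-edits edits)
  (let loop ([edits (sort edits < #:key car)] [pos 0] [acc '()])
    (cond
      [(null? edits) (reverse (cons '(copy-rest) acc))]
      [else
       (define start (car (car edits)))
       (define end (cadr (car edits)))
       (define content (caddr (car edits)))
       (when (< start pos)
         (error 'compile-edits "overlapping edits at ~a" start))
       ;; acc is built in reverse: copy, then skip, then write.
       (loop (cdr edits)
             end
             (list* `(write ,content)
                    `(skip ,(- end start))
                    `(copy ,(- start pos))
                    acc))])))

;; Read up to n characters, treating end-of-file as the empty string.
(define (take-chars n in)
  (define s (read-string n in))
  (if (eof-object? s) "" s))

;; Execute the instructions against an input and an output port.
(define (run-instructions instrs in out)
  (for ([i (in-list instrs)])
    (case (car i)
      [(copy) (write-string (take-chars (cadr i) in) out)]
      [(skip) (take-chars (cadr i) in)]
      [(write) (write-string (cadr i) out)]
      [(copy-rest) (copy-port in out)])))

;; Convenience wrapper for trying the sketch out on strings.
(define (apply-edits str edits)
  (define out (open-output-string))
  (run-instructions (compile-edits edits) (open-input-string str) out)
  (get-output-string out))

;; Insert a #lang line at the top and replace the "1" at offset 18.
(define edited
  (apply-edits "(define version   1)\n"
               '((18 19 "2") (0 0 "#lang setup/infotab\n"))))
```

Everything outside the two edited spans, including the odd spacing 
before the version number, is copied through untouched.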

I thought I'd bounce this off anyone who was interested to see if I 
missed some useful feature that's easier to implement from the start 
than to implement after.

Neil V.


Posted on the users mailing list.