[plt-scheme] addition to pregexp

From: Jon Rafkind (workmin at ccs.neu.edu)
Date: Sat Mar 18 11:05:01 EST 2006


Eli Barzilay wrote:

>On Mar 17, Matthew Flatt wrote:
>  
>
>>I like this feature.
>>
>>Instead of introducing a new name, how about generalizing
>>`[p]regexp-replace[*]' to accept either a string or a function as
>>its last argument?
>>    
>>
>
>Another point is that it's probably better if the function takes in a
>variable number of arguments -- and it will be fed the matching
>substring and then all parenthesized submatches (as with `match').
>
>  
>
I think this got lost in my mail client..

Ok this version handles variable arguments depending on how many 
submatches occured. I havent tested many edge-cases, although I think 
the only problems would occur when backward referencing is used.

I sort of cheated by testing for string? and if so use the old 
pregexp-replace*.

(require (lib "list.ss"))
(require (lib "string.ss"))
(require (lib "pregexp.ss"))

(define (pregexp-match-positions* regexp str)
  (let loop ((start 0)
             (matches '()))
    (if (>= start (string-length str))
      (reverse matches)
      (let ((match (pregexp-match-positions regexp str start)))
        (if (not match)
          (reverse matches)
          (loop
            (if (eq? (caar match)
                     (cdar match))
              (add1 (cdar match))
              (cdar match))
            (cons match matches)))))))

(define (pregexp-replace% regexp str func)
  (if (string? func)
    (pregexp-replace* regexp str func)
    (let ((matches (pregexp-match-positions* regexp str))
          (last-position 0))
      (string-append
        (apply string-append
               (map (lambda (current-matches)
                      (let ((result
                              (string-append
                                (substring str last-position
                                           (caar current-matches))
                                (apply func
                                       (map (lambda (match)
                                              (substring str
                                                         (car match)
                                                         (cdr match)))
                                            current-matches)))))
                        (set! last-position (cdar current-matches))
                        result))
                    matches))
        (substring str last-position)))))

(define (string-assert str1 str2)
  (unless (string=? str1 str2)
      (printf "~a != ~a\n" str1 str2)))

(string-assert (pregexp-replace% "xyz" "hello" (lambda (x) x)) "hello")
(string-assert (pregexp-replace% "e" "hello" (lambda (x) x)) "hello")
(string-assert (pregexp-replace% "l*" "hello" (lambda (x)
                                                (string-upcase x)))
               "heLLo")

(string-assert (pregexp-replace%
                 "[A-Z]+_?" "SOME_STRING"
                 (lambda (x)
                   (string-append (substring x 0 1)
                                  (string-downcase
                                    (substring
                                      (pregexp-replace "_" x "") 1)))))
               "SomeString")

(string-assert (pregexp-replace%
                 "fo(o)" "Hello foo world"
                 (lambda (all o)
                   (string-upcase o)))
               "Hello O world")

(string-assert (pregexp-replace% "color" "The wall is color" "black")
               "The wall is black")



Posted on the users mailing list.