[plt-scheme] addition to pregexp

From: Jon Rafkind (workmin at ccs.neu.edu)
Date: Thu Mar 16 20:59:45 EST 2006

Can the following function be added to pregexp.ss or string.ss( or 
whatever the best regexp libary in mzscheme is )?

I wrote this using pregexp, though.

(pregexp-replace% regexp str func)

Which works exactly like pregexp-replace* except the replacement string 
for each match is the result of (func match). This is handy to do things 
like uppercase the matched pattern and other such things. One other 
language I know of( ruby ) has this and I find it pretty useful. I 
thought mzscheme would like it as well.

Heres my attempt at its implementation:

(require (lib "list.ss"))
(require (lib "string.ss"))
(require (lib "pregexp.ss"))

;; return a list of pairs, each pair equal to the start and end of a match
(define (pregexp-match-positions* regexp str)
  (let loop ((start 0)
	     (matches '()))
    (if (>= start (string-length str))
      (reverse matches)
      (let ((match (pregexp-match-positions regexp str start)))
	(if (not match)
	  (reverse matches)
	  (loop 
	    (if (eq? (caar match)
		     (cdar match))
	      (add1 (cdar match))
	      (cdar match))
	    (cons (car match) matches)))))))

;; replace all matches of `regexp' in `str' with the result of `func' applied to
;; each match
(define (pregexp-replace% regexp str func)
  (let ((matches (pregexp-match-positions* regexp str))
        (last-position 0))
    (string-append
      (apply string-append
             (map (lambda (match)
                    (let ((start (car match))
                          (end (cdr match)))
                      (let ((result (string-append
                                      (substring str last-position start)
                                      (func (substring str start end)))))
                        (set! last-position end)
                        result)))
                  matches))
      (substring str last-position)))) 

(define (string-assert str1 str2)
  (unless (string=? str1 str2)
      (printf "~a != ~a\n" str1 str2)))

(string-assert (pregexp-replace% "xyz" "hello" (lambda (x) x)) "hello")
(string-assert (pregexp-replace% "e" "hello" (lambda (x) x)) "hello")
(string-assert (pregexp-replace% "l*" "hello" (lambda (x) 
						(string-upcase x)))
	       "heLLo")

(string-assert (pregexp-replace% 
		 "[A-Z]+_?" "SOME_STRING"
		 (lambda (x)
		   (string-append (substring x 0 1)
				  (string-downcase
				    (substring
				      (pregexp-replace "_" x "") 1)))))
	       "SomeString")




Posted on the users mailing list.