[racket] regexp-replace* allows no sub-expressions?

From: synx (plt at synx.us.to)
Date: Sat Jun 26 18:38:49 EDT 2010

Okay what. I've long relied on the behavior of regexp-replace* to single
out a number of substrings within a big string, and let me mangle those
substrings in a programmatic fashion. Example:

(regexp-replace*
  #rx"([a-z]+) ([a-z]+)" "red fox, blue seal. red trout, blue trout!"
  (lambda (total color what)
    (cond
      ((equal? color "red") (string-append what " in socks"))
      ((equal? what "trout") (string-append color " fish"))
      (else (string-append color " " what)))))

=> "fox in socks, blue seal. trout in socks, blue fish!"

Recently, every program of mine that did that started failing with
contract errors. I took a look at the code and was shocked to find
this in racket/private/string.rkt:
(check
  replacement
  (sub buf mstart mend))

which effectively becomes (replacement (sub buf mstart mend)).

So the replacement procedure receives only the text of the entire match,
with no access to the sub-expressions. For patterns like
#rx"(someheader*)([0-9a-zA-Z]+)(somefooter*)" that's especially nasty,
since I need those three fields separately, but there might not be a
particular character sequence uniquely separating them.

Is there no way to replace all instances of an expression in a string
with some combination of its sub-expressions? I really can't use a
"\\1 \\2" replacement string, since the replacement has to be computed:
what I'm replacing could be a hex number, a decimal number, or a
standard abbreviated name. Was this support removed for a reason, or was
it just too much of a hassle to figure out? Is there some other
procedure that does this now?
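
One possible stopgap, sketched here rather than taken from the Racket
sources: wrap regexp-replace* so that the procedure it calls re-matches
the matched text with regexp-match and hands the whole match plus the
sub-matches to the real replacement procedure. The name
replace-all/groups is hypothetical. The sketch assumes that the pattern
matches the same way on the isolated substring, which may not hold for
patterns using ^, $, \b or lookahead/lookbehind.

```racket
#lang racket/base

;; Hypothetical helper (not part of Racket): behaves like regexp-replace*,
;; but always calls proc with the whole match followed by each sub-match.
(define (replace-all/groups rx str proc)
  (regexp-replace*
   rx str
   ;; Accept however many arguments regexp-replace* passes and use only
   ;; the first one, the text of the whole match.
   (lambda (whole . ignored)
     ;; Re-match the matched text on its own to recover the groups.
     ;; Assumption: rx matches `whole` exactly as it matched in context.
     (apply proc (regexp-match rx whole)))))

;; The example from the post, routed through the helper.
(define result
  (replace-all/groups
   #rx"([a-z]+) ([a-z]+)" "red fox, blue seal. red trout, blue trout!"
   (lambda (total color what)
     (cond
       ((equal? color "red") (string-append what " in socks"))
       ((equal? what "trout") (string-append color " fish"))
       (else (string-append color " " what))))))

result ; => "fox in socks, blue seal. trout in socks, blue fish!"
```

Because the inner lambda takes a rest argument, the wrapper should keep
working whether regexp-replace* passes only the whole match or the
sub-matches as well. The cost is one extra regexp-match per replacement.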


Posted on the users mailing list.