[racket] Some design "whys" of regexps in Racket

From: Jens Axel Søgaard (jensaxel at soegaard.net)
Date: Sat Jun 4 10:27:02 EDT 2011

2011/6/4 Rodolfo Carvalho <rhcarvalho at gmail.com>:
> Hello,
> I'm curious about 2 design decisions made:
> 1) Why do I have to escape things like "\d{2}" -> "\\d{2}"?

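You can avoid the extra escaping if you use here strings.
The doubled backslash comes from the normal string syntax,
where \ is the escape character, not from the regexp itself.
See the example below.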

Source begins here:

#lang racket

(define the-text-to-be-searched #<<END
This is an example of a "here string".
In a here string no escaping is needed.
This is a backslash \ with no escaping.
This is a normal slash /.
And here is another backslash \.
A here string begins with #<< and is followed by
a user-chosen stop word, which signals the end of
the here string.

Note that the regexp used to match a backslash is \\.
Thus to search for all lines in this text containing
a backslash, one writes the regexp \\ using here
string syntax with just two backslashes. Within the
normal string syntax a backslash must be escaped
and since the escape char is \ one must write "\\\\"
in order to get the same regular expression.
END
)

(define backslash-regexp #<<END
\\
END
)

(define lines (regexp-split (regexp "\n") the-text-to-be-searched))

; The following three expressions evaluate to equal lists:
; the lines of the text that contain a backslash.

(filter (λ (s) (regexp-match (regexp backslash-regexp) s))
        lines)

(filter (λ (s) (regexp-match (regexp "\\\\") s))
        lines)

(filter (λ (s) (regexp-match #rx"\\\\" s))
        lines)

Source ends here.
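
Applied to the pattern from the question, the same idea might look like the
sketch below. It uses pregexp (the #px syntax) rather than regexp, because
\d and {2} belong to the pregexp syntax. The names two-digits-here and
two-digits-escaped, and the sample text "room 42", are made up for this
illustration.

```racket
#lang racket

; The pattern written as a here string: no escaping needed.
(define two-digits-here #<<END
\d{2}
END
  )

; The same pattern as a normal string: the backslash must be doubled,
; because \ is the escape character of the string syntax.
(define two-digits-escaped "\\d{2}")

; Both strings hold the same five characters: \ d { 2 }
(string=? two-digits-here two-digits-escaped)

; \d and {2} are pregexp syntax, so build the matcher with pregexp
; (or write a #px literal) instead of regexp / #rx.
(regexp-match (pregexp two-digits-here) "room 42")
(regexp-match (pregexp two-digits-escaped) "room 42")
(regexp-match #px"\\d{2}" "room 42")
```

The stop word has to sit alone on its line, which is why the closing
parenthesis goes on the line after END.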

--
Jens Axel Søgaard



Posted on the users mailing list.