[racket] Some design "whys" of regexps in Racket

From: Carl Eastlund (cce at ccs.neu.edu)
Date: Fri Jun 3 22:47:28 EDT 2011

On Fri, Jun 3, 2011 at 10:36 PM, Jay McCarthy <jay.mccarthy at gmail.com> wrote:
> 2011/6/3 Rodolfo Carvalho <rhcarvalho at gmail.com>:
>> Hello,
>> I'm curious about 2 design decisions made:
>> 1) Why do I have to escape things like "\d{2}" -> "\\d{2}"?
>
> Because "\d" is the same as "d", because you're escaping the character
> #\d. But, the syntax for the regular expression is #\backslash #\d, so
> you need to get a backslash in there, which is otherwise a control
> character, so you need to escape it.

To clarify: Racket reuses the syntax of strings for regexps.  A
regexp is first read as a string, and that string is then parsed into
a regular expression.  So if "\d" reads the same as "d" as a string,
the regexp parser never sees the backslash.  We do not currently have
a reader for regexps that skips this intermediate step.
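
As a minimal sketch of the two stages (the names pattern-text and
two-digits are made up for illustration, and pregexp is used on the
assumption that the pattern is meant in #px syntax, since that is the
syntax in which \d and {2} are a digit class and a repetition count):

```racket
#lang racket

;; Stage 1: the string reader turns the two source characters \\ into
;; one backslash, so this string holds five characters: \ d { 2 }
(define pattern-text "\\d{2}")              ; hypothetical name

;; Stage 2: pregexp parses those five characters as a regexp.
(define two-digits (pregexp pattern-text))  ; hypothetical name

(string-length pattern-text)                ; 5
(regexp-match two-digits "abc123")          ; '("12")

;; A #px literal is read with the same string syntax, so the backslash
;; has to be doubled there as well.
(regexp-match #px"\\d{2}" "abc123")         ; '("12")
```

By the time the regexp parser runs, the escaping is already done: it
only ever sees the five characters that came out of the string reader.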

--Carl


Posted on the users mailing list.