[racket] Some design "whys" of regexps in Racket
On Fri, Jun 3, 2011 at 10:36 PM, Jay McCarthy <jay.mccarthy at gmail.com> wrote:
> 2011/6/3 Rodolfo Carvalho <rhcarvalho at gmail.com>:
>> Hello,
>> I'm curious about 2 design decisions made:
>> 1) Why do I have to escape things like "\d{2}" -> "\\d{2}"?
>
> Because "\d" is the same as "d", because you're escaping the character
> #\d. But, the syntax for the regular expression is #\backslash #\d, so
> you need to get a backslash in there, which is otherwise a control
> character, so you need to escape it.
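To clarify: Racket reuses the syntax of strings for regexps. A regexp
is first read as a string, then parsed into a regular expression, so
the string reader processes backslash escapes before the regexp parser
sees anything. A lone backslash in "\d" is consumed at the string
level and never reaches the regexp parser; writing "\\d" puts an
actual backslash character into the string. We do not currently have
a reader for regexps that skips this intermediate step.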
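
As a small sketch of the two stages (the names pattern-source,
source-length, first-char, from-string and from-literal are made up
for illustration, and I'm using pregexp since \d and {2} belong to the
px syntax):

```racket
#lang racket/base

;; Stage 1: the string reader. The literal "\\d{2}" holds five
;; characters: \ d { 2 }
(define pattern-source "\\d{2}")
(define source-length (string-length pattern-source))

;; The first character of the string is a single backslash.
(define first-char (string-ref pattern-source 0))

;; Stage 2: the regexp parser. pregexp parses the string's contents,
;; so what it sees is \d{2}.
(define from-string (regexp-match (pregexp pattern-source) "abc42"))

;; A #px literal goes through the same string escapes when it is read.
(define from-literal (regexp-match #px"\\d{2}" "abc42"))
```

Both matches should come out the same, since the #px form and the
pregexp call end up handing identical text to the regexp parser.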
--Carl