[racket] Some design "whys" of regexps in Racket

From: Rodolfo Carvalho (rhcarvalho at gmail.com)
Date: Fri Jun 3 22:46:40 EDT 2011

On Fri, Jun 3, 2011 at 23:36, Jay McCarthy <jay.mccarthy at gmail.com> wrote:

> 2011/6/3 Rodolfo Carvalho <rhcarvalho at gmail.com>:
> > Hello,
> > I'm curious about 2 design decisions made:
> > 1) Why do I have to escape things like "\d{2}" -> "\\d{2}"?
>
> Because "\d" is the same as "d", because you're escaping the character
> #\d. But, the syntax for the regular expression is #\backslash #\d, so
> you need to get a backslash in there, which is otherwise a control
> character, so you need to escape it.
>


Well, this part I understand: what the backslash "is doing there".
However, it seems to me that matching a digit is more interesting than
matching a backslash followed by a #\d (and likewise for the others: \w, \s, ...).
When I do need to match a backslash, I can always escape it.

So, for me it is more "natural" to have "\d" match a digit and "\\d" match a
backslash-d.
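
To make the two layers concrete, here is a minimal sketch of how I understand
the current behavior. The sample inputs ("abc42", "a\\d") and the names
pat-len, digit-match and literal-match are made up for illustration. The
string reader consumes one level of backslashes, and the regexp parser then
sees whatever is left:

```racket
#lang racket/base

;; Layer 1: the string reader. "\\d" is a two-character string:
;; a backslash followed by the letter d.
(define pat-len (string-length "\\d"))                   ; => 2

;; Layer 2: the regexp parser. Handed those two characters,
;; #px reads \d as "any digit" and finds the first one.
(define digit-match (regexp-match #px"\\d" "abc42"))     ; => '("4")

;; A literal backslash followed by d takes four backslashes in the
;; source: the reader turns them into two, and the regexp parser
;; treats those two as one escaped backslash.
(define literal-match (regexp-match #px"\\\\d" "a\\d"))  ; => '("\\d")
```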

What motivates choosing the way Racket does it?




>
> >
> > 2) Why are there two kinds of regexps, #rx and #px?
> > #rx"\\d{2}" doesn't work because the curly braces are only part of the
> > pregexp grammar...
>
> #px is more powerful than #rx, but that power costs. Most regexps can
> get by with what #rx has.
>


Ah! Now I can think of #px"..." as power-regexp... before this "p" was
a mystery. Thanks.
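
For reference, a small sketch of the difference as I understand it. The input
"abc42" and the names with-px, with-rx and with-pregexp are just illustrative:

```racket
#lang racket/base

;; #px understands the \d class and the {n} repetition syntax.
(define with-px (regexp-match #px"\\d{2}" "abc42"))              ; => '("42")

;; With plain #rx the same match can be spelled out by hand.
(define with-rx (regexp-match #rx"[0-9][0-9]" "abc42"))          ; => '("42")

;; The function form takes an ordinary string, so the same
;; backslash escaping applies.
(define with-pregexp (regexp-match (pregexp "\\d{2}") "abc42"))  ; => '("42")
```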

How high is the cost? Is it a matter of performance vs. pragmatism? (By the
way, I noticed something similar with #lang racket vs. #lang racket/base.)

Posted on the users mailing list.