[plt-scheme] unexpected regexp behaviour

From: Eli Barzilay (eli at barzilay.org)
Date: Sun Jun 15 21:46:36 EDT 2008

On Jun 15, Martin DeMello wrote:
> > (define a "hello\nworld\n")
> > (regexp-match* (regexp "(?m:^..)") a)
> ("he" "ll" "wo" "rl")
> 
> expected ("he" "wo")
> 
> word boundaries are behaving oddly too:
> 
> > (regexp-match* (pregexp "(?m:\\b..)") a)
> ("he" "ll" "wo" "rl")

This is consistent with (and a result of) the way `regexp-match' works
when you begin in the middle of the string:

  > (regexp-match #rx"(?m:^..)" "hello" 2)
  ("ll")
  > (regexp-match #px"\\b.." "hello" 2)
  ("ll")

I believe that treating the string as if it begins at the third
character in these cases makes this more uniform with matching on
ports, where a `start' argument means that you discard some of the
input, and then start matching.
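
To make the port comparison concrete, here is a minimal sketch of the
same match done on string ports. The `#lang racket/base' line and the
names `in', `skipped', `after-read' and `after-skip' are my own
assumptions for illustration (a 2008 PLT Scheme installation would more
likely use `#lang scheme/base'). Note that matching on a port returns
byte strings rather than strings.

```racket
#lang racket/base

;; Discard two characters by hand, then match on what is left.
(define in (open-input-string "hello"))
(define skipped (read-string 2 in))   ; consumes "he"
(define after-read (regexp-match #rx"(?m:^..)" in))

;; Same idea, letting the `start' argument do the discarding.
(define after-skip
  (regexp-match #rx"(?m:^..)" (open-input-string "hello") 2))

after-read   ; should be (#"ll"): `^' matches where the port now stands
after-skip   ; should be (#"ll") as well
```

In both cases the "he" is gone by the time matching starts, so the
only sensible meaning for `^' is the current position, and the string
case with a `start' offset follows the same rule.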

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                  http://www.barzilay.org/                 Maze is Life!


Posted on the users mailing list.