[plt-scheme] Bug in regexp-match*, regexp-split, regexp-match-positions*, maybe others...?
This is fixed in SVN. It was a problem with the way `regexp-split' and
`regexp-match*' convert a byte regexp to avoid empty start and end
matches.
Thanks for the report!
At Sat, 26 Sep 2009 02:53:39 -0400, Jon Zeppieri wrote:
> Byte string regexp patterns containing bytes with the high bit set don't
> seem to work properly with any of the regexp procedures that match multiple
> times. For example...
>
> This works as expected:
>
> > (regexp-split #rx#"\x7f" #"hello\x7fworld")
> (#"hello" #"world")
>
> But this does not:
>
> > (regexp-split #rx#"\x80" #"hello\x80world")
> (#"hello\200world")
>
>
> Similarly:
>
> > (regexp-match* #rx#"\x7f" #"hello\x7fworld")
> (#"\177")
>
> > (regexp-match* #rx#"\x80" #"hello\x80world")
> ()
>
>
> This doesn't affect the procedures that only match once. For example, this
> works fine:
>
> > (regexp-match #rx#"\x80" #"hello\x80world")
> (#"\200")
>
>
> I can reproduce this behavior in 4.1.5 and 4.2.2.1, both on OS X 10.5.8.
>
> -Jon