[plt-scheme] regexp-match on input port returning bytes

From: Carl Eastlund (carl.eastlund at gmail.com)
Date: Thu Dec 25 13:15:10 EST 2008

From the reference manual on regexp-match:

> If the match fails, #f is returned. If the match succeeds, a list containing strings or byte string,
> and possibly #f, is returned. The list contains strings only if input is a string and pattern is not a
> byte regexp value. Otherwise, the list contains byte strings (substrings of the UTF-8 encoding of
> input, if input is a string).

In your second example the input is a port rather than a string, so
regexp-match returns byte strings rather than (character) strings,
even though the pattern is a string regexp.
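
If strings are what you want from a port match, one option is to decode
the results afterwards. This is only a sketch: regexp-match/strings is a
hypothetical helper name, not something in the library, and it assumes
the matched bytes are valid UTF-8. It is written for a current Racket;
on PLT Scheme 4.x, "#lang scheme" should be the equivalent first line.

```racket
#lang racket

;; Hypothetical helper (not part of the library): match PATTERN against
;; the input port IN, then decode each byte-string result as UTF-8.
;; A failed match stays #f, and so does any unmatched group.
(define (regexp-match/strings pattern in)
  (define m (regexp-match pattern in))
  (and m
       (for/list ([b (in-list m)])
         (and b (bytes->string/utf-8 b)))))

;; Same input as the second example in the quoted message; the result
;; is now a list holding the string "x" rather than the bytes #"x".
(regexp-match/strings #rx"^x" (open-input-string "x"))
```

Note that bytes->string/utf-8 raises an exception on bytes that are not
valid UTF-8, so this is only appropriate when the port carries text.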

On Thu, Dec 25, 2008 at 4:19 AM, Neil Van Dyke <neil at neilvandyke.org> wrote:
> Just out of curiosity, in PLT 4.1.2, why does "regexp-match" on an input
> port return bytes rather than strings, when using a string regexp (not a
> bytes regexp)?
>
>> (regexp-match #rx"^x" "x")
> ("x")
>> (regexp-match #rx"^x" (open-input-string "x"))
> (#"x")
>> (regexp-match #rx#"^x" (open-input-string "x"))
> (#"x")
>
> I would've expected the second example to yield a list of strings rather
> than a list of bytes.
>
> Thanks,
> Neil

-- 
Carl Eastlund


Posted on the users mailing list.