[racket] Regex for blank line?
I have used string-trim from srfi/13 for this, in conjunction with (string=? "" ...):
> string-trim-both s [char/char-set/pred start end] -> string
> Trim s by skipping over all characters on the left / on the right / on both sides that satisfy the second parameter char/char-set/pred:
> • if it is a character char, characters equal to char are trimmed;
> • if it is a char set cs, characters contained in cs are trimmed;
> • if it is a predicate pred, it is a test predicate that is applied to the characters in s; a character causing it to return true is skipped.
> Char/char-set/pred defaults to the character set char-set:whitespace defined in SRFI 14.
> If no trimming occurs, these functions may return either s or a copy of s; in some implementations, proper substrings may share memory with s.
>
> (string-trim-both " The outlook wasn't brilliant, \n\r")
> => "The outlook wasn't brilliant,"
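A minimal sketch of that idea, assuming the line is already in hand as a string (for example from read-line). It uses string-trim-both as in the documentation quoted above; the name blank-line? is made up for illustration and is not part of srfi/13.

```racket
#lang racket
;; Import only string-trim-both so nothing clashes with racket/string.
(require (only-in srfi/13 string-trim-both))

;; blank-line? is a hypothetical helper name, not a library function.
;; A line counts as blank when nothing is left after trimming whitespace.
(define (blank-line? s)
  (string=? "" (string-trim-both s)))

(blank-line? "  \t ")   ; => #t
(blank-line? " ABC")    ; => #f
```

This tests a string rather than consuming from a port, so it fits a read-line loop better than the peek-based functions in the question.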
I know it isn't indexed in the documentation. It should be.
-- Matthias
On Jun 8, 2011, at 2:13 PM, Richard Lawrence wrote:
> Hi everyone,
>
> I'm sure this is a really trivial question, but I've been trying on my
> own for some time now, and I can't quite figure it out. I am trying to
> define a pair of functions, skip-whitespace and skip-blank-line, that do
> the following:
>
> - skip-whitespace should consume any whitespace characters from an input
> port, possibly up to and including a single newline, but it should not
> consume any more whitespace after a newline--i.e., it should not skip a
> blank line in the input
>
> e.g.,
> (define ip (open-input-string " ABC"))
> (define ip2 (open-input-string " \n\t\nABC"))
> (define ip3 (open-input-string "ABC"))
> (skip-whitespace ip) (skip-whitespace ip2) (skip-whitespace ip3)
> (peek-char ip) ; should be #\A
> (peek-char ip2) ; should be #\tab
> (peek-char ip3) ; should be #\A
>
> - skip-blank-line should consume whitespace characters from an input
> port only if that sequence of whitespace characters ends in a
> newline, and should not consume any input otherwise
>
> e.g.,
> (define ip (open-input-string " ABC"))
> (define ip2 (open-input-string " \n\t\nABC"))
> (define ip3 (open-input-string "ABC"))
> (skip-blank-line ip) (skip-blank-line ip2) (skip-blank-line ip3)
> (peek-char ip) ; should be #\space
> (peek-char ip2) ; should be #\tab
> (peek-char ip3) ; should be #\A
>
> Both functions should return a boolean value indicating whether any
> input was consumed.
>
> Here's what I've got for skip-whitespace:
>
> #lang typed/racket
> (: skip-whitespace (Input-Port -> Boolean))
> (define (skip-whitespace in)
>   ; matches whitespace up to and including a newline, but
>   ; doesn't skip blank lines
>   (if (try-read #px"^[[:blank:]]*[[:space:]]?" in) #t #f))
>
> ; NOTE: try-read is a simple wrapper for regexp-try-match with type:
> ; (U String Regexp PRegexp) Input-Port -> (U String False)
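One thing to watch: this pattern can match the empty string, and if try-read returns the matched string, an empty match is still a true value in Racket, so the function would report #t even when nothing was consumed. Below is a sketch that checks the length of the match instead. It assumes plain #lang racket rather than typed/racket and calls regexp-try-match directly, since try-read's definition is not shown.

```racket
#lang racket

;; Sketch only: same pattern as above, but the result reflects whether
;; any input was actually consumed.
(define (skip-whitespace in)
  ;; On a port, regexp-try-match returns a list of byte strings and
  ;; consumes exactly the matched bytes.
  (define m (regexp-try-match #px"^[[:blank:]]*[[:space:]]?" in))
  (and m (positive? (bytes-length (car m)))))

(define ip3 (open-input-string "ABC"))
(skip-whitespace ip3)   ; => #f, nothing to skip
(peek-char ip3)         ; => #\A
```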
>
> This works fine. But I can't figure out how to write the parallel regexp
> for skip-blank-line. All the regexps I can come up with either read too
> much whitespace or too little.
>
> #lang typed/racket
> (: skip-blank-line (Input-Port -> Boolean))
> (define (skip-blank-line in)
>   (if (try-read #px"^[[:blank:]]*$" in) #t #f))
>
> This consumes too little in the second case: it doesn't consume the
> initial spaces and newline of ip2; the next char is #\space rather than
> #\tab. (The same is true if I change the character class :blank: to
> :space:.)
>
> If I change the regexp to #px"^[[:blank:]]*[[:space:]]", it consumes too
> much in the first case: the next char of ip is #\A rather than #\space.
>
> (I think this second regexp is closer to what I need, but what I could
> really use is a character class that just matches line-terminators,
> instead of :space:. That seems to be the job of "\\p{Zl}", but I guess
> there's something I don't understand about that, because (regexp-match
> #px"\\p{Zl}" "\n") doesn't match anything.)
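On the two regexps: outside multi-line mode, $ matches only at the end of the input, so the first one cannot match before the newline. And \p{Zl} is the Unicode "line separator" category, which contains U+2028 but not #\newline (newline is a control character, category Cc), which is why that match fails. A possible sketch is to name the newline explicitly in the pattern. It assumes plain #lang racket, calls regexp-try-match directly instead of the try-read wrapper, and treats only "\n" as a line terminator.

```racket
#lang racket

;; Sketch only: consume leading spaces/tabs plus one newline, or nothing.
;; regexp-try-match leaves the port untouched when the match fails.
(define (skip-blank-line in)
  (and (regexp-try-match #px"^[[:blank:]]*\n" in) #t))

(define ip  (open-input-string " ABC"))
(define ip2 (open-input-string " \n\t\nABC"))
(define ip3 (open-input-string "ABC"))
(skip-blank-line ip)    ; => #f
(skip-blank-line ip2)   ; => #t
(skip-blank-line ip3)   ; => #f
(peek-char ip)          ; => #\space
(peek-char ip2)         ; => #\tab
(peek-char ip3)         ; => #\A
```

If "\r\n" line endings matter, the pattern would need something like \r?\n at the end; that is left out here to keep the sketch short.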
>
> I feel pretty lost here. Any help would be very much appreciated.
>
> Thanks!
>
> Richard