[racket] Regex for blank line?

From: Matthias Felleisen (matthias at ccs.neu.edu)
Date: Thu Jun 9 10:31:48 EDT 2011

I have used string-trim from srfi/13 for this, in conjunction with (string=? "" ...):

> string-trim-both  s [char/char-set/pred start end] -> string
> Trim s by skipping over all characters on the left / on the right / on both sides that satisfy the second parameter char/char-set/pred:
>   • if it is a character char, characters equal to char are trimmed;
>   • if it is a char set cs, characters contained in cs are trimmed;
>   • if it is a predicate pred, it is a test predicate that is applied to the characters in s; a character causing it to return true is skipped.
> Char/char-set/pred defaults to the character set char-set:whitespace defined in SRFI 14.
> If no trimming occurs, these functions may return either s or a copy of s; in some implementations, proper substrings may share memory with s.
> 
> (string-trim-both "  The outlook wasn't brilliant,  \n\r")
>     => "The outlook wasn't brilliant,"

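A minimal sketch of that combination, in case it helps. The helper name blank-line? is made up for illustration and is not part of srfi/13; it assumes the line is already in hand as a string (e.g. from read-line) and relies on the default char-set:whitespace trimming described above.

```racket
#lang racket/base
(require (only-in srfi/13 string-trim-both))

;; blank-line? is a hypothetical helper, not a srfi/13 function.
;; A string counts as blank if nothing is left once whitespace
;; has been trimmed from both ends.
(define (blank-line? s)
  (string=? "" (string-trim-both s)))

(blank-line? "  \t ")  ; => #t
(blank-line? "")       ; => #t
(blank-line? "  ABC")  ; => #f
```

Note that this tests a string, not a port, so by itself it does not decide how much input to consume.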

I know it isn't indexed in the documentation. It should be.

-- Matthias




On Jun 8, 2011, at 2:13 PM, Richard Lawrence wrote:

> Hi everyone,
> 
> I'm sure this is a really trivial question, but I've been trying on my
> own for some time now, and I can't quite figure it out.  I am trying to
> define a pair of functions, skip-whitespace and skip-blank-line, that do
> the following:
> 
> - skip-whitespace should consume any whitespace characters from an input
>  port, possibly up to and including a single newline, but it should not
>  consume any more whitespace after a newline--i.e., it should not skip a
>  blank line in the input
> 
> e.g., 
> (define ip (open-input-string "  ABC")) 
> (define ip2 (open-input-string "  \n\t\nABC"))
> (define ip3 (open-input-string "ABC"))
> (skip-whitespace ip) (skip-whitespace ip2) (skip-whitespace ip3)
> (peek-char ip) ; should be #\A
> (peek-char ip2) ; should be #\tab
> (peek-char ip3) ; should be #\A
> 
> - skip-blank-line should consume whitespace characters from an input
>  port just in case that sequence of whitespace characters ends in a
>  newline, and not consume any input otherwise
> 
> e.g.,
> (define ip (open-input-string "  ABC")) 
> (define ip2 (open-input-string "  \n\t\nABC"))
> (define ip3 (open-input-string "ABC"))
> (skip-blank-line ip) (skip-blank-line ip2) (skip-blank-line ip3)
> (peek-char ip) ; should be #\space
> (peek-char ip2) ; should be #\tab
> (peek-char ip3) ; should be #\A
> 
> Both functions should return a boolean value indicating whether any
> input was consumed.
> 
> Here's what I've got for skip-whitespace: 
> 
> #lang typed/racket
> (: skip-whitespace (Input-Port -> Boolean))
> (define (skip-whitespace in)
>  ; matches whitespace up to and including a newline, but 
>  ; doesn't skip blank lines
>  (if (try-read #px"^[[:blank:]]*[[:space:]]?" in) #t #f))
> 
> ; NOTE: try-read is a simple wrapper for regexp-try-match with type:
> ; (U String Regexp PRegexp) Input-Port -> (U String False)
> 
> This works fine. But I can't figure out how to write the parallel regexp
> for skip-blank-line.  All the regexps I can come up with either read too
> much whitespace or too little.
> 
> #lang typed/racket
> (: skip-blank-line (Input-Port -> Boolean))
> (define (skip-blank-line in)
>  (if (try-read #px"^[[:blank:]]*$" in) #t #f))
> 
> This consumes too little in the second case: it doesn't consume the
> initial spaces and newline of ip2; the next char is #\space rather than
> #\tab.  (The same is true if I change the character class :blank: to
> :space:.)
> 
> If I change the regexp to #px"^[[:blank:]]*[[:space:]]", it consumes too
> much in the first case:  the next char of ip is #\A rather than #\space.
> 
> (I think this second regexp is closer to what I need, but what I could
> really use is a character class that just matches line-terminators,
> instead of :space:.  That seems to be the job of "\\p{Zl}", but I guess
> there's something I don't understand about that, because (regexp-match
> #px"\\p{Zl}" "\n") doesn't match anything.)
> 
> I feel pretty lost here.  Any help would be very much appreciated.
> 
> Thanks!
> 
> Richard

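For the port-based version asked about above, one possible sketch (an assumption, not something tested against the original program) is to require a literal newline after the optional blanks, rather than $ or [[:space:]]. It is written in untyped racket/base and calls regexp-try-match directly instead of the try-read wrapper, and it assumes lines end in "\n" only. As for "\\p{Zl}": that Unicode category contains only U+2028 LINE SEPARATOR, while "\n" (U+000A) is a control character in category Cc, which would explain why it does not match.

```racket
#lang racket/base

;; Sketch only: consume optional spaces/tabs followed by exactly one
;; newline. If the port does not start with such a sequence,
;; regexp-try-match consumes nothing and the result is #f.
(define (skip-blank-line in)
  (and (regexp-try-match #px"^[[:blank:]]*\n" in) #t))

(define ip  (open-input-string "  ABC"))
(define ip2 (open-input-string "  \n\t\nABC"))
(define ip3 (open-input-string "ABC"))
(skip-blank-line ip) (skip-blank-line ip2) (skip-blank-line ip3)
(peek-char ip)   ; #\space
(peek-char ip2)  ; #\tab
(peek-char ip3)  ; #\A
```

The returned boolean is #t only when a blank line was actually consumed, which is the behaviour the examples in the question ask for.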



Posted on the users mailing list.