[racket] Unicode regexp character classes?

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Sat Aug 4 17:59:08 EDT 2012

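I think you're looking for #px"\\p{L}", which matches any single
character whose Unicode general category starts with "L" (i.e., any
letter).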

See the "\p" <atom> production and the <category> non-terminal in

   http://docs.racket-lang.org/reference/regexp.html#(part._regexp-syntax)
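
As a rough illustration of how that pattern behaves, here is a minimal sketch. The names `letter-run` and `starts-uppercase?` and the sample strings are made up for this example and are not from the documentation. It assumes the strings contain precomposed characters, so each letter is a single code point.

```racket
#lang racket/base

;; \p{L} matches one character in any letter category (Lu, Ll, Lt, Lm, Lo).
;; The backslash is doubled because it sits inside a string-like literal.
(define letter-run #px"\\p{L}+")

;; Hypothetical helper: does the string begin with an uppercase letter (Lu)?
(define (starts-uppercase? s)
  (regexp-match? #px"^\\p{Lu}" s))

;; Collect every maximal run of letters; digits and spaces separate the runs.
(regexp-match* letter-run "abc λx 42 Москва")

;; \P{L} is the complement: replace each run of non-letters with one space.
(regexp-replace* #px"\\P{L}+" "abc, 42 Москва!" " ")

(starts-uppercase? "\u00C9mile")
(starts-uppercase? "émile")
```

The two-letter categories (such as Lu or Nd) and the one-letter groups (such as L or N) are the same codes that char-general-category reports as symbols, so this should cover the "first letter of the classification" case without taking the string apart by hand.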

At Sat, 04 Aug 2012 14:45:30 -0700, Charles Hixson wrote:
> Are there any unicode regular expression character classes?
> 
> I'm hoping for something similar to [:alpha:], etc. that are based 
> around, say, the first letter of the unicode character classification.  
> I *can* do what I want by disassembling strings by hand and using tests 
> based on char-general-category, but a regular expression would (should?) 
> be much neater.
> 
> (I know that these aren't mentioned in the documentation, but it just 
> says that it's talking about the "Frequently Used Character Classes", 
> not that there aren't any others.)
> 
> -- 
> Charles Hixson


Posted on the users mailing list.