[racket] Unicode regexp character classes?
Thank you, yes, that is what I was looking for.
I must have read right over it 3-4 times without seeing it.
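
For the archives, here is a minimal sketch of how I understand the
suggestion. The names letters-rx and letter? and the sample string are my
own, made up for illustration. The second definition is the by-hand test
with char-general-category that I was trying to avoid.

```racket
#lang racket/base

;; \p{L} matches one character whose Unicode general category
;; starts with L (Lu, Ll, Lt, Lm, Lo); + extends it to a run.
(define letters-rx #px"\\p{L}+")

;; Hypothetical by-hand equivalent for a single character.
(define (letter? c)
  (and (memq (char-general-category c) '(lu ll lt lm lo)) #t))

(regexp-match* letters-rx "abc λx 123 日本語")
;; => '("abc" "λx" "日本語")

(letter? #\λ) ;; => #t
(letter? #\1) ;; => #f
```

A two-letter category such as \p{Ll} or \p{Nd} should narrow the match to
that one category, going by the <category> list in the reference.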
On 08/04/2012 02:59 PM, Matthew Flatt wrote:
> I think you're looking for #px"\\p{L}".
>
> See the "\p" <atom> production and the <category> non-terminal in
>
> http://docs.racket-lang.org/reference/regexp.html#(part._regexp-syntax)
>
> At Sat, 04 Aug 2012 14:45:30 -0700, Charles Hixson wrote:
>
>> Are there any unicode regular expression character classes?
>>
>> I'm hoping for something similar to [:alpha:], etc. that are based
>> around, say, the first letter of the unicode character classification.
>> I *can* do what I want by disassembling strings by hand and using tests
>> based on char-general-category, but a regular expression would (should?)
>> be much neater.
>>
>> (I know that these aren't mentioned in the documentation, but it just
>> says that it's talking about the "Frequently Used Character Classes",
>> not that there aren't any others.)
>>
>> --
>> Charles Hixson
>>
>
>
--
Charles Hixson