[plt-scheme] Lexical char-downcase for extended character sets

From: Erich Rast (Erich.Rast at t-online.de)
Date: Tue Oct 8 11:15:40 EDT 2002

I need to index text documents containing MacRoman character values
greater than 128. I've run into the following problems:

1.) For case-insensitive index keys, I intended to use char-downcase,
but this doesn't work as expected for German umlauts: (char-downcase
#\Ä) ==> #\304, which is the same as #\Ä, but I need #\344, also known
as #\ä (the small letter 'a' with two dots on top).

2.) Is there a simple way to map the high character range to some
reasonable lexicographic equivalent in ASCII? Examples: Ä=>A, ü=>u. Or
do I need to build substitution tables?
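
To make the substitution-table idea concrete, here is a minimal sketch in
portable Scheme. It is only an illustration: the names umlaut-table,
find-entry, table-char-downcase, char->ascii and string->index-key are
hypothetical, the table covers just the three umlaut pairs, and the
numeric codes assume the Latin-1 values implied by #\304 and #\344 above
(raw MacRoman bytes would need different numbers).

```scheme
;; Hypothetical table; each entry is
;;   (upper-code lower-code ascii-upper ascii-lower)
;; Codes are ASSUMED to be Latin-1 (octal 304 = 196, octal 344 = 228).
(define umlaut-table
  '((196 228 #\A #\a)    ; A-umlaut
    (214 246 #\O #\o)    ; O-umlaut
    (220 252 #\U #\u)))  ; U-umlaut

;; Return the first entry whose field picked by SELECT equals CODE,
;; or #f if there is none.
(define (find-entry code select table)
  (cond ((null? table) #f)
        ((= code (select (car table))) (car table))
        (else (find-entry code select (cdr table)))))

;; Like char-downcase, but consults the table first.
(define (table-char-downcase c)
  (let ((entry (find-entry (char->integer c) car umlaut-table)))
    (if entry
        (integer->char (cadr entry))
        (char-downcase c))))

;; Map a tabled character to its ASCII stand-in, preserving case;
;; all other characters pass through unchanged.
(define (char->ascii c)
  (let* ((code (char->integer c))
         (up (find-entry code car umlaut-table))
         (lo (find-entry code cadr umlaut-table)))
    (cond (up (caddr up))
          (lo (cadddr lo))
          (else c))))

;; Build a case-insensitive, ASCII-only index key from a string.
(define (string->index-key s)
  (list->string
   (map (lambda (c) (char->ascii (table-char-downcase c)))
        (string->list s))))
```

With these definitions, (string->index-key (string (integer->char 196)
#\r #\g #\e #\r)) should yield "arger". The table is searched linearly,
which seems acceptable for a handful of entries; a vector indexed by
character code would be the obvious alternative for a full mapping.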

Regards,

Erich


Posted on the users mailing list.