[plt-scheme] Lexical char-downcase for extended character sets
I need to index text documents that contain MacRoman characters with
values > 128, and I've run into the following problems:
1.) For case-insensitive index keys, I intended to use char-downcase,
but it doesn't work as expected for German umlauts: (char-downcase
#\Ä) ==> #\304, which is the same as #\Ä, but I need #\344, also known
as #\ä (the small letter 'a' with two dots on top).
2.) Is there a simple way to map the high character range onto some
reasonable lexicographic equivalent in ASCII? Examples: Ä => A, ü => u.
Or do I need to build substitution tables?
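
To make both points concrete, here is a rough sketch of what a
hand-built substitution table might look like. It is only an
illustration under stated assumptions: it assumes the characters are
Latin-1 code points (as the octal values #\304 and #\344 suggest), so
text stored as raw MacRoman bytes would need different table positions.
The names latin1-downcase, ascii-fold-table, ascii-fold and index-key
are made up for this example, and the table covers only the German
umlauts.

```scheme
;; Sketch only. Assumes Latin-1 code points (#\304 = 196, #\344 = 228).
;; All names defined here are hypothetical, not part of any library.

;; In Latin-1, the uppercase letters 192..222 (except 215, the
;; multiplication sign) have their lowercase forms 32 positions higher.
(define (latin1-downcase c)
  (let ((n (char->integer c)))
    (if (and (>= n 192) (<= n 222) (not (= n 215)))
        (integer->char (+ n 32))
        (char-downcase c))))

;; Partial substitution table: Latin-1 code point -> ASCII base letter.
;; Extend with further entries as needed.
(define ascii-fold-table
  (list (cons 196 #\A) (cons 214 #\O) (cons 220 #\U)
        (cons 228 #\a) (cons 246 #\o) (cons 252 #\u)))

;; Look a character up in the table; leave it unchanged if absent.
(define (ascii-fold c)
  (let ((entry (assv (char->integer c) ascii-fold-table)))
    (if entry (cdr entry) c)))

;; Build a case-insensitive, ASCII-folded index key from a string.
(define (index-key s)
  (list->string
   (map (lambda (c) (ascii-fold (latin1-downcase c)))
        (string->list s))))
```

With these definitions, (index-key (string (integer->char 196) #\B))
would lowercase first and then fold, so both characters end up as plain
ASCII letters. An association list keeps the sketch portable; a vector
indexed by code point would be a faster alternative for large inputs.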
Regards,
Erich