[plt-scheme] Lexical char-downcase for extended character sets
At Tue, 8 Oct 2002 17:15:40 +0200, Erich Rast wrote:
> I need to index text documents containing character values in MacRoman
> above 128. Now I've encountered the following problems:
>
> 1.) For case-insensitive index keys, I intended to use char-downcase,
> but this doesn't work as expected for German umlauts:
> (char-downcase #\Ä) ==> #\304, which is the same as #\Ä, but I'd need
> #\344, also known as #\ä (the small letter 'a' with two dots on top).
It's possible that you can set `current-locale' so that
`char-locale-downcase' converts using MacRoman. But a quick check
suggests that no such locale mapping exists.
In case it's useful, there's a MacRoman -> Latin-1 table in
plt/src/mzscheme/src/mac_roman.inc
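
If the characters are already in Latin-1 (the #\304 and #\344 values
quoted above are the Latin-1 codes for Ä and ä), a hand-written downcase
is short, because in Latin-1 the accented capitals 192-222 (except 215,
the multiplication sign) sit exactly 32 below their lowercase forms. The
following is only a sketch under that assumption; the name
`latin-1-downcase' is made up for illustration and is not part of
MzScheme.

```scheme
;; Sketch only: assumes each char holds a Latin-1 code (0-255).
;; `latin-1-downcase' is a hypothetical helper, not a built-in.
(define (latin-1-downcase c)
  (let ((n (char->integer c)))
    (cond
      ;; ASCII A-Z
      ((and (>= n 65) (<= n 90))
       (integer->char (+ n 32)))
      ;; Latin-1 capitals 192-222, skipping 215 (multiplication sign)
      ((and (>= n 192) (<= n 222) (not (= n 215)))
       (integer->char (+ n 32)))
      ;; everything else (digits, lowercase, 223 = sharp s, ...) is unchanged
      (else c))))
```

With this, (latin-1-downcase (integer->char 196)) yields the character
with code 228, which is ä in Latin-1. Text that is still in MacRoman
would first have to be mapped to Latin-1, for example with the table
mentioned above.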
> 2.) Is there a simple way to convert a high character range to some
> reasonable lexicographic mapping into ASCII? Examples: Ä=>A, ü=>u. Or
> do I need to build substitution tables?
I can't think of any existing function that would perform that
conversion.
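
A substitution table does not have to be large, though. Here is a
minimal sketch, assuming Latin-1 character codes; the table contents and
the names `ascii-fold-table', `char-ascii-fold' and `string-ascii-fold'
are hypothetical, and only the German umlauts are listed.

```scheme
;; Sketch only: hypothetical table from Latin-1 codes to ASCII chars.
;; Extend it with whatever other characters the documents contain.
(define ascii-fold-table
  '((196 . #\A) (214 . #\O) (220 . #\U)    ; A-, O-, U-umlaut
    (228 . #\a) (246 . #\o) (252 . #\u)))  ; a-, o-, u-umlaut

;; Map one character; characters not in the table pass through unchanged.
(define (char-ascii-fold c)
  (let ((entry (assv (char->integer c) ascii-fold-table)))
    (if entry (cdr entry) c)))

;; Map every character of a string, producing a new string.
(define (string-ascii-fold s)
  (list->string (map char-ascii-fold (string->list s))))
```

A one-character-to-one-character table like this cannot express
replacements such as sharp s => "ss"; for those, the table would have to
hold strings rather than characters.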
Matthew