[plt-scheme] mzchar and wchar_t

From: Jim Blandy (jimb at red-bean.com)
Date: Tue Mar 14 18:55:54 EST 2006

On 3/14/06, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> At Tue, 14 Mar 2006 12:36:05 +0100, Jean-Guillaume wrote:
> > Is there a simple relation between mzchar and wchar_t (and wint_t)
> > types ?
>
> If you have a wchar_t whose value is in [0, #xD7FF] or [#xE000,
> #xFFFF], then you can use it as a mzchar and vice-versa.
>
> More generally, a mzchar is a 4-byte value that is a Unicode scalar
> value, and a wchar_t is a 2-byte value that is potentially a surrogate.
> You can convert between strings of each type of character by UTF-16
> encoding/decoding, perhaps using scheme_utf8_encode() and
> scheme_utf8_decode() (which, despite the names, support a UTF-16 mode).
> Beware of decoding a wchar_t string that may have unpaired surrogates.

So MzScheme assumes that wchar_t is a UTF-16 code unit?
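
To make the question concrete, here is a minimal sketch in C of the conversion described in the quoted text, written under the assumption that the input really is UTF-16. It does not use MzScheme's API: the helper names is_bmp_scalar and utf16_to_scalars are hypothetical, and uint16_t / uint32_t stand in for a 2-byte wchar_t and a 4-byte mzchar, since the width of wchar_t itself varies by platform. Unpaired surrogates are rejected rather than passed through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of MzScheme. */

/* Nonzero if v is in [0, 0xD7FF] or [0xE000, 0xFFFF], i.e. the same
   number is both a single UTF-16 code unit and a Unicode scalar value. */
static int is_bmp_scalar(uint32_t v)
{
    return v <= 0xD7FF || (v >= 0xE000 && v <= 0xFFFF);
}

/* Decode n UTF-16 code units into scalar values.  `out` must have room
   for n entries.  Returns the number of scalars written, or -1 if the
   input contains an unpaired surrogate. */
static long utf16_to_scalars(const uint16_t *in, size_t n, uint32_t *out)
{
    size_t i = 0;
    long count = 0;

    assert(in != NULL && out != NULL);

    while (i < n) {
        uint32_t u = in[i++];

        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            if (i >= n || in[i] < 0xDC00 || in[i] > 0xDFFF)
                return -1;
            u = 0x10000 + ((u - 0xD800) << 10) + (uint32_t)(in[i++] - 0xDC00);
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            /* Low surrogate with no preceding high surrogate. */
            return -1;
        }
        out[count++] = u;
    }
    return count;
}
```

On a platform where wchar_t is already 4 bytes and holds scalar values directly, no such decoding step would be needed, which is why the assumption matters.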


Posted on the users mailing list.