[plt-scheme] mzchar and wchar_t
On 3/14/06, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> At Tue, 14 Mar 2006 12:36:05 +0100, Jean-Guillaume wrote:
> > Is there a simple relation between mzchar and wchar_t (and wint_t)
> > types ?
>
> If you have a wchar_t whose value is in [0, #xD7FF] or [#xE000,
> #xFFFF], then you can use it as a mzchar and vice-versa.
>
> More generally, a mzchar is a 4-byte value that is a Unicode scalar
> value, and a wchar_t is a 2-byte value that is potentially a surrogate.
> You can convert between strings of each type of character by UTF-16
> encoding/decoding, perhaps using scheme_utf8_encode() and
> scheme_utf8_decode() (which, despite the names, support a UTF-16 mode).
> Beware of decoding a wchar_t string that may have unpaired surrogates.
So MzScheme assumes that wchar_t is a UTF-16 code unit?
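
To check my understanding of the ranges and the surrogate caveat above, here is a minimal sketch in plain C. It does not call scheme_utf8_decode() or any other MzScheme function. The names utf16_unit, scalar32, is_direct() and utf16_to_scalars() are my own hypothetical stand-ins (utf16_unit for a 2-byte wchar_t, scalar32 for a 4-byte mzchar), and returning -1 on an unpaired surrogate is just one possible policy, not necessarily what MzScheme does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not MzScheme types: utf16_unit plays the
   role of a 2-byte wchar_t, scalar32 the role of a 4-byte mzchar. */
typedef uint16_t utf16_unit;
typedef uint32_t scalar32;

/* True if the code unit is not a surrogate, i.e. it lies in
   [0, 0xD7FF] or [0xE000, 0xFFFF] and can be copied unchanged. */
static int is_direct(utf16_unit u) {
  return u < 0xD800 || u > 0xDFFF;
}

/* Decode n UTF-16 code units from src into scalar values in dst.
   dst must have room for at least n entries.
   Returns the number of scalars written, or -1 if an unpaired
   surrogate is found (an assumed policy for this sketch). */
static long utf16_to_scalars(const utf16_unit *src, size_t n, scalar32 *dst) {
  size_t i = 0;
  long out = 0;

  assert(src != NULL && dst != NULL);

  while (i < n) {
    utf16_unit u = src[i++];
    if (is_direct(u)) {
      /* Non-surrogate unit: same value in both representations. */
      dst[out++] = u;
    } else if (u <= 0xDBFF && i < n
               && src[i] >= 0xDC00 && src[i] <= 0xDFFF) {
      /* High surrogate followed by a low surrogate: combine them. */
      dst[out++] = 0x10000
                   + ((scalar32)(u - 0xD800) << 10)
                   + (scalar32)(src[i] - 0xDC00);
      i++;
    } else {
      /* Lone low surrogate, or high surrogate without a partner. */
      return -1;
    }
  }
  return out;
}
```

With this, the four units 0x0041, 0xD83D, 0xDE00, 0xE000 decode to three scalars (0x41, 0x1F600, 0xE000), while 0xD83D followed by 0x0041 is rejected. None of it applies as written on a platform where wchar_t is 4 bytes, which is why I ask.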