[plt-scheme] Why do MzScheme ports not respect the locale's encoding by default?

From: Alex Shinn (foof at synthcode.com)
Date: Mon Feb 14 22:02:45 EST 2005

At 14 Feb 2005 10:13:39 -0500, Jim Blandy wrote:
> 
> The MzScheme manual, section 1.2.2, "Locale", says that by default,
> input and output ports don't translate between Unicode, used
> internally by MzScheme, and the current locale's encoding.
> 
> Why is this?  After all, the locale is chosen by the user.  The system
> administrator can set a default, but the user can override that.  It
> seems as if MzScheme is ignoring the user's stated preference.

Note that the Java-based Schemes all convert from the user's locale
encoding, so it may be worth looking at how they deal with any
expected problems.

Gambit instead uses a global setting, specified at compile or start
time, for the encoding that file ports use, and a separate setting for
terminal ports, but it supports only a few encodings (Unicode-based
ones and Latin-1).

Gauche takes the same approach as PLT, requiring explicit conversions.

To me the most undesirable consequence of automatic conversion is that
the same script could behave differently, or break, in other locales.
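
As a rough illustration of that concern (in Python rather than Scheme,
purely as a sketch; the helper name decode_as and the sample bytes are
made up for this example), the same bytes read under two different
locale encodings give the script two different strings:

```python
# Hypothetical illustration: the same file contents, decoded the way a
# port would decode them if it silently followed the user's locale.
DATA = b"caf\xc3\xa9"  # the UTF-8 encoding of "caf" + e-acute


def decode_as(data: bytes, locale_encoding: str) -> str:
    """Decode data as a port following locale_encoding would."""
    return data.decode(locale_encoding)


utf8_view = decode_as(DATA, "utf-8")      # 4 characters
latin1_view = decode_as(DATA, "latin-1")  # 5 characters, mojibake

# A script that counts or compares characters now gives different
# answers depending only on the environment it was started in.
print(len(utf8_view), len(latin1_view))
```

With explicit conversions the script names the encoding itself, so its
result does not depend on who runs it.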

> - Sometimes you need to mix byte and character operations.
>   Translating by default makes that impossible.

This seems like an orthogonal issue.  Have you looked at SRFI-56?
Assuming you allow mixed byte and character operations at all, you
should be able to send raw bytes regardless of the port's character
encoding.
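
A minimal sketch of what I mean, again in Python and not any Scheme's
actual port API (the function name mixed_write is hypothetical): a
text layer with a character encoding sits on top of a byte stream,
and raw bytes can still be sent through the same underlying stream.

```python
import io


def mixed_write() -> bytes:
    """Write one character, then raw bytes, through one byte stream."""
    raw = io.BytesIO()  # the underlying byte port
    text = io.TextIOWrapper(raw, encoding="latin-1", newline="\n")

    text.write("\u00e9")    # character operation: encoded as Latin-1
    text.flush()            # push the encoded bytes down before mixing
    raw.write(b"\xff\x00")  # byte operation: bypasses the encoding

    result = raw.getvalue()
    text.detach()           # release raw so the wrapper does not close it
    return result


print(mixed_write())
```

The only care needed is flushing the character layer before writing
bytes, so that the two kinds of output appear in the order written.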

-- 
Alex



Posted on the users mailing list.