[plt-scheme] Why do MzScheme ports not respect the locale's encoding by default?

From: Michael Sperber (sperber at informatik.uni-tuebingen.de)
Date: Mon Feb 28 10:44:54 EST 2005

>>>>> "Jim" == Jim Blandy <jimb at redhat.com> writes:

Jim> My claim is that it's impossible to precisely specify the behavior of
Jim> mixed byte and character reads on a port if the character encoding
Jim> doesn't have some restrictions imposed on it.  It can't be left
Jim> completely unspecified. 

Sure, it needs to be specified---but I don't think it needs to be
*restricted* in unreasonable ways.  Somebody needs to sit down and say
*per encoding* (or per encoding conversion) what bytes a READ-CHAR
will remove from the port.  This happens to be easy for the various
Unicode encodings, and that's what should guide the design.
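
To make the "per encoding" point concrete, here is a minimal sketch of
what such a specification could look like for UTF-8.  It is in Python
rather than Scheme, and the names (utf8_char_length, read_byte,
read_char) are hypothetical: this is not MzScheme's port API or
anything SRFI-56 defines, just an illustration under the assumption
that the port is a plain byte stream holding well-formed UTF-8.  The
lead byte alone determines how many bytes a character read removes, so
byte reads and character reads can be mixed freely.

```python
import io


def utf8_char_length(lead):
    """Return the length in bytes of the UTF-8 sequence starting with `lead`."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    raise ValueError("invalid UTF-8 lead byte: %#04x" % lead)


def read_byte(port):
    """Remove exactly one byte from the port; return None at end of input."""
    data = port.read(1)
    return data[0] if data else None


def read_char(port):
    """Remove exactly one UTF-8 encoded character from the port.

    The number of bytes consumed is fixed by the lead byte, which is the
    "what bytes does READ-CHAR remove" rule for this one encoding.
    """
    lead = port.read(1)
    if not lead:
        return None
    rest_len = utf8_char_length(lead[0]) - 1
    rest = port.read(rest_len)
    if len(rest) != rest_len:
        raise ValueError("truncated UTF-8 sequence")
    return (lead + rest).decode("utf-8")


# Hypothetical mixed data: a length byte, three characters, a trailing byte.
port = io.BytesIO(bytes([3]) + "V\u00f6l".encode("utf-8") + b"\xff")
count = read_byte(port)                          # byte read
text = "".join(read_char(port) for _ in range(count))  # character reads
trailer = read_byte(port)                        # byte read again
```

A stateful encoding would need a different rule here, which is exactly
the part that has to be written down encoding by encoding.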

Jim> 2) Amend SRFI-56 to restrict ports to be either char-only or
Jim>    byte-only.

That just seems totally unacceptable to me---there are so many
applications where byte and character data are interleaved, and where
there aren't any semantic issues.  (The PLT media editors are a
specific example.)  The mere existence of all the SHIFT-JIS crap (which
a lot of people, including a lot in the multi-language encoding
business, don't care about at all) shouldn't make things hard for
everyone.

Cheers =8-} Mike
Peace, international understanding, and blah blah in general

Posted on the users mailing list.