[plt-scheme] 299.7

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Wed May 12 12:17:49 EDT 2004

At Wed, 12 May 2004 17:06:58 +0400, Alex Ott wrote:
> What do you think about C++ approach to encoding and locale conversion for
> ports and strings?

If I understand C++'s approach to ports/streams correctly, the encoding
between strings and bytes is a (mutable) property of the stream object.

In MzScheme 299, functions like `read-char' and `write-char' assume a
UTF-8 decoding of the stream. To get a different kind of decoding, you
wrap the port to convert between UTF-8 and the other encoding. (In the
near future, `(lib "port.ss")' will provide a port->port wrapper.)

For obvious reasons, I would be opposed to a mutable property on ports
to specify the encoding. I did consider making a port's encoding a
property that is specified when the port is created, but that's too
early in many cases; it leads to exactly the mess of having to
distinguish text and binary streams.

To put it another way, MzScheme's strategy is more like Java's, where a
basic stream works in terms of bytes, and conversions are implemented
by wrappers on basic streams. Technically, MzScheme's protocol can
introduce wasted work (e.g., an encode followed by an immediate
decode), but the advantage is that there's only one kind of port. We
don't have to add `current-byte-input-port', `open-input-byte-port',
etc.
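
For readers who know the Java model better, here is a minimal sketch of
the layering described above, written in Java rather than MzScheme. It is
not MzScheme's API: the class and method names (`PortWrapperSketch`,
`latin1ToUtf8`, `readAllCharsUtf8`) are hypothetical, and the wrapper
buffers the whole input eagerly for brevity, where a real wrapper would
convert incrementally. It shows a character reader that always assumes
UTF-8, and a byte-to-byte wrapper that makes Latin-1 input readable by it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class PortWrapperSketch {
    // Hypothetical byte-to-byte wrapper: takes a stream of Latin-1 bytes
    // and returns a stream of the same text encoded as UTF-8.
    // Assumption: the input is small enough to buffer in memory.
    static InputStream latin1ToUtf8(InputStream in) throws IOException {
        String text = new String(in.readAllBytes(), StandardCharsets.ISO_8859_1);
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    // Character-level reading that always decodes UTF-8, playing the
    // role that `read-char' plays in the message above.
    static String readAllCharsUtf8(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "caf" followed by e-acute, encoded as Latin-1 (one byte per char).
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};

        // Wrapped: Latin-1 bytes -> UTF-8 bytes -> characters.
        InputStream wrapped = latin1ToUtf8(new ByteArrayInputStream(latin1));
        String text = readAllCharsUtf8(wrapped);
        System.out.println(text.length() + " characters read through the wrapper");
    }
}
```

The wrapped path also illustrates the wasted work mentioned above: the
text is encoded to UTF-8 bytes only to be decoded again immediately. In
exchange, there is a single kind of stream and a single reading function.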

Matthew



Posted on the users mailing list.