[plt-scheme] Unicode strings in mzscheme

From: Richard Cobbe (cobbe at ccs.neu.edu)
Date: Sun Apr 22 14:47:09 EDT 2007

On Sun, Apr 22, 2007 at 06:17:19PM +0100, Ian Oversby wrote:
>  Hi,
>
>  This is DrScheme:
>
>  (display "más")
>  (newline)
>
>  (string-ref "más" 1)
>
>  And the result:
>
>  Welcome to DrScheme, version 360.
>  Language: Pretty Big (includes MrEd and Advanced Student).
>  más
>  #\á
>
>  And this is MzScheme:
>
>  C:\>mzscheme
>  Welcome to MzScheme version 360, Copyright (c) 2004-2006 PLT Scheme Inc.
> > (string-ref "más" 1)
>  #\?

DISCLAIMER: I'm not an expert in any of this stuff, so what follows may be
misleading or even completely incorrect.  But, because I'd like to
understand it better, I'll throw out my thoughts.

I suspect that this is an issue not with PLT's support for Unicode strings,
but rather with Unicode I/O.  In the case of mzscheme, AFAICT, console I/O
relies heavily on the Unicode capabilities of the console in which MzScheme
is running.  (DrScheme Just Works because it effectively provides its own
console.)
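
One way to check this without trusting the console at all is to look at
the character's code point, which prints using only ASCII digits.  A
minimal sketch, assuming a 3xx-series MzScheme whose reader accepts \u
escapes in string literals (the escape is there so that the console's
input encoding stays out of the picture too):

```scheme
;; U+00E1 is LATIN SMALL LETTER A WITH ACUTE.  Writing it as an escape
;; means no non-ASCII character has to pass through the console on input.
(define s "m\u00E1s")

;; The string holds three characters, regardless of how many bytes the
;; console would need to show them.
(string-length s)                 ; expect 3

;; The code point of the middle character, printed in plain ASCII.
(char->integer (string-ref s 1))  ; expect 225, i.e. #xE1
```

If those come back as 3 and 225, the string itself is fine and only its
display is being mangled.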

For instance, if I run MzScheme in a vanilla xterm on my OS X 10.4.9
machine, I get the MzScheme results you described above.  If, however, I
run in a uxterm, I get the desired results: MzScheme prints the result of
the string-ref application as #\á.  (A uxterm is just like an xterm, except
that it properly configures the various locale environment variables and
possibly fonts such that it can handle Unicode chars, rather than vanilla
xterm's plain ISO-8859-1.)

Unfortunately, I can't translate this to Windows.  It wouldn't surprise me
to learn that cmd.exe, or the graphical window that sits on top of that,
can't handle Unicode I/O.  But I don't know how to get a terminal that
does.

You can test my theories by having MzScheme write the string to a file,
rather than to the console, and then opening that file in a UTF-8-capable
editor.  If the á looks right there, the string is fine and the console is
at fault.
(I'm sorry I didn't test this before posting.  I'd do it myself, but I
suspect you'd have a much easier time of it than I would, since you
presumably know the right tools to use.)
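
The file test might look something like the following.  This is an
untested sketch under a few assumptions: the file name is made up,
open-output-file with no mode argument should raise an error if the file
already exists (so pick an unused name), and, as far as I know, MzScheme
encodes characters written to a port as UTF-8.

```scheme
;; "unicode-test.txt" is a hypothetical file name; any unused name will do.
(define out (open-output-file "unicode-test.txt"))

;; Write the string and the character in question to the file instead of
;; the console.  The \u escape avoids typing a non-ASCII character.
(display "m\u00E1s" out)
(newline out)
(write (string-ref "m\u00E1s" 1) out)
(newline out)
(close-output-port out)

;; For comparison, the UTF-8 encoding of the string as a list of byte
;; values, which prints in plain ASCII on any console.
(bytes->list (string->bytes/utf-8 "m\u00E1s"))  ; expect (109 195 161 115)
```

If the editor shows más and #\á, then the problem is in the console and
not in MzScheme's strings.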

HTH,

Richard


Posted on the users mailing list.