[plt-scheme] Seeing some weird endianness issues on Solaris x86 platform

From: Danny Yoo (dyoo at hkn.eecs.berkeley.edu)
Date: Fri Nov 18 03:13:13 EST 2005

Hi everyone,


I just got a shiny new Sun Ultra 20 workstation.  Unfortunately, since
it's running Solaris, it's useless because it doesn't have PLT Scheme
installed on it.  I'm trying to fix that.  *grin*


The machine is an x86 Opteron box running Solaris 10.  I'm running into
what looks like an encoding issue with the functions that convert between
paths and strings:

;;;;;;
> (path->string (string->path "hello"))
"\U3F000000\U3F000000\U3F000000\U3F000000\U3F000000"
> (path->bytes (bytes->path (string->bytes/utf-8 "hello")))
#"hello"
;;;;;;

Each code point carries its payload in the high byte (0x3F, with the other
three bytes zero), which smells like a byte-order problem, but I don't yet
know where it comes from.
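
For what it's worth, 0x3F is the ASCII code for '?', so one possible
reading (a guess, not something confirmed from the mzscheme source) is
that each character came back as a '?' stored with its four bytes
reversed.  A minimal C sketch of that check; swap32 is a hypothetical
helper, not a function from mzscheme:

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value.
   Hypothetical helper for illustration; not part of mzscheme. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}
```

With this, swap32(0x3F000000) evaluates to 0x3F, which is '?' in ASCII.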


If I just munge up current-locale to #f, then everything is happy:

######
> (current-locale #f)
> (path->string (string->path "hello"))
"hello"
######



I've been reading the mzscheme source code, and I think that it has
something to do with the way locales are handled on my system, though I
haven't been able to pinpoint it yet.  If I kludge file.c slightly:

Index: src/file.c
===================================================================
--- src/file.c	(revision 1343)
+++ src/file.c	(working copy)
@@ -598,7 +598,7 @@
   }
 #endif

-  s = scheme_byte_string_to_char_string_locale(p);
+  s = scheme_byte_string_to_char_string(p);

   if (!SCHEME_CHAR_STRLEN_VAL(s))
     return scheme_make_utf8_string("?");


then things work a little better --- I get the good result from the
experiment with string->path + path->string --- but I know this is the
wrong way to fix this.


My current locale is set to the system default:

######
> (current-locale)
""
######

So I think there must be some asymmetry between
scheme_byte_string_to_char_string_locale and
scheme_char_string_to_byte_string_locale on my system, but my brain is a
little sleep-deprived; I'll stop for the moment and look at this tomorrow.
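
One way to look for such an asymmetry outside of mzscheme is to round-trip
a string through the C library's own locale-dependent conversions.  The
sketch below assumes that the standard mbstowcs and wcstombs are a
reasonable stand-in; mzscheme's locale functions may well use a different
mechanism (this post does not say), and locale_roundtrip_ok is a
hypothetical helper name.  Note also that the encoding of wchar_t is
implementation-defined, so this only tests whether the two directions
agree with each other.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Round-trip `text` through the C library's locale-dependent conversions
   under the locale `name` ("" selects the locale from the environment).
   Returns 1 if the bytes come back unchanged, 0 otherwise (including
   when the locale is unavailable or the input exceeds 255 characters).
   Hypothetical helper for illustration; not part of mzscheme. */
int locale_roundtrip_ok(const char *name, const char *text)
{
    wchar_t wide[256];
    char back[1024];
    size_t n, m;

    if (!setlocale(LC_CTYPE, name))
        return 0;                    /* locale not available */

    /* bytes -> wide characters */
    n = mbstowcs(wide, text, 255);
    if (n == (size_t)-1)
        return 0;                    /* invalid multibyte sequence */
    wide[n] = L'\0';                 /* not terminated if the buffer filled */

    /* wide characters -> bytes */
    m = wcstombs(back, wide, sizeof back - 1);
    if (m == (size_t)-1)
        return 0;                    /* character not representable */
    back[m] = '\0';

    return strcmp(text, back) == 0;
}
```

Calling locale_roundtrip_ok("", "hello") exercises the locale taken from
the environment, which is what the "" reported by (current-locale) above
refers to.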


Best of wishes!



Posted on the users mailing list.