[plt-scheme] strings?

From: Mike (mikee at mikee.ath.cx)
Date: Sat Apr 16 13:40:51 EDT 2005

On Sat, 16 Apr 2005, Eli Barzilay might have said:

> On Apr 16, Mike wrote:
> > On Fri, 15 Apr 2005, Matthew Flatt might have said:
> > 
> > > SCHEME_BYTE_STR_VAL() should be used only on a value for which
> > > SCHEME_BYTE_STRINGP() produces true.
> > > 
> > > SCHEME_CHAR_STR_VAL() should be used only on a value for which
> > > SCHEME_CHAR_STRINGP() produces true.
> > > 
> > > The result of SCHEME_CHAR_STR_VAL() is a mzchar*, so I can see why
> > > you're trying to use SCHEME_BYTE_STR_VAL() to get a char*, but it
> > > doesn't work. (The actual layout of byte string and char string values
> > > means that, on a little-endian machine, you end up with one byte when
> > > trying to use a char string as a byte string. But that's just an
> > > artifact of the current data layout.)
> > > 
> > > To turn a char string into a byte string, you can use
> > > scheme_char_string_to_byte_string() or
> > > scheme_char_string_to_byte_string_locale(), depending on whether you
> > > want a UTF-8 or locale-based encoding of the string.
> > 
> > Thanks for the explanation. I'm still lost.
> > I accept that mzscheme uses a multibyte representation internally,
> > and that the use of scheme strings works internally. I want to
> > extract a string from inside mzscheme and give that string
> > as a null-terminated string to a C function. That function
> > could be (and is) sqlite_open() or it could be fopen() for
> > opening a file for custom processing, etc.
> > 
> > How do I extract the string from mzscheme to give to the C function?
> 
> The internal representation is UCS-4: it's a Unicode encoding that
> uses 4 bytes for each character.  Most C library functions will expect
> a simple NUL-terminated string, either plain ASCII or using an
> encoding like UTF-8 or something based on your locale.  You need to
> somehow convert Scheme strings into a sequence of NUL-terminated
> bytes.  There are three options for that:
> 
> 1. MzScheme has a `bytes' (or byte strings) type which is similar to
>    plain C strings.  This is used in places where you want a simple
>    sequence of NUL-terminated characters -- it corresponds to a C
>    `char*'.  The syntax for these things on the Scheme side is #"blah
>    blah".  If you use these things from Scheme, then on the C side you
>    should use SCHEME_BYTE_STRINGP to test for these values and
>    SCHEME_BYTE_STR_VAL to extract the contained (NUL-terminated)
>    char*.
> 
> 2. If you don't want to deal with byte-strings on the Scheme side, you
>    can provide a Scheme interface that will do the conversion.  For
>    example, you implement a `foo-bytes' function in C, then you write
>    a Scheme wrapper function that looks like:
> 
>      (define (foo str)
>        (foo-bytes (string->bytes/utf-8 str)))
> 
>    This wrapper will convert the string to UTF-8 encoded bytes.
>    There are also `string->bytes/locale' and `string->bytes/latin-1'
>    for other encodings; if you only deal with ASCII, they all produce
>    the same result.
> 
> 3. If you want to do this conversion in C so you never deal with byte
>    strings in Scheme, then you should use the C functions that Matthew
>    pointed at: `scheme_char_string_to_byte_string' or
>    `scheme_char_string_to_byte_string_locale' which will convert a
>    Scheme string to a byte string (use SCHEME_BYTE_STR_VAL on the
>    result to get the NUL-terminated char*) -- they correspond to
>    `string->bytes/utf-8' and `string->bytes/locale'.

Thanks for the even more detailed explanation. I was finally able to
put some code together to get what I was looking for. The confusion
was in my understanding, not in the explanations.

What I finally have is:

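Scheme_Object **argv;
char *dbname, *msg;
Scheme_Object *so;

/* argv[0] is assumed to satisfy SCHEME_CHAR_STRINGP();
   convert it to a UTF-8 encoded byte string */
so = scheme_char_string_to_byte_string(argv[0]);
/* NUL-terminated char* held inside the byte string */
dbname = SCHEME_BYTE_STR_VAL(so);
(void) fprintf(stderr, "dbname='%s'\n", dbname);
(void) fflush(stderr);   /* flush the stream that was written to */
sqlite_open(dbname, 0, &msg);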
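
For anyone else who gets stuck on the same point, here is a minimal sketch in plain C of the two ideas discussed above: what goes wrong when a 4-byte-per-character buffer is read as a `char*`, and what a UCS-4 to UTF-8 conversion looks like. It does not use the MzScheme API at all. Both function names are hypothetical helpers made up for illustration, and this is not how scheme_char_string_to_byte_string() is actually implemented.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: length a C function would see if a UCS-4 buffer
   were passed as a plain char*.  Every valid code point has at least
   one zero byte among its 4 bytes, so strlen() stops inside the first
   character: after 1 byte on a little-endian machine for an ASCII
   character, after 0 bytes on a big-endian one. */
size_t strlen_if_misread(const uint32_t *src)
{
    return strlen((const char *)src);
}

/* Hypothetical helper: encode `len` UCS-4 code points as a malloc'ed,
   NUL-terminated UTF-8 string.  Returns NULL on allocation failure or
   on an invalid code point.  A code point of 0 inside the input would
   end the C string early. */
char *ucs4_to_utf8(const uint32_t *src, size_t len)
{
    /* Worst case is 4 bytes per code point, plus the terminating NUL. */
    unsigned char *out = malloc(len * 4 + 1);
    size_t i, n = 0;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        uint32_t c = src[i];

        if (c < 0x80) {                                 /* 1 byte (ASCII) */
            out[n++] = (unsigned char)c;
        } else if (c < 0x800) {                         /* 2 bytes */
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        } else if (c < 0x10000) {                       /* 3 bytes */
            if (c >= 0xD800 && c <= 0xDFFF) {           /* surrogate: invalid */
                free(out);
                return NULL;
            }
            out[n++] = (unsigned char)(0xE0 | (c >> 12));
            out[n++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        } else if (c <= 0x10FFFF) {                     /* 4 bytes */
            out[n++] = (unsigned char)(0xF0 | (c >> 18));
            out[n++] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
            out[n++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {                                        /* beyond Unicode */
            free(out);
            return NULL;
        }
    }
    out[n] = '\0';
    return (char *)out;
}
```

The result of ucs4_to_utf8() is the kind of string that fopen() or sqlite_open() can take directly, and the caller has to free() it. In a real extension the conversion should still go through scheme_char_string_to_byte_string(), as in the code above.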

Thanks everyone for all the help. Now on to my next problem. :)

Mike



Posted on the users mailing list.