[plt-scheme] strings?
On Sat, 16 Apr 2005, Eli Barzilay might have said:
> On Apr 16, Mike wrote:
> > On Fri, 15 Apr 2005, Matthew Flatt might have said:
> >
> > > SCHEME_BYTE_STR_VAL() should be used only on a value for which
> > > SCHEME_BYTE_STRINGP() produces true.
> > >
> > > SCHEME_CHAR_STR_VAL() should be used only on a value for which
> > > SCHEME_CHAR_STRINGP() produces true.
> > >
> > > The result of SCHEME_CHAR_STR_VAL() is a mzchar*, so I can see why
> > > you're trying to use SCHEME_BYTE_STR_VAL() to get a char*, but it
> > > doesn't work. (The actual layout of byte string and char string values
> > > means that, on a little-endian machine, you end up with one byte when
> > > trying to use a char string as a byte string. But that's just an
> > > artifact of the current data layout.)
> > >
> > > To turn a char string into a byte string, you can use
> > > scheme_char_string_to_byte_string() or
> > > scheme_char_string_to_byte_string_locale(), depending on whether you
> > > want a UTF-8 or locale-based encoding of the string.
> >
> > Thanks for the explanation. I'm still lost.
> > I accept that mzscheme uses a multibyte representation internally,
> > and that the use of scheme strings works internally. I want to
> > extract a string from inside mzscheme and give that string
> > as a null-terminated string to a C function. That function
> > could be (and is) sqlite_open() or it could be fopen() for
> > opening a file for custom processing, etc.
> >
> > How do I extract the string from mzscheme to give to the C function?
>
> The internal representation is UCS-4: it's a Unicode encoding that
> uses 4 bytes for each character. Most C library functions will expect
> a simple NUL-terminated string, either plain ASCII or using an
> encoding like UTF-8 or something based on your locale. You need to
> somehow convert Scheme strings into a sequence of NUL-terminated
> bytes. There are three options for that:
>
> 1. MzScheme has a `bytes' (or byte strings) type which is similar to
> plain C strings. This is used in places where you want a simple
> sequence of NUL-terminated characters -- it corresponds to a C
> `char*'. The syntax for these things on the Scheme side is #"blah
> blah". If you use these things from Scheme, then on the C side you
> should use SCHEME_BYTE_STRINGP to test for these values and
> SCHEME_BYTE_STR_VAL to extract the contained (NUL-terminated)
> char*.
>
> 2. If you don't want to deal with byte-strings on the Scheme side, you
> can provide a Scheme interface that will do the conversion. For
> example, you implement a `foo-bytes' function in C, then you write
> a Scheme wrapper function that looks like:
>
>     (define (foo str)
>       (foo-bytes (string->bytes/utf-8 str)))
>
> This wrapper converts the string to UTF-8 encoded bytes.
> There are also `string->bytes/locale' and `string->bytes/latin-1'
> for other encodings; if you only deal with ASCII, they all produce
> the same result.
>
> 3. If you want to do this conversion in C so you never deal with byte
> strings in Scheme, then you should use the C functions that Matthew
> pointed at: `scheme_char_string_to_byte_string' or
> `scheme_char_string_to_byte_string_locale' which will convert a
> Scheme string to a byte string (use SCHEME_BYTE_STR_VAL on the result
> to get the NUL-terminated char*) -- they correspond to
> `string->bytes/utf-8' and `string->bytes/locale'.
Thanks for the even more detailed explanation. I am finally able to
put some code together that does what I was looking for. The confusion
was my own misunderstanding and no fault of the explanations.
What I finally have is:
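    Scheme_Object **argv;   /* arguments passed to the primitive */
    char *dbname, *msg;
    Scheme_Object *so;

    /* argv[0] must be a char string (SCHEME_CHAR_STRINGP is true) */
    so = scheme_char_string_to_byte_string(argv[0]);  /* UTF-8 bytes */
    dbname = SCHEME_BYTE_STR_VAL(so);                 /* NUL-terminated */
    (void) fprintf(stderr, "dbname='%s'\n", dbname);
    (void) fflush(stderr);  /* flush the stream that was written to */
    sqlite_open(dbname, 0, &msg);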
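
For anyone wondering what the UTF-8 conversion amounts to, here is a rough sketch in plain C. It is not MzScheme's implementation. The helper name ucs4_to_utf8 and its signature are made up for illustration, and it assumes the input is a 0-terminated array of 32-bit code points, which is how I understand the UCS-4 layout described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of MzScheme): encode a 0-terminated
   UCS-4 string as a NUL-terminated UTF-8 string in `out`.
   Returns the number of bytes written, not counting the NUL, or
   (size_t)-1 if `out` is too small or a code point is invalid. */
size_t ucs4_to_utf8(const uint32_t *in, char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return (size_t)-1;

    for (; *in != 0; in++) {
        uint32_t c = *in;
        size_t len;

        /* Surrogates and values above U+10FFFF are not valid characters. */
        if ((c >= 0xD800 && c <= 0xDFFF) || c > 0x10FFFF)
            return (size_t)-1;

        len = (c < 0x80) ? 1 : (c < 0x800) ? 2 : (c < 0x10000) ? 3 : 4;

        /* Leave room for this character plus the terminating NUL. */
        if (n + len + 1 > out_size)
            return (size_t)-1;

        switch (len) {
        case 1:
            out[n++] = (char)c;
            break;
        case 2:
            out[n++] = (char)(0xC0 | (c >> 6));
            out[n++] = (char)(0x80 | (c & 0x3F));
            break;
        case 3:
            out[n++] = (char)(0xE0 | (c >> 12));
            out[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (c & 0x3F));
            break;
        default:
            out[n++] = (char)(0xF0 | (c >> 18));
            out[n++] = (char)(0x80 | ((c >> 12) & 0x3F));
            out[n++] = (char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (char)(0x80 | (c & 0x3F));
            break;
        }
    }

    out[n] = '\0';
    return n;
}
```

A plain ASCII name comes out one byte per character, which is why all the encodings agree for ASCII. Anything else becomes two to four bytes, so the result is what fopen() or sqlite_open() can take, while the raw 4-byte characters are not.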
Thanks everyone for all the help. Now on to my next problem. :)
Mike