[plt-scheme] Unicode on the cheap
Or possibly (string ...) could/should be extended to accept arbitrary
Scheme values and to produce a string by concatenating their string
equivalents; e.g.
(define s (string 'a "bc" (make-N-chinese-chars 3) 1 "2" (+ 1 2)))
s -> "abc???123"
Strings of more complex characters can then be thought of as concatenations
of characters, each of which is itself a short sequence of UTF-8 bytes.
And where:
(string-length s) -> 9 ; length in logical characters (Unicode code-points)
(string-UTF-8 s) -> 15 ; length in physical UTF-8 code-units (bytes)
; or maybe 16 if the terminating null byte is included.
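
A rough sketch of the idea, written for present-day Racket, where strings
are sequences of Unicode code points (which is not how `string' behaved in
the MzScheme under discussion). The names string*, string-UTF-8 and
make-N-chinese-chars are hypothetical, the choice of U+4E2D is arbitrary,
and "string equivalent" is assumed to mean whatever `display' would print:

```scheme
#lang racket/base

;; Hypothetical helper from the example: N copies of one CJK character.
;; U+4E2D is an arbitrary choice; it takes three bytes in UTF-8.
(define (make-N-chinese-chars n)
  (make-string n (integer->char #x4E2D)))

;; Hypothetical generalized constructor, named string* here so that the
;; built-in `string' is left alone. Each argument is rendered the way
;; `display' would print it, and the pieces are concatenated.
(define (string* . args)
  (apply string-append
         (map (lambda (v) (format "~a" v)) args)))

;; Hypothetical stand-in for the proposed string-UTF-8: the number of
;; bytes in the UTF-8 encoding, without a terminating null.
(define (string-UTF-8 s)
  (bytes-length (string->bytes/utf-8 s)))

(define s (string* 'a "bc" (make-N-chinese-chars 3) 1 "2" (+ 1 2)))

(string-length s) ; 9 code points
(string-UTF-8 s)  ; 15 bytes: 6 ASCII bytes + 3 characters * 3 bytes each
```

The 15 comes from six one-byte ASCII characters plus three characters of
three bytes each.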
-paul-
At Sun, 25 Jan 2004, Matthew Flatt wrote:
> At Sun, 25 Jan 2004 08:35:29 -0600, Robby Findler wrote:
>> If I were to put four of those chinese characters into a string (e.g. by
>> calling `string' with four arguments), why wouldn't the resulting
>> string have a `string-length' of four?
>
> If you create a string by calling `string' with four arguments, then
> `string-length' reports 4. Each of the four arguments to `string' is a
> "char" (and therefore a code unit or #\377 or #\376).
>
> But you can't create a string containing four Chinese characters by
> calling `string' with four arguments, because each Chinese character
> requires three "char"s.
>
> Matthew
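
To make the "three chars per Chinese character" point concrete, here is a
small sketch that looks at the UTF-8 bytes directly. It assumes present-day
Racket (code-point strings plus `string->bytes/utf-8'), not the byte-sized
"char" of the MzScheme described above, and U+4E2D is just an arbitrary
example character:

```scheme
#lang racket/base

;; One CJK character; U+4E2D is an arbitrary example.
(define ch (integer->char #x4E2D))

;; Its UTF-8 encoding, as a list of byte values.
(define encoded (bytes->list (string->bytes/utf-8 (string ch))))

encoded          ; (228 184 173), i.e. #xE4 #xB8 #xAD
(length encoded) ; 3

;; Four such characters therefore occupy twelve byte-sized units.
(bytes-length (string->bytes/utf-8 (make-string 4 ch))) ; 12
```

Under a one-byte-per-char model, then, a string of four such characters
would report a length of 12 rather than 4.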