[racket] get-resource question

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Mon Dec 24 08:03:22 EST 2012

This conversion function is almost what you want:

 ; convert : string -> string
 (define (convert s)
   (define c (bytes-open-converter "platform-UTF-16" "platform-UTF-8"))
   (define-values (new-s len state) (bytes-convert c (string->bytes/utf-8 s)))
   (unless (eq? state 'complete) (error "didn't decode all"))
   (bytes->string/utf-8 new-s))

Really, though, things have already gone wrong when `get-resource'
produces a string instead of bytes. To produce a string, `get-resource'
is decoding UTF-16 as UTF-8, and then `convert' above is re-encoding
with UTF-8 in the hope of getting the original byte string back; that
works only as long as the string sticks to ASCII.
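
The ASCII-only limitation can be seen without touching the registry. The
following is a minimal sketch, not what `get-resource' does internally:
`round-trip' is a hypothetical helper, the input bytes are hand-written
UTF-16LE, and replacing invalid sequences with U+FFFD is an assumption
about how a lossy decode might behave.

```racket
#lang racket/base

; round-trip : bytes -> bytes
; Decode the bytes as if they were UTF-8, then re-encode as UTF-8.
; Assumption: invalid sequences are replaced by U+FFFD during decoding.
(define (round-trip bs)
  (string->bytes/utf-8 (bytes->string/utf-8 bs #\uFFFD)))

(define ascii-utf16le  (bytes 68 0 58 0))  ; "D:" encoded as UTF-16LE
(define accent-utf16le (bytes #xE9 0))     ; U+00E9 encoded as UTF-16LE

; ASCII text survives the round trip; the accented character does not,
; because #xE9 followed by 0 is not a valid UTF-8 sequence.
(equal? (round-trip ascii-utf16le) ascii-utf16le)
(equal? (round-trip accent-utf16le) accent-utf16le)
```

The first comparison produces #t and the second #f, since the original
bytes cannot be recovered once the decode has substituted a character.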

You should supply `#:type 'bytes' to `get-resource', and then use this
variant of `convert', which expects a byte string instead of a string:

 ; bytes->string/utf-16 : bytes -> string
 (define (bytes->string/utf-16 s)
   (define c (bytes-open-converter "platform-UTF-16" "platform-UTF-8"))
   (define-values (new-s len state) (bytes-convert c s))
   (unless (eq? state 'complete) (error "didn't decode all"))
   (bytes->string/utf-8 new-s))
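
Putting the pieces together might look like the sketch below. The
registry section and entry names are made up for illustration, and
`strip-trailing-nuls' is a hypothetical helper: the string in your
message ends with NUL characters, so the decoded result probably will
too. Outside of Windows, `get-resource' returns #f and nothing is
printed.

```racket
#lang racket/base
(require file/resource)

; bytes->string/utf-16 : bytes -> string
(define (bytes->string/utf-16 s)
  (define c (bytes-open-converter "platform-UTF-16" "platform-UTF-8"))
  (define-values (new-s len state) (bytes-convert c s))
  (unless (eq? state 'complete) (error "didn't decode all"))
  (bytes->string/utf-8 new-s))

; strip-trailing-nuls : string -> string
; Hypothetical helper: drop any NUL terminators left at the end.
(define (strip-trailing-nuls str)
  (let loop ([end (string-length str)])
    (if (and (> end 0) (char=? (string-ref str (sub1 end)) #\nul))
        (loop (sub1 end))
        (substring str 0 end))))

; Hypothetical section and entry; substitute the real ones.
(define raw
  (get-resource "HKEY_CURRENT_USER" "Software\\Example\\LibPath"
                #:type 'bytes))

; `raw' is #f when the entry is missing or the platform is not Windows.
(when raw
  (printf "~a\n" (strip-trailing-nuls (bytes->string/utf-16 raw))))
```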


At Mon, 24 Dec 2012 13:41:53 +0100, heraklea at gmx.de wrote:
> Hello friends,
> 
> Getting a resource from the registry with the type REG_EXPAND_SZ gives me a
> Unicode string like this:
> "\"\u0000D\u0000:\u0000\\\u0000d\u0000e\u0000v\u00005\u00005\u0000\\\u0000p\u000
> 0r\u0000o\u0000j\u0000e\u0000c\u0000t\u0000s\u0000\\\u0000l\u0000i\u0000b\u00006
> \u00004\u0000\"\u0000\u0000\u0000"
> 
> (define erg (get-resource.....).
> 
> To convert to a readable string I use 
> (printf "~a" (string-replace erg "\u0000" ""))
> 
> Is there a more convenient function to transform this Unicode string, which is
> formatted with an empty space between every character, into a normal readable
> string?
> 
> 
> yours,

Posted on the users mailing list.