[plt-dev] symbol->string and mutability

From: Robby Findler (robby at eecs.northwestern.edu)
Date: Thu Jun 18 07:50:23 EDT 2009

Just curious, but why the different representations? Is it because you
never need to index into a symbol, so UTF-8's (usually) more compact
representation is a win, whereas for strings, where you do need to
index, UTF-32 is the right choice because indexing becomes a simple
computation (with no searching)?

Robby
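
[Editorial sketch of the trade-off asked about above. It only shows the
difference between indexing characters and indexing UTF-8 bytes using
standard racket/base functions; it does not confirm that this was the
design rationale. The sample text is an assumption chosen for
illustration.]

```racket
#lang racket/base

;; A string whose first character (lambda, U+03BB) needs two bytes in UTF-8.
(define s "\u03BB-calculus")
(define b (string->bytes/utf-8 s))

(string-length s)      ; 10 characters
(bytes-length b)       ; 11 bytes: the lambda occupies two of them

;; Character indexing on a string is a direct lookup.
(string-ref s 1)       ; #\-

;; Byte indexing into the UTF-8 form lands inside the lambda's encoding.
(bytes-ref b 1)        ; 187, the second byte of the lambda

;; Finding character 1 in UTF-8 means decoding from the start of the bytes.
(bytes-utf-8-ref b 1)  ; #\-
```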

On Thu, Jun 18, 2009 at 2:35 AM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> At Wed, 17 Jun 2009 20:28:10 -0400, Carl Eastlund wrote:
>> Why do symbol->string and keyword->string produce mutable strings?  In
>> so doing, they have to allocate a new string every time.  Is there any
>> way to get at an immutable string that is not allocated more than
>> once?  I would prefer that this be the default behavior; R6RS already
>> specifies that symbol->string produces an immutable string, for
>> instance.
>
> Symbols and keywords are represented internally in UTF-8, while strings
> are represented internally as UTF-32. So, there's not an obvious way to
> have `symbol->string' avoid allocation, except by either caching a
> string reference in the symbol (probably not worth the extra space,
> since most symbols are never converted) or keeping a symbol-to-string
> mapping in a hash table (which any programmer can do externally).
>
> I think it would be a good idea to switch to an immutable-string
> result, but considering potential incompatibility, it has never seemed
> worthwhile in the short run.
>
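
[Editorial sketch of the external symbol-to-string table mentioned in the
quoted message. The names `symbol-string-cache` and
`cached-symbol->string` are hypothetical and not part of any library;
the choice of a weak, eq-keyed table is an assumption, made so that
cache entries do not keep otherwise unused symbols alive.]

```racket
#lang racket/base

;; Hypothetical cache: maps each symbol to a single immutable string.
;; Keys are held weakly and compared with eq?.
(define symbol-string-cache (make-weak-hasheq))

;; Hypothetical helper: converts a symbol to an immutable string,
;; allocating only on the first request for that symbol.
(define (cached-symbol->string sym)
  (hash-ref! symbol-string-cache
             sym
             (lambda ()
               (string->immutable-string (symbol->string sym)))))

;; Repeated calls return the very same string object.
(eq? (cached-symbol->string 'apple)
     (cached-symbol->string 'apple))        ; #t
(immutable? (cached-symbol->string 'apple)) ; #t
```

The same pattern would apply to keywords by swapping in
`keyword->string`. The cost is one table lookup per call, in exchange
for not allocating a fresh string each time.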


Posted on the dev mailing list.