[plt-dev] symbol->string and mutability

From: Matthew Flatt (mflatt at cs.utah.edu)
Date: Thu Jun 18 08:30:47 EDT 2009

Yes, exactly.
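
To illustrate the trade-off confirmed here, a minimal sketch in current
Racket follows. It is an illustration only: the runtime's internal
representations are not observable from user code, so a UTF-8 byte string
stands in for the symbol representation, and the names `s` and `b` are
made up for this example.

```racket
#lang racket/base

;; A string holds one fixed-width slot per character, so indexing is a
;; simple offset computation.
(define s "λ-calculus")

;; Stand-in for the symbol representation (an assumption for illustration):
;; the same text encoded as UTF-8, where λ occupies two bytes.
(define b (string->bytes/utf-8 s))

(define char-count (string-length s))    ; 10 characters
(define byte-count (bytes-length b))     ; 11 bytes: more compact than 4 bytes/char

;; Constant-time access into the string.
(define second-char (string-ref s 1))

;; Reaching the same character in UTF-8 means decoding from the start,
;; because earlier characters may be more than one byte wide.
(define second-char/utf-8 (bytes-utf-8-ref b 1))
```

Both lookups yield the same character; only the cost of finding it
differs.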

At Thu, 18 Jun 2009 06:50:23 -0500, Robby Findler wrote:
> Just curious, but why the different representations? Is it because you
> don't need to be able to index into a symbol, and thus UTF-8's
> (usually) more compact representation is a win, but for strings, where
> you do need to index into them, a simple computation (and avoiding
> searching?) makes UTF-32 the right choice?
> 
> Robby
> 
> On Thu, Jun 18, 2009 at 2:35 AM, Matthew Flatt <mflatt at cs.utah.edu> wrote:
> > At Wed, 17 Jun 2009 20:28:10 -0400, Carl Eastlund wrote:
> >> Why do symbol->string and keyword->string produce mutable strings?  In
> >> so doing, they have to allocate a new string every time.  Is there any
> >> way to get at an immutable string that is not allocated more than
> >> once?  I would prefer that this be the default behavior; R6RS already
> >> specifies that symbol->string produces an immutable string, for
> >> instance.
> >
> > Symbols and keywords are represented internally in UTF-8, while strings
> > are represented internally as UTF-32. So, there's not an obvious way to
> > have `symbol->string' avoid allocation, except by either caching a
> > string reference in the symbol (probably not worth the extra space,
> > since most symbols are never converted) or keeping a symbol-to-string
> > mapping in a hash table (which any programmer can do externally).
> >
> > I think it would be a good idea to switch to an immutable-string
> > result, but considering potential incompatibility, it has never seemed
> > worthwhile in the short run.
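
The external symbol-to-string mapping mentioned in the quoted reply could
be sketched roughly as below. This is one possible approach under stated
assumptions, not code from the thread: `symbol-strings` and
`cached-symbol->string` are hypothetical names, and the weak `eq?`-based
table is a choice made here so that cached entries do not keep otherwise
unreferenced symbols alive.

```racket
#lang racket/base

;; Hypothetical cache, keyed by symbol identity. Keys are held weakly.
(define symbol-strings (make-weak-hasheq))

;; Hypothetical helper: returns an immutable string for `sym`, allocating
;; it only on the first request for that symbol.
(define (cached-symbol->string sym)
  (or (hash-ref symbol-strings sym #f)
      (let ([str (string->immutable-string (symbol->string sym))])
        (hash-set! symbol-strings sym str)
        str)))
```

With this helper, repeated calls such as (cached-symbol->string 'apple)
return the same immutable string object rather than a fresh mutable copy,
at the cost of one table lookup per call.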


Posted on the dev mailing list.