[plt-scheme] immutable strings vs. uninterned symbols
On Jun 6, Doug Orleans wrote:
> Matthew Flatt writes:
> > At Tue, 6 Jun 2006 08:52:07 -0400, Doug Orleans wrote:
> > > What's the difference between immutable strings and uninterned
> > > symbols?
> >
> > Besides the printing and reading conventions, immutable strings support
> > `string-ref' to access individual characters.
> >
> > Matthias points out that strings support `string-append', too.
>
> As Carl pointed out, this is just a matter of library support.
>
> You can easily make symbol-ref and symbol-append. But as you both
> point out, the real question is performance: string-ref is constant
> in time and space, but I think symbol-ref is linear in both, since
> symbol->string has to copy the whole string. (Would it be possible to
> make a symbol->immutable-string that was constant time?)

Much more, IMO. It's the concept of a different type for different
uses. Otherwise you would feel just as comfortable in a world that uses
numbers/strings/Church encodings/Gödel numbers for everything.
Sure, you have enough conversion back doors that you can write a `+'
that adds two strings that represent numbers -- but do you want to?
The fact that you need to do some extra work is like a little red
light bulb telling you that something is wrong.

In the same way you can write your own symbol-append, symbol-ref,
symbol->number, symbol<?, subsymbol, regexp-symbol-match,
open-output-symbol, read-symbol-avail!, etc, then double the whole
thing for uninterned-symbols. But the *real* real question is do you
want to? (Not performance at all -- the question stands even if
symbol->string is an O(1) operation.)
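
For concreteness, a minimal sketch of two entries from that imaginary
library. The names symbol-ref and symbol-append are hypothetical
(taken from the list above, not from any standard library), and both
definitions assume the only way in is through symbol->string, so every
call pays for a copy of the symbol's name:

```scheme
;; Hypothetical helpers, not part of any standard library.
;; Each one converts to a string first, so each call copies the name.
(define (symbol-ref sym k)
  (string-ref (symbol->string sym) k))

(define (symbol-append . syms)
  ;; string->symbol yields an interned symbol.
  (string->symbol
   (apply string-append (map symbol->string syms))))

(symbol-ref 'hello 1)       ; => #\e
(symbol-append 'foo 'bar)   ; => foobar
```

Doubling this for uninterned symbols would mean a second symbol-append
that ends in string->uninterned-symbol instead of string->symbol.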

There are certain features of numbers and strings that make them a
useful representation of numbers and strings. In the case of symbols,
it's the lack of features that makes them useful. Every time you use a
symbol function from the above imaginary library, it means that you
probably should just use strings...

Sometime in the last year I had an exam question to turn a list of 'a
and 'd symbols into a c---r function -- some people tried to
concatenate the symbol 'c, the list, and the symbol 'r, and use the
result as a function. I later verified that these people had problems
distinguishing values and names, and that was made worse by an
illusion that a symbol is a kind of a string. My guess is that using
strings instead would have made it much more confusing.
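
One possible reading of that exam question, as a sketch only: the name
make-cxr is hypothetical, and it assumes the intended answer composes
car and cdr as values rather than building a name out of the symbols.

```scheme
;; Hypothetical name; turns a list of 'a / 'd symbols into a function.
;; As in cadr, the rightmost letter is applied first.
(define (make-cxr letters)
  (if (null? letters)
      (lambda (x) x)
      (let ((op (case (car letters)
                  ((a) car)
                  ((d) cdr)
                  (else (error "make-cxr: expected a or d"))))
            (inner (make-cxr (cdr letters))))
        (lambda (x) (op (inner x))))))

((make-cxr '(a d)) '(1 2 3))   ; => 2, same as (cadr '(1 2 3))
```

The symbols only select which procedure to use; nothing is
concatenated and no name is ever looked up.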
--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://www.barzilay.org/ Maze is Life!