[plt-scheme] Re: case-sensitive reader by default

From: Eli Barzilay (eli at barzilay.org)
Date: Tue Apr 27 22:02:05 EDT 2004

On Apr 27, Felix Klock's PLT scheme proxy wrote:
> This made me think...
> 
> 1. I wrote down why I thought case-insensitivity was useful for my own 
> idiosyncratic purposes when programming in Scheme.

I must mention the fact that I started like that in the past, using
this style (in CL):

  (defun FOO (x y z)
    ...)
  
  (defun BAR (x y z)
    ... (foo ...) ...)

I now find this style disgusting.  I find that even with a
case-insensitive language I will be strict about the case of my
identifiers, which means that the only thing I lose is the ability to
use other cases for internal keywords -- but again, I will never use
`LAMBDA' given that the symbol is really `lambda'.


> Could someone (perhaps Eli?) volunteer to tell me why it is
> important to make the default case-sensitive?

* It makes it easy to interact with the world.  With the FFI thing it
  is *much* easier for me to follow the same name convention in an API
  than to go through some Schemeification (even an automatic one).

* The same holds for other cases of interaction with a case sensitive
  world, for example shell programming.  BTW, I'm not at all
  advocating StupidIdentifierStyle over much-better-style, just making
  it easier to use NativeNameStyle instead of hacking that into
  some-bad-approximation.

* After thinking about these things for a long while, I tend to think
  now that it is irrational to stick to some arbitrary character
  equivalence relation.  Unicode is only making this point stronger.

* And I don't need to go to things I don't know about like that double
  s thing or that obscure Turkish feature.  I can stay with Hebrew and
  get plenty of questions -- some letters have different versions for
  when they appear as a suffix character.  Should that be considered
  the same as the non-suffix version?  What about the vowel dots -- if
  you remove them, should the result be the same?  What about the
  special LTR marker in Unicode -- is this ignorable?  Case
  sensitivity is a global solution of just avoiding all this natural
  language dirtiness.

* Another point that demonstrates this is the utterly confusing way CL
  implementations deal with case-insensitivity.  (I view all as hacks
  at various ugliness levels, all aimed at helping people use case
  sensitive code, and making it possible to pretend that builtins are
  in lower case.)

* A personal habit I have: I'd prefer using this to bind keys in
  Emacs:
    (global-set-key [(control shift ?z)] 'swindle-edit-comment)
  because I prefer to think of it as marking a modifier key rather than
  relying on the fact that ?Z happens to be ?z used with shift.

* Modern languages, and other software, protocols etc, all tend to be
  either case sensitive, or move from case insensitivity to case
  sensitivity (with case insensitivity being part of some past that
  people want to get away from: DOS names, HTML, Pascal).  (Note that
  the DOS days led to similar obscurity to CL's, with weird concepts
  like "case preserving", and little surprises like "foo.bar" being the
  same as "FOO.BAR" but not "Foo.bar".)
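
The Unicode questions raised above (the double s, the Turkish i, Hebrew
suffix letters, vowel dots, the LTR marker) can be poked at directly.
What follows is only a rough sketch in Python, picked for illustration
because its string methods follow the Unicode case mappings; the helper
name strip_marks and the sample word are assumptions made up for this
example, not anything from a Scheme implementation.

```python
import unicodedata


def strip_marks(text):
    """Remove combining marks (category Mn), such as Hebrew vowel points.

    Hypothetical helper for this illustration only.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# The "double s thing": upper-casing one character yields two.
SHARP_S = "\u00df"                 # LATIN SMALL LETTER SHARP S
sharp_s_upper = SHARP_S.upper()

# The Turkish feature: dotted capital I lower-cases to "i" plus a
# combining dot, while the locale-independent mapping sends plain "I"
# to "i" (a Turkish reader would expect the dotless letter instead).
DOTTED_CAPITAL_I = "\u0130"        # LATIN CAPITAL LETTER I WITH DOT ABOVE
dotted_lower = DOTTED_CAPITAL_I.lower()

# Hebrew has no case, so case folding leaves the suffix (final) form of
# mem and the regular form as two different characters.
FINAL_MEM = "\u05dd"
MEM = "\u05de"
finals_equal = FINAL_MEM.casefold() == MEM.casefold()

# A word written with vowel dots versus the same word without them.
POINTED = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd"
UNPOINTED = "\u05e9\u05dc\u05d5\u05dd"
unpointed_from_pointed = strip_marks(POINTED)

# The left-to-right mark is an invisible format character (category Cf),
# yet a plain comparison still sees it.
LRM = "\u200e"
with_marker_equal = ("a" + LRM + "b") == "ab"
```

Each of these is a separate decision that a "reasonable" equivalence
relation on identifiers would have to make one way or the other;
comparing identifiers character by character sidesteps all of them.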


> Especially when the three (|..|, #cs, -g) workarounds exist for
> those times when you need to interface with a case-sensitive
> environment

None of them work properly:
* |...| makes code really verbose,
* #cs works fine only when you have some global construct that you use
  like #cs(module ...),
* -g works only on Unix and only when people will execute your script
  instead of loading it themselves,
* `read-case-sensitive' works only for REPL-like stuff.


> [[and thus make it explicit with the #cs marker that the case
> sensitivity matters... oh but this is starting to sound like that
> other thread...]]

(Right, which is why I said I was afraid to risk starting another
crusade.)


> Is it so that we can write our sets as A and an element of A as a?
> (I'm being a little facetious here...)

No.  For me it's first of all a matter of practicality, and second of
all avoiding arbitrariness.

> 2. It seems to me, since any trained user will know the way to get the 
> behavior they desire out of their Scheme programming system, that
> the default behavior should be the one that is most intuitive for
> the ***UNTRAINED*** user.  So perhaps this poll is going out to the
> wrong people?

The untrained user is hard to catch.  They might be well-trained in
C/C++/Java -- chances are that this will be some case-sensitive
environment.  They might also speak a different language and find it
odd that on top of using English keywords, they are forced to admit
the fact that in this language there are two versions for every
character.


> Or perhaps the votes of PLT veterans should only count 2/3's?  (I'm
> being almost completely facetious here...)

Maybe veteran Schemers have already gone through some meditation
coming up with reasons for choosing whatever they chose?

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                  http://www.barzilay.org/                 Maze is Life!


Posted on the users mailing list.