[racket-dev] pr 12683 and using something like text:nbsp->space?
I think this is the kind of mixin that belongs in the framework and, if you
don't want it, you don't mix it in.
The preferences dialog additions could be DrRacket-specific, though.
Also, option 2 should probably have a button in the dialog that adjusts the
preference to one of the two "silent" modes.
Robby
On Thursday, April 19, 2012, Eli Barzilay wrote:
> An hour ago, Danny Yoo wrote:
> > On Thu, Apr 12, 2012 at 5:26 PM, Robby Findler
> > <robby at eecs.northwestern.edu> wrote:
> > > Yes, normalization doesn't deal with those spaces. It does change
> > > the text in ways that are unfriendly and I often tell DrRacket
> > > "no" when it asks about normalization. I just wanted to put that
> > > into the mix for this conversation, since it is a place that has
> > > to deal with similar issues.
> >
> > I propose to back out my current patch, and instead do the
> > following:
> >
> > ---
> >
> > * Add a set of choices in the editor Preferences pane, with the
> > following options:
> >
> > Treatment of Unicode zero-width characters (such as zero-width
> > spaces):
> >
> > 1. Preserve them.
> > 2. When introduced, show a dialog offering to delete them.
> > 3. Automatically delete them.
> >
> > with the default preference to be option 2.
>
> I see some problems here that need to be addressed.
>
> The first problem is the definition of "zero-width characters": some
> of these are not problematic -- for example, #\u05B0 is something that
> gets added to a letter so it doesn't have its own width. OTOH, there
> are many other sources of confusion that are not at all related to
> width, like #\u0392 which is usually even displayed using the same "B"
> character so there's no visual difference.
>
> The second problem is the third option offering to just delete them.
> Since I view a "proper" solution as something that can deal with all
> of these problems, plain deletion is obviously not always the right
> solution.
>
> The third problem is something that I already mentioned: even if both
> of the above points are addressed, what if I choose #3 because it
> seems like an easy way to avoid such problems, and later I get bitten
> when I paste some text with an intention of keeping these things in?
> This can't be dismissed by saying that only a few people would run
> into these things, since those are exactly the people who are likely
> to suffer the consequences. (IOW, if I deal
> with weird texts, I'm likely to get nagged a lot and choose #3, and
> I'm also likely to want these things in strings.)
>
> So I think that this should be revised as follows:
>
> 1. Drop the whole "zero-width" framing, and instead just use something that
> indicates "potentially confusing". (I'm surprised that this thread
> keeps focusing on just zero-width spaces.)
>
> 2. Change #2 to some form of "normalization". (That's a bad term
> since it has a specific sense, but I'm sure that there's some term
> somewhere for this kind of change.)
>
> 3. Remove option #3.
>
> Alternatively: add a display mode that "spells out" all of the fishy
> characters, as done in Emacs when you open a file in literal mode.
>
>
> > * Collect the set of zero-width characters. Zero-width spaces, of
> > course, but also see what other Unicode characters exhibit similar
> > weird behavior.
>
> (I completely agree with this -- the list of these things will grow;
> it just shouldn't be restricted to zero-width-ness.)
>
> --
> ((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
> http://barzilay.org/ Maze is Life!
>
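
To make the distinction in the quoted message concrete, here is a minimal
sketch in Python (not Racket, so it can be run standalone) that looks up the
Unicode general category of the characters mentioned in the thread and
"spells out" the ones a simple rule flags. The names describe, is_fishy and
spell_out are hypothetical, and the flagging rule is only an assumption for
illustration, not anything DrRacket or the framework does. Racket exposes the
same category data through char-general-category.

```python
import unicodedata

# Illustrative sample only; not a complete list of "potentially confusing"
# characters.
SAMPLES = ["\u200b", "\u05b0", "\u0392", "\u00a0", "B"]


def describe(ch):
    """Return (code point, general category, Unicode name) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        unicodedata.name(ch, "<unnamed>"),
    )


def is_fishy(ch):
    """Hypothetical policy: flag format characters (Cf) and any space
    separator (Zs) other than the plain ASCII space."""
    cat = unicodedata.category(ch)
    return cat == "Cf" or (cat == "Zs" and ch != " ")


def spell_out(text):
    """Replace each flagged character with a visible <U+XXXX> marker."""
    return "".join(f"<U+{ord(ch):04X}>" if is_fishy(ch) else ch for ch in text)


if __name__ == "__main__":
    for ch in SAMPLES:
        print(describe(ch), "flagged" if is_fishy(ch) else "not flagged")
    print(spell_out("a\u200bb\u00a0c"))
```

With this rule, spell_out("a\u200bb") gives a<U+200B>b. Note that #\u05B0
(category Mn) is left alone, and so is #\u0392: it has the same category as
the Latin "B" (Lu), so a category-based check cannot catch look-alike
letters. That is the sense in which "zero-width" is too narrow a criterion.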