[racket-dev] Gnats & UTF8

From: Eli Barzilay (eli at barzilay.org)
Date: Fri Mar 9 03:46:51 EST 2012

In preparation for a move to GitHub, I've finished a very long and
tedious[*] scan of the complete gnats db, and everything is now
properly utf8-ized.  In addition, the web interface declares a utf8
charset, which means that the texts are fine there too.  There is one
problem that is still left: incoming emails are added to the bug
history without attention to charset (or encoding, but that's not as
important).
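
To illustrate what paying attention to the charset would involve, here
is a minimal Python sketch.  It is not the actual gnats code, and the
function name body_as_utf8 is made up for illustration.  It parses a
raw message, decodes the plain-text body using the charset and transfer
encoding that the message itself declares, and re-encodes the result as
UTF-8:

```python
from email import message_from_bytes, policy


def body_as_utf8(raw: bytes) -> bytes:
    """Return the plain-text body of a raw email, re-encoded as UTF-8.

    Hypothetical helper: assumes the message has a text/plain part and
    that its declared charset is correct.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    # For a simple text/plain message this returns the message itself;
    # for multipart mail it returns the first text/plain candidate.
    part = msg.get_body(preferencelist=("plain",))
    if part is None:
        return b""
    # get_content() undoes the transfer encoding and decodes the bytes
    # with the charset named in the Content-Type header.
    text = part.get_content()
    return text.encode("utf-8")


raw = (
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"na\xefve\r\n"
)
print(body_as_utf8(raw))
```

A message that declares the wrong charset, or none at all, would still
come out wrong here, which is why sending UTF-8 in the first place is
the simpler fix.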

To make a long story short: please try to use UTF-8 in your emailers.
Probably the biggest example of needing to do this is Gmail, which for
some reason doesn't default to UTF-8.  (On the first settings page
there's a checkbox for "Use Unicode (UTF-8) encoding for outgoing
messages".)


([*] I thought that much of this could be automated... which was extremely
naive.  There were some examples of insanely bad encoding byproducts;
in many cases I resorted to detective methods of googling names of
people, using translate, grepping for bits of texts in other bugs,
using Emacs to guess encodings, using iconv, and more.  In an extreme
case (PR8719) the file was so broken that I had to figure out the
quoted text from the string-constants file of the time and the commit
that fixed it, then figure out how it was broken and write some code
to unbreak it...  Emacs was absolutely essential for all of this.)
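
As an example of the kind of breakage involved, one common case is
UTF-8 text that was misread as Latin-1 and then saved as UTF-8 again.
The Python sketch below reverses that one case.  It is only an
illustration: the function name is made up, the choice of Latin-1 as
the wrong encoding is an assumption, and it is not the code that was
written for PR8719.

```python
def undo_double_encoding(text: str, wrong: str = "latin-1") -> str:
    """Reverse one round of 'UTF-8 bytes misread as `wrong`'.

    Hypothetical helper; returns the input unchanged when the text
    does not look like it was broken in this particular way.
    """
    try:
        # Recover the original bytes, then decode them as the UTF-8
        # they were supposed to be.
        return text.encode(wrong).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


broken = "na\u00c3\u00afve"   # "naive" with a diaeresis, mangled once
print(undo_double_encoding(broken))
```

Text that is already correct usually fails the decoding step and is
returned untouched, but that is a heuristic and not a guarantee, so
results still need checking by eye.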

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

Posted on the dev mailing list.