[racket] Unicode character name

From: Danny Yoo (dyoo at hashcollision.org)
Date: Wed May 15 15:18:32 EDT 2013

It should not be difficult to do this by hand, by taking the contents
of the Unicode database:

    http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

to help define a function that takes a Unicode character (or its code
point) and returns the descriptive name.

Each line of that file appears to be a semicolon-separated record.  The
first field seems to be the code point (in hexadecimal), and the second
field seems to be the name you're looking for.
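
As a rough illustration of that layout, here is a minimal sketch that
splits such lines on semicolons and builds a table from code point to
name.  The names parse-line, sample-lines and names are made up for
this example, and the two sample lines are stand-ins for the real file;
this is not the code from any particular implementation.

```racket
#lang racket/base
(require racket/string)

;; Parse one line in the UnicodeData.txt layout into (cons code-point name).
;; Assumes field 0 is the code point in hex and field 1 is the name.
(define (parse-line line)
  (define fields (string-split line ";" #:trim? #f))
  (cons (string->number (car fields) 16)
        (cadr fields)))

;; Two sample lines in the file's format, standing in for the real file.
(define sample-lines
  '("0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
    "0907;DEVANAGARI LETTER I;Lo;0;L;;;;;N;;;;;"))

;; Immutable hash table from code point (integer) to name (string).
(define names
  (for/hash ([line (in-list sample-lines)])
    (define entry (parse-line line))
    (values (car entry) (cdr entry))))

;; Look up a code point; #f means "not in the table".
(hash-ref names #x0907 #f)
```

Only the first two fields are used here, so the remaining columns are
simply ignored.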


Given that, here is an example implementation that defines a function
mapping code points to their names:

    https://gist.github.com/dyoo/5586470

This implementation may be trying to be a little too clever: it does
the work of parsing the UnicodeData.txt file at compile time in an
effort to cache the result to bytecode.  If you use 'raco make' on
this module,

    http://docs.racket-lang.org/raco/make.html

then all subsequent uses of this module can reuse that compile-time
work, so that we parse that file once and for all.
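
To give a feel for the compile-time idea, here is a small sketch under
simplifying assumptions; it is not the code from the gist.  A macro
(hypothetically named embedded-table) does the parsing during expansion
and expands into a quoted association list, so the parsed data ends up
in the compiled module.  To keep the example self-contained, it parses
two sample lines rather than reading UnicodeData.txt, and the names
sample-lines, parse-line, table and sample-unicode-name are all
invented for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base racket/string))

;; Compile-time (phase 1) helpers.  The sample lines are stand-ins;
;; real code would read UnicodeData.txt at this point instead.
(begin-for-syntax
  (define sample-lines
    '("0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"
      "0907;DEVANAGARI LETTER I;Lo;0;L;;;;;N;;;;;"))
  (define (parse-line line)
    (define fields (string-split line ";" #:trim? #f))
    (cons (string->number (car fields) 16) (cadr fields))))

;; Expands into a quoted association list, so the parsing happens
;; during expansion and only the result is left in the module body.
(define-syntax (embedded-table stx)
  (syntax-case stx ()
    [(_)
     (with-syntax ([alist (map parse-line sample-lines)])
       #''alist)]))

;; At run time we only build a hash table from the already-parsed data.
(define table (make-immutable-hash (embedded-table)))

;; Hypothetical lookup function; returns #f for unknown code points.
(define (sample-unicode-name code)
  (hash-ref table code #f))

(sample-unicode-name #x0907)
```

Once such a module has been compiled with 'raco make', loading it
should not need to redo the parsing step.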


Here's what I see when I try it out:

    > (require "unicode-name.rkt")
    > (unicode-name #x0907)
    "DEVANAGARI LETTER I"


Good luck!

Posted on the users mailing list.