ICANN is my new hero


Plus, um, what? URLs can be written in non-Latin alphabets? This is HUGE!

Also, they warn applicants:

“The usability of IDNs [Internationalized Domain Names] may be limited, as not all application software is capable of working with IDNs… It is up to each application developer to decide whether or not they wish to support IDNs.”

*waves arms* Here, me! I will be an application developer and decide to support this! Plus, with the countries that were just approved (Egypt, Saudi Arabia, and the UAE), it would be a good excuse to finally learn rudimentary Arabic.

Unfortunately, Wikipedia says “internationalised domain names should be converted to a suitable ASCII-based form.” Not so cool. But still, how have I not heard about this before?
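
For the curious, that ASCII-based form is called Punycode. Here is a minimal sketch of the conversion, assuming Python's built-in "idna" codec (which, as far as I know, follows the older IDNA 2003 rules, so a real registry may treat some names differently). The hostnames and the helper names to_ascii and to_unicode are made up for illustration.

```python
def to_ascii(hostname):
    """Convert a Unicode hostname to its ASCII (Punycode) form, label by label."""
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname):
    """Convert an ASCII (Punycode) hostname back to Unicode."""
    return hostname.encode("ascii").decode("idna")


# A Latin-script example with one non-ASCII letter.
print(to_ascii("bücher.example"))  # xn--bcher-kva.example

# Arabic letters meem, sad, ra (the name of Egypt), written as escapes
# so the code reads left to right. The hostname itself is hypothetical.
arabic = "\u0645\u0635\u0631.example"
encoded = to_ascii(arabic)
print(encoded)                        # an "xn--" label followed by ".example"
print(to_unicode(encoded) == arabic)  # True
```

So the browser can show the pretty version while everything underneath still speaks plain ASCII.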

Korean Alphabet Love


Reading about Korean language representation in Unicode–and wow.

The Korean alphabet is simple enough that anyone can memorize it in an afternoon. The difficult part is that when the Korean alphabet was invented, the only other writing game in town was Chinese, with its square-shaped characters. In theory, it should be easy to read Korean letters linearly–but someone decided that the letters should be arranged in aesthetically pleasing syllabic boxes. Fast forward 600 years, and now each multi-letter syllable has its own Unicode representation.

I won’t attempt to re-explain what that article covers so thoroughly, but I just had to share the best part: when you take a Unicode code point (for example, 46239) that describes a single “boxed” syllable, you can deconstruct it into letter components by:

tail = (codepoint - 44032) % 28
vowel/mid = 1 + ((codepoint - 44032 - tail) % 588)/28
lead = 1 + (int) ((codepoint - 44032)/588)

With the random number I picked above, it becomes… “dwetch,” which isn’t part of any Korean word so far as I know. Internet search yields… 50,000 results of complete gibberish, looking remarkably like (imagine!) someone just rendered random Unicode combinations.
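
If you want to check the arithmetic yourself, here is a rough sketch of the three formulas in Python. The function names decompose and to_jamo are my own, and I'm assuming the standard Unicode layout of 19 leads, 21 vowels and 28 tail slots (slot 0 meaning "no tail"). It keeps the 1-based numbering for lead and vowel used above, then cross-checks the result against Python's own Unicode decomposition.

```python
import unicodedata

HANGUL_BASE = 44032     # U+AC00, the first precomposed syllable
SYLLABLE_COUNT = 11172  # 19 leads * 21 vowels * 28 tail slots


def decompose(codepoint):
    """Return (lead, vowel, tail) using the formulas above.

    lead and vowel are 1-based; tail is 0 when there is no final consonant.
    """
    offset = codepoint - HANGUL_BASE
    if not 0 <= offset < SYLLABLE_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    tail = offset % 28
    vowel = 1 + ((offset - tail) % 588) // 28
    lead = 1 + offset // 588
    return lead, vowel, tail


def to_jamo(codepoint):
    """Build the string of individual (conjoining) letters for a syllable."""
    lead, vowel, tail = decompose(codepoint)
    jamo = chr(0x1100 + lead - 1) + chr(0x1161 + vowel - 1)
    if tail:
        jamo += chr(0x11A7 + tail)
    return jamo


print(decompose(46239))  # (4, 16, 23)
# Python's canonical decomposition (NFD) should agree with ours.
print(to_jamo(46239) == unicodedata.normalize("NFD", chr(46239)))  # True
```

Lead 4, vowel 16, tail 23 are the letters for d, we and ch, which is where “dwetch” comes from.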

But still. HOT. I love this world.