Quarter Life Crisis

The world according to Sven-S. Porst


UnicodeChecker 1.14


It has been quite a while since the previous UnicodeChecker update, a bit over two years in fact. Thus, people may have been left with the impression that the application is dead by now. That, however, would be quite far from the truth. The truth, according to earthlingsoft, is that UnicodeChecker did its job - or rather all of its many jobs - just fine all the time, thank you. We were tempted to release a new version when Unicode 5 was published, but bumping our version number just for including a bunch of new files from a third party seemed like cheating. In fact, bullying all our users into an update when the few of them who can really tell the difference between Unicode versions can easily download those data files and put them into UnicodeChecker’s Application Support subfolder would seem a little rude.

Today things change. Unicode 5.2 was published at the beginning of the month and the structure of its data files for Unihan data saw a major change. Hence UnicodeChecker had to be updated to work with the new structure, and version 1.14 is published today.

Unihan Loading Progress Indicator

The updated Unihan data file format is a royal pain: it is now delivered as a bunch of files instead of a single one, which makes gathering the information much harder. To keep launch time reasonably short for the people who use Unihan, UnicodeChecker now reads those files in the background after launching. This means the application is available right away but will not provide Unihan information in the first few seconds after launching. And if you look carefully, you can watch a little progress bar as the loading progresses. The bad news is that this totally destroys the very clever and efficient scheme UnicodeChecker used previously, which let it use Unihan without storing all of the data in memory. As a consequence, memory consumption increases by an obscene amount (about 100MB).
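For the curious, the general idea of reading several Unihan files on a background thread can be sketched in a few lines of Python. This is not UnicodeChecker's code, just an illustration: the function names and the progress callback are made up for this example, and it assumes the tab-separated `U+XXXX<TAB>kField<TAB>value` line format of the Unihan text files, with `#` marking comment lines.

```python
import threading


def parse_unihan_line(line):
    """Parse one 'U+XXXX<TAB>kField<TAB>value' line.

    Returns (codepoint, field, value), or None for blank and comment lines.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    codepoint, field, value = line.split("\t", 2)
    return int(codepoint[2:], 16), field, value


def load_unihan_in_background(paths, on_progress=None):
    """Merge several Unihan files into one dictionary on a worker thread.

    Hypothetical helper: returns (thread, data), where data maps a code
    point to a {field: value} dictionary. Treat data as complete only
    after thread.join() has returned.
    """
    paths = list(paths)
    data = {}

    def worker():
        for done, path in enumerate(paths, start=1):
            with open(path, encoding="utf-8") as handle:
                for line in handle:
                    parsed = parse_unihan_line(line)
                    if parsed is None:
                        continue
                    codepoint, field, value = parsed
                    data.setdefault(codepoint, {})[field] = value
            # Report progress once per finished file (drives a progress bar).
            if on_progress is not None:
                on_progress(done, len(paths))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, data
```

Note that everything ends up in one big in-memory dictionary here, which is exactly the kind of memory cost lamented above.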

While the Unihan file still isn’t part of UnicodeChecker due to its size, the Unicode 5.2 data files are included. There are plenty of additions in this revision. They even introduced additional Snowman codepoints! Black Snowman (U+26C7) and Snowman without Snow (U+26C4) OMG ☃☃☃11!!!!!1☃☃☃11! Now I’ll just need a font with glyphs for these new codepoints.

New Snowman codepoints

There are also new features: The first is the new Length Utility. It tells you the number of codepoints and bytes of a string in various Unicode encodings. Due to the various forms Unicode can take those numbers are interesting at times:

UnicodeChecker Length Utility
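To see why those numbers differ, here is a rough Python sketch of the same kind of count. It is only an approximation of the idea, not what the Length Utility does internally: the `length_report` name, the choice of encodings and the choice of the NFC and NFD normalization forms are assumptions made for this example.

```python
import unicodedata

# Little-endian variants are used so that Python does not prepend a
# byte order mark, which would inflate the byte counts.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def length_report(text):
    """Count code points and encoded bytes of text in NFC and NFD form."""
    report = {}
    for form in ("NFC", "NFD"):
        normalized = unicodedata.normalize(form, text)
        entry = {"codepoints": len(normalized)}
        for encoding in ENCODINGS:
            entry[encoding] = len(normalized.encode(encoding))
        report[form] = entry
    return report
```

For a single "é" (U+00E9) this gives one code point and 2, 2 and 4 bytes in its composed form, but two code points and 3, 4 and 8 bytes once it is decomposed into "e" plus a combining acute accent.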

The other new feature is the addition of a QuickLook plug-in to UnicodeChecker. If you have the UnicodeChecker Spotlight support set up (command in the File menu), the QuickLook previews will let you peek at the glyphs you found without needing to go into UnicodeChecker for each of them:

QuickLook display of Spotlight Search for Unicode Characters

And there are plenty of additional details: You can now conveniently use drag and drop in the Split Up Utility to rearrange the characters in a string (handy when dealing with composed characters and wanting to shift an accent around, say); users of the Escape Utility may be pleased that it now lets them escape the ‘standard’ characters (e.g. a-z) as well; the submenus in the Character Blocks menu have been rearranged to accommodate all the added entries without scrolling on small screens; and if you have the slightest idea what Adobe’s Glyph List for New Fonts is, you may be delighted to work with the latest version of that.
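The accent shifting mentioned above is easy to picture in code. The following Python sketch splits a string into its code points and moves one of them to a new position. Both function names are invented for this example, and decomposing to NFD first is an assumption that may not match what the Split Up Utility does.

```python
import unicodedata


def split_up(text):
    """List (character, 'U+XXXX', name) for each code point of the NFD form."""
    decomposed = unicodedata.normalize("NFD", text)
    return [
        (char, f"U+{ord(char):04X}", unicodedata.name(char, "<unnamed>"))
        for char in decomposed
    ]


def move_codepoint(text, source, target):
    """Move the code point at index source so that it ends up at index target."""
    chars = list(text)
    chars.insert(target, chars.pop(source))
    return "".join(chars)
```

Decomposing "éa" gives "e", a combining acute accent and "a". Moving the accent from index 1 to index 2 and recomposing with `unicodedata.normalize("NFC", ...)` turns the string into "eá".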

There are also a few things this version doesn’t do. To begin with, it’s not a 64-bit application. Say thank you to our users who wanted Growl support for that. As UnicodeChecker is made to run on Mac OS X.4 (well, theoretically, it should even run on X.3, but we can neither test nor care too much) and the 64-bit version of Growl doesn’t support that, we couldn’t have both.

I am sure that one could in principle find some terribly smart way of setting different versions up in the same application, but the reason one uses ready-made frameworks is that one can be lazy and not smart; besides, if you have ever used Xcode you’ll know that any attempt to do anything even slightly clever will immediately backfire and turn into hours and hours wasted to make the setup ‘just right’.

As this update is needed to read the current Unihan data, the choice we made was to stick to 32-bit for the time being, so X.4 users aren’t left out. As most people probably still run 32-bit applications on their X.6 systems at this time, the difference should be hardly noticeable anyway. As they are independent of the main executable, the Spotlight and Quick Look plug-ins are 64-bit, by the way.

UnicodeChecker can’t make coffee yet, either. A real shame.

October 12, 2009, 0:58

Tagged as earthlingsoft, software, unicode, UnicodeChecker.


Comment by Will Robertson:

UnicodeChecker is my current favourite application. Absolutely essential for my work in unicode maths for LaTeX. Did I remember to donate a while back? (If not, I’m sorry! I’ll make it up to you when, er, I’m not broke any more.)

Thanks for all the hard work.

October 12, 2009, 3:34
