Quarter Life Crisis

The world according to Sven-S. Porst


Character Entities

311 words

Even after my fair share – or even more than my fair share – of using them, the names of some HTML character entities still irritate me.

&acirc;

gives â because circ is the abbreviation for circumflex rather than circle. Because the circle in å actually is a ring, I guess.

&iuml;

suggests that we are dealing with an i-umlaut, which lets us easily deduce that the encoded character will look like this: ï. Except that technically there is no i-umlaut; what we are dealing with here is a diaeresis.
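For the curious, the gap between the entity names and the official character names can be checked with Python's standard library. This is only a small sketch: the helper name `describe` is made up for illustration, and it simply decodes a named entity and looks up what Unicode calls the result.

```python
import html
import unicodedata


def describe(entity):
    """Decode a named HTML entity and return (character, Unicode name).

    `describe` is a hypothetical helper for this example only.
    """
    char = html.unescape(entity)
    return char, unicodedata.name(char)


# circ is a circumflex, ring is the circle, and uml is officially a diaeresis.
for entity in ("&acirc;", "&aring;", "&iuml;"):
    char, name = describe(entity)
    print(entity, char, name)
```

The printed names say CIRCUMFLEX, RING ABOVE and DIAERESIS respectively, so the entity names are abbreviations of convenience rather than of the official names.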

While I can appreciate the convenience of this, my inner TeX user is still tempted to call ï a dotless i with diaeresis. In fact, the cleverness of Unicode doesn’t make it particularly easy to get an accumulation of dots like this on top of an i: ı̇̈

This may display incorrectly in browsers*, which kind of spoils the surprise: to get it, you need to start with a dotless i and then add the dot and diaeresis accents, rather than just adding the diaeresis accent to the usual i – which would simply give ï.
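The construction described above can be spelled out code point by code point. The following is a sketch using Python's `unicodedata` module; the constant names are my own, and how the stacked result is actually drawn still depends on the font and renderer.

```python
import unicodedata

# Code points involved (constant names are illustrative only).
DOTLESS_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I
DOT_ABOVE = "\u0307"   # COMBINING DOT ABOVE
DIAERESIS = "\u0308"   # COMBINING DIAERESIS

# Three dots: start from the dotless i, then add the dot and the diaeresis.
stacked = DOTLESS_I + DOT_ABOVE + DIAERESIS

# Adding the diaeresis to the usual i just composes to the ordinary ï.
plain = unicodedata.normalize("NFC", "i" + DIAERESIS)

for ch in stacked:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
print(plain == "\u00ef")
```

The last line compares the normalized `i` plus diaeresis against the precomposed ï, which is why the ordinary i cannot be used as the starting point for the three dots.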

* This is a very polite way of saying that most Mac and Windows browsers only display boxes for the accents there: ı̇̈. IE5/Mac does slightly better by displaying actual accents in the wrong place, and Firefox/Linux wins by displaying all three dots at the same height, slightly above but slightly to the left of the ı. To a certain extent this is a font issue: Georgia doesn’t contain combining accents, which means the lack of Georgia works to Linux’s advantage here. Using a font like Lucida Grande instead will give better rendering on the Mac. I did some fiddling along those lines to make the display tolerable above. But the dots still all appear at the same height.

June 19, 2007, 1:34

Tagged as arial.
