Even after my fair share – or even more than my fair share – of using them, the names of some HTML character entities still irritate me.
&acirc;
gives â because circ is the abbreviation for circumflex rather than circle. Because the circle in å actually is a ring, I guess.
&iuml;
suggests that we are dealing with an i-umlaut, which lets us easily deduce that the encoded character will look like this: ï. Except that, technically, there is no i-umlaut and we are dealing with a diaeresis here.
While I can appreciate the convenience of this, my inner TeX user is still tempted to call ï a dotless i with diaeresis. In fact, the cleverness of Unicode doesn’t make it particularly easy to get an accumulation of dots like this on top of an i: ı̇̈
This may display incorrectly in browsers*, which rather spoils the surprise: to get it you need to start with a dotless i and then add the dot and diaeresis accents, rather than just being able to add the diaeresis accent to the usual i – which would simply give ï.
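For the curious, here is a small Python sketch of the two constructions described above, using only the standard library. The variable names are my own invention for illustration. It spells out the three code points of the triple-dotted character and checks that an ordinary i followed by a combining diaeresis normalizes (under NFC) to the same precomposed character that the entity decodes to.

```python
import html
import unicodedata

DOTLESS_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I
DOT_ABOVE = "\u0307"   # COMBINING DOT ABOVE
DIAERESIS = "\u0308"   # COMBINING DIAERESIS

# Dotless i first, then the dot, then the diaeresis on top.
three_dots = DOTLESS_I + DOT_ABOVE + DIAERESIS

# Ordinary i plus a combining diaeresis; NFC folds the pair
# into the single precomposed character.
plain = unicodedata.normalize("NFC", "i" + DIAERESIS)

# List the code points that make up the triple-dotted version.
for ch in three_dots:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The named entity decodes to the same precomposed character.
print(plain == html.unescape("&iuml;"))
print(unicodedata.name(plain))
```

Whether the three-dot sequence then renders with the dots stacked is, as noted, up to the font and the browser.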
* This is a very polite way of saying that most Mac and Windows browsers only display boxes for the accents there: ı̇̈. IE5/Mac does slightly better by displaying actual accents in the wrong place, and Firefox/Linux wins by displaying all three dots at the same height, slightly above but slightly to the left of the ı. To a certain extent this is a font issue: Georgia doesn’t contain combining accents, which means the lack of Georgia works to Linux’s advantage here. Using a font like Lucida Grande instead will give better rendering on the Mac. I did some fiddling along those lines to make the display tolerable above. But the dots still all appear at the same height.