tech-userlevel archive


Re: FreeBSD i18n fonts for wscons



On Wed, Feb 03, 2010 at 02:39:39PM +0100, Adam Hoka wrote:
> > UTF-8 is a 7bit encoding...
> 
> Blah, I mean 8.


Blah, it is a variable-width encoding taking anything from 1 to 4 bytes per
character.
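
A quick way to see the 1-to-4-byte range is to encode one character from
each length class. This is only a sketch in Python; the sample characters
and the helper name utf8_len are illustrative picks, not something from
this thread.

```python
# Sample characters are illustrative picks, one per UTF-8 length class:
# "A" (ASCII), e-acute (Latin-1 range), euro sign (BMP), an emoji (beyond the BMP).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_len(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


utf8_lengths = [utf8_len(ch) for ch in samples]

# Print code points rather than the characters themselves, so the output
# does not depend on the terminal's own encoding.
for ch, n in zip(samples, utf8_lengths):
    print(f"U+{ord(ch):04X} -> {n} byte(s)")
```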

Lower-ASCII characters take 1 byte, all the rest take more.  E.g. a Russian
text will take two bytes per character (except spaces and punctuation),
and thus will consume nearly twice the amount of storage compared to KOI8-R.
But this was considered a non-issue in the design of UTF-8 (and I tend to
agree).
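
The "nearly twice" figure can be checked by encoding the same string both
ways. A minimal sketch, assuming Python's standard "koi8_r" codec; the
sample sentence is made up for illustration.

```python
# Hypothetical sample text: 9 Cyrillic letters plus a comma, a space and "!".
text = "Привет, мир!"

utf8_size = len(text.encode("utf-8"))    # Cyrillic letters take 2 bytes each
koi8r_size = len(text.encode("koi8_r"))  # every character takes 1 byte

# Print only the sizes, so the output does not depend on the terminal encoding.
print(f"UTF-8:  {utf8_size} bytes")
print(f"KOI8-R: {koi8r_size} bytes")
print(f"ratio:  {utf8_size / koi8r_size:.2f}")
```

The ratio stays below 2 because spaces and punctuation are ASCII and take
one byte in either encoding.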

For my own (limited) use of upper-ISO-8859-1 characters in e-mail, the
space gain from using a single-byte encoding is simply annihilated, because
declaring "iso8859-1" in the headers takes more space than "utf-8". :-)
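
The break-even point can be worked out with a few lines of Python. This is
a rough sketch only: total_size is a hypothetical helper, the charset labels
are taken exactly as spelled above, and everything else in the headers is
ignored.

```python
def total_size(body: str, charset_label: str, codec: str) -> int:
    """Bytes used by the charset label plus the encoded body (nothing else)."""
    return len(charset_label) + len(body.encode(codec))


# Hypothetical body with a single upper-ISO-8859-1 character (e-acute).
body = "caf\u00e9"

latin1_total = total_size(body, "iso8859-1", "latin-1")  # 9 + 4 bytes
utf8_total = total_size(body, "utf-8", "utf-8")          # 5 + 5 bytes

print(f"iso8859-1: {latin1_total} bytes, utf-8: {utf8_total} bytes")
```

The label is 4 bytes longer and each upper-ISO-8859-1 character costs one
extra byte in UTF-8, so under these assumptions the single-byte encoding
only comes out ahead once a message has more than 4 such characters.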


        Geert


-- 
Geert Hendrickx  -=-  ghen%telenet.be@localhost  -=-  PGP: 0xC4BB9E9F
This e-mail was composed using 100% recycled spam messages!

