tech-userlevel archive


Re: [PATCH] replace 0xA0 to whitespace in plain text files (part 2)



On Thu, Sep 11, 2008 David Laight wrote:

> That sucks big-time.
> It makes me think even more that UTF-8 is completely inappropriate
> for a system-wide locale on any unix system.
> Clearly some documents and strings can be in UTF-8, but that has to
> be a known property of the string.  It isn't appropriate that
> any string a program obtains can be assumed to be UTF-8.

But at least we could make the UTF-8 encoding explicit by including
the BOM (byte order mark) at the beginning of such a file. It is the
byte sequence 0xEF 0xBB 0xBF.
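
To make the marker concrete, here is a rough Python sketch of checking
for and adding those three bytes. The function names has_utf8_bom and
add_utf8_bom are made up for illustration; only codecs.BOM_UTF8 comes
from the standard library.

```python
import codecs

# codecs.BOM_UTF8 is the byte sequence 0xEF 0xBB 0xBF.
UTF8_BOM = codecs.BOM_UTF8


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM."""
    return data.startswith(UTF8_BOM)


def add_utf8_bom(data: bytes) -> bytes:
    """Prefix the data with the UTF-8 BOM unless it already has one."""
    if has_utf8_bom(data):
        return data
    return UTF8_BOM + data
```

Only the first three bytes of a file need to be read for the check, so
a tool could apply it before deciding how to treat the rest.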

Vim has support for automatically handling it, see e.g.

http://www.nabble.com/utf8-BOM-td16427974.html

UTF-8 should IMO not be the default encoding (in the absence
of an explicit marker); we had better stay with latin1.
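
As a sketch of that policy (not taken from any existing tool), a
reader could decode as UTF-8 only when the BOM is present and fall
back to latin1 otherwise. The name decode_text and its fallback
parameter are hypothetical.

```python
import codecs


def decode_text(data: bytes, fallback: str = "latin-1") -> str:
    """Decode as UTF-8 only if the BOM marks it so, else use the fallback."""
    if data.startswith(codecs.BOM_UTF8):
        # Explicit marker: strip the three BOM bytes and decode as UTF-8.
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # No marker: never guess UTF-8. latin-1 maps every byte to one
    # character, so this cannot fail (0xA0 becomes U+00A0, for example).
    return data.decode(fallback)
```

The obvious cost is that UTF-8 text written without a BOM gets
decoded as latin1 and each multi-byte sequence shows up as several
characters.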

Joachim

