Subject: Re: character sets in XML documents
To: None <netbsd-docs@netbsd.org>
From: James K. Lowden <jklowden@schemamania.org>
List: netbsd-docs
Date: 01/21/2006 12:55:53
Rui Paulo wrote:
Emil Hessman <ceh@otaku.se> writes:
> > On another note, has there been any discussion at all regarding use
> > of UTF-8 as a preferred character encoding for *all* documentation?
> 
> Not that I recall. But this could be considered bloat, I think.

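I should hope not.  Unicode is the document character set for XML and even
HTML these days, and an XML parser assumes UTF-8 when no encoding is
declared.  UTF-8 has a byte density very close to ISO 8859-1 for languages
that use Roman characters: ASCII stays at one byte, and each accented
Latin-1 character takes two.  Moving everything (even man pages?) to UTF-8
would be the Right Thing (tm).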
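
To put a rough number on the byte-density point, here is a minimal sketch in
Python.  The language, the helper name encoded_sizes, and the sample sentence
are all my own choices for illustration, not anything taken from the tree.

```python
# Compare the encoded size of the same text in ISO 8859-1 and UTF-8.
# encoded_sizes and the sample string are hypothetical, for illustration only.

def encoded_sizes(text):
    """Return (latin1_bytes, utf8_bytes) for text representable in Latin-1."""
    return len(text.encode("iso-8859-1")), len(text.encode("utf-8"))

sample = "The naïve café owner paid 5 £ for a résumé."
latin1_len, utf8_len = encoded_sizes(sample)

# Every character above U+007F in the Latin-1 range costs one extra byte.
non_ascii = sum(1 for ch in sample if ord(ch) > 0x7F)

print("ISO 8859-1 bytes:", latin1_len)
print("UTF-8 bytes:     ", utf8_len)
print("non-ASCII chars: ", non_ascii)
```

The overhead is exactly one byte per non-ASCII character, so for mostly
English text the difference is close to nothing.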

If document authors prefer to operate in other encodings for their
convenience, why not let them convert with iconv(1), edit, re-convert, and
commit?  Eventually UTF-8 will be better supported than its obsolete
predecessors, so the road will only get easier.  
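
The convert, edit, re-convert cycle could look roughly like the sketch below.
It uses Python's built-in codecs as a stand-in for iconv(1); the function
names and the choice of ISO 8859-1 as the author's working encoding are
assumptions for the sake of the example.

```python
# Round trip between the repository encoding (UTF-8) and an author's
# preferred working encoding.  to_working and to_repository are
# hypothetical helpers; ISO 8859-1 is only an assumed default.

def to_working(utf8_bytes, working="iso-8859-1"):
    """Check-out direction: UTF-8 from the repository to the working encoding.

    Raises UnicodeEncodeError if the document contains a character the
    working encoding cannot represent.
    """
    return utf8_bytes.decode("utf-8").encode(working)

def to_repository(working_bytes, working="iso-8859-1"):
    """Commit direction: working encoding back to UTF-8."""
    return working_bytes.decode(working).encode("utf-8")

original = "Förord: référence du système".encode("utf-8")
edited = to_working(original) + " (rev. 2)".encode("iso-8859-1")
committed = to_repository(edited)

print(committed.decode("utf-8"))
```

The one catch is the check-out direction: if the UTF-8 document already
holds characters outside the author's encoding, the conversion fails rather
than silently losing them, which is what you want.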

Is there some technical obstacle I'm overlooking?  

--jkl