Subject: Re: Return from nl_langinfo(CODESET)--any standard?
To: Dave Huang <khym@azeotrope.org>
From: Klaus Klein <kleink@reziprozitaet.de>
List: tech-userlevel
Date: 01/24/2003 23:04:41

Dave Huang <khym@azeotrope.org> writes:

> Yesterday, I ran into a program that assumes that the output of
> nl_langinfo(CODESET) is a name that (GNU) libiconv recognizes. This
> doesn't seem like an unreasonable assumption to me--I did a quick
> search through SUSv3 and didn't come up with anything that specified
> what nl_langinfo(CODESET) was supposed to return. SUSv3 does say that
> iconv_open() codeset names are implementation-defined.
> 
> So, is there any reason why our nl_langinfo() doesn't return an 
> "official" name, such as one from 
> http://www.iana.org/assignments/character-sets ?  GNU iconv supports 
> the IANA names... right now, nl_langinfo() returns "646" for an ASCII 
> locale, rather than "ASCII", "US-ASCII", "ISO646-US", or some other 
> common variant. For ISO 8859-x locales, nl_langinfo() returns 
> "ISO8859-x", rather than "ISO-8859-x".

This would be one of the "so many to choose from!" issues. :-)

While GNU supports the IANA names, there is also a lot of prior art
in SVR4-derived environments, which use parts of the X.Org
registry[1].  The Open Group used to maintain its own registry[2],
which is mostly oriented towards the DCE.  Furthermore, there is the
ISO Cultural Registry[3], which, at the very least, has the
attraction of providing localedef(1) charmap files.
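
Since none of these registries is mandated, a program that passes the
result of nl_langinfo(CODESET) to iconv_open() could normalize the name
itself rather than assume the two agree.  A minimal sketch, covering
only the two spellings mentioned in the quoted message ("646" and
"ISO8859-x"); the helper name codeset_to_iana() is hypothetical, and
the mapping is illustrative, not a complete table:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any libc): rewrite a codeset name
 * as returned by nl_langinfo(CODESET) into an IANA-style spelling.
 * The result is written to out (at most outlen bytes, truncated if
 * necessary) and out is returned.  Names that are not recognized
 * are copied through unchanged.
 */
static const char *
codeset_to_iana(const char *name, char *out, size_t outlen)
{
	if (strcmp(name, "646") == 0) {
		/* "646" is what an ASCII locale reports here. */
		snprintf(out, outlen, "US-ASCII");
	} else if (strncmp(name, "ISO8859-", 8) == 0) {
		/* "ISO8859-x" becomes "ISO-8859-x"; keep the part number. */
		snprintf(out, outlen, "ISO-8859-%s", name + 8);
	} else {
		/* Assumption: anything else is passed through as-is. */
		snprintf(out, outlen, "%s", name);
	}
	return out;
}
```

A caller would then do something along the lines of
iconv_open("UTF-8", codeset_to_iana(nl_langinfo(CODESET), buf, sizeof(buf))),
with <langinfo.h> and <iconv.h> included and setlocale() called first.
Whether the target iconv accepts the resulting names is still
implementation-defined, so the return value of iconv_open() needs
checking either way.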

I suppose some of the Citrus[4] developers on this list can shed
further light on the reasons for their choice.


- Klaus

[1] http://ftp.x.org/pub/R6.6/xc/registry
[2] ftp://ftp.opengroup.org/pub/code_set_registry
[3] http://std.dkuug.dk/cultreg/
[4] http://citrus.bsdclub.org/index.html