tech-userlevel: Re: utf-8 and userland

Subject: Re: utf-8 and userland
To: Wolfgang S. Rupprecht <wolfgang+gnus20040312T095618@dailyplanet.dontspam.wsrcc.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-userlevel
Date: 03/12/2004 12:07:19

On Fri, Mar 12, 2004 at 10:44:04AM -0800, Wolfgang S. Rupprecht wrote:
>
> The fly in the ointment is that some programs mash the high bits or
> otherwise censor certain bytes.  Most notably ls(1) has a routine
> called safe_print() that is anything but safe for UTF-8.  Is this just
> a hold-over that can be switched off (or at least turned down a bit)
> when the LC_LANG is UTF-8?  I'm willing to submit patches if it will
> move things along.  I just don't want to bother if nobody wants it.

I think that'd be cool. Though I'm not sure if LC_LANG is the right place
to look. Wouldn't nl_langinfo(CODESET) be the right thing to look at?
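
For illustration, a minimal sketch of what asking nl_langinfo(CODESET)
might look like. The helper names codeset_is_utf8() and locale_is_utf8()
are hypothetical, and the exact spelling "UTF-8" is an assumption; some
systems may report the codeset under a different spelling.

```c
#include <langinfo.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the codeset name denotes UTF-8.
 * Assumes the platform spells it exactly "UTF-8".
 */
static int
codeset_is_utf8(const char *codeset)
{
	return codeset != NULL && strcmp(codeset, "UTF-8") == 0;
}

/*
 * Hypothetical helper: returns 1 if the current LC_CTYPE locale uses
 * UTF-8. Only meaningful after the caller has set the locale.
 */
static int
locale_is_utf8(void)
{
	return codeset_is_utf8(nl_langinfo(CODESET));
}
```

A program would call setlocale(LC_CTYPE, "") from <locale.h> once at
startup and then consult locale_is_utf8(); without that call the process
stays in the "C" locale.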

So I think it's cool, but I don't think UTF-8 is an LC_LANG value; it's a
qualifier of how the locale stores characters. Bashing ls to do the right
thing here will be good.
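
As a rough sketch of one way ls could "do the right thing": decode the
name with mbrtowc() and test each character with iswprint(), instead of
judging single bytes. This is not the actual safe_print() code from
ls(1); sanitize_mb() is a hypothetical name, and the '?' replacement is
an assumption made for the example.

```c
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical sketch: copy src to dst, keeping every multibyte
 * character the current locale considers printable and replacing
 * anything else with '?'. dst needs room for strlen(src) + 1 bytes.
 */
static void
sanitize_mb(char *dst, const char *src)
{
	mbstate_t mbs;
	wchar_t wc;
	size_t left = strlen(src);
	size_t n;

	memset(&mbs, 0, sizeof(mbs));
	while (left > 0) {
		n = mbrtowc(&wc, src, left, &mbs);
		if (n == (size_t)-1 || n == (size_t)-2) {
			/* Invalid or truncated sequence: skip one byte. */
			*dst++ = '?';
			src++;
			left--;
			memset(&mbs, 0, sizeof(mbs));
		} else if (n == 0) {
			/* Decoded a NUL; nothing more to copy. */
			break;
		} else {
			if (iswprint((wint_t)wc)) {
				/* Printable: copy its bytes unchanged. */
				memcpy(dst, src, n);
				dst += n;
			} else {
				*dst++ = '?';
			}
			src += n;
			left -= n;
		}
	}
	*dst = '\0';
}
```

Because the decision is made per character in the locale's own encoding,
the same loop should serve single-byte locales and UTF-8 locales alike,
provided setlocale(LC_CTYPE, "") has been called first.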

Let me repeat myself. I think working UTF-8 locales will be cool. :-)

Take care,

Bill
