Subject: Re: utf-8 and userland
To: Wolfgang S. Rupprecht <>
From: Bill Studenmund <>
List: tech-userlevel
Date: 03/12/2004 12:07:19

On Fri, Mar 12, 2004 at 10:44:04AM -0800, Wolfgang S. Rupprecht wrote:
> The fly in the ointment is that some programs mash the high bits or
> otherwise censor certain bytes.  Most notably ls(1) has a routine
> called safe_print() that is anything but safe for UTF-8.  Is this just
> a hold-over that can be switched off (or at least turned down a bit)
> when the LC_LANG is UTF-8?  I'm willing to submit patches if it will
> move things along.  I just don't want to bother if nobody wants it.

I think that'd be cool. Though I'm not sure if LC_LANG is the right place
to look. Wouldn't nl_langinfo(CODESET) be the right thing to look at?
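Something like the following is roughly what I have in mind. It is only a
sketch: codeset_is_utf8() is a made-up name, and it assumes the codeset
string is spelled exactly "UTF-8", which may not hold on every system.

```c
#include <langinfo.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the current LC_CTYPE codeset is
 * reported as "UTF-8", 0 otherwise.  The caller is expected to have
 * called setlocale(LC_CTYPE, "") first; without that the program is
 * still in the "C" locale and this returns 0.
 */
static int
codeset_is_utf8(void)
{
	const char *cs = nl_langinfo(CODESET);

	/* Assumption: the platform spells the codeset exactly "UTF-8". */
	return cs != NULL && strcmp(cs, "UTF-8") == 0;
}
```

A program such as ls would call setlocale(LC_CTYPE, "") once at startup
and then consult a check like this before deciding how to filter bytes.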

So I think it's cool, but I don't think UTF-8 is an LC_LANG value; it's a
qualifier of how the locale stores characters. Bashing ls to do the right
thing here will be good.
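For the ls side, one possible approach is to decode with mbrtowc() and
test each character with iswprint() rather than filtering byte by byte.
The sketch below is not the existing safe_print() code; sanitize_mb() is a
hypothetical helper that writes into a caller-supplied buffer, and
replacing rejected characters with '?' is just one choice.

```c
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper, not the real safe_print() from ls(1).
 * Copies src into dst, keeping every complete multibyte character that
 * iswprint() accepts in the current locale and writing '?' for anything
 * else (control characters, invalid or truncated sequences).
 * Always NUL-terminates dst when dstlen > 0; returns bytes written,
 * not counting the terminating NUL.  Output is truncated if dst is full.
 */
static size_t
sanitize_mb(const char *src, char *dst, size_t dstlen)
{
	mbstate_t st;
	size_t remaining = strlen(src);
	size_t out = 0;

	if (dstlen == 0)
		return 0;
	memset(&st, 0, sizeof(st));

	while (remaining > 0) {
		wchar_t wc;
		size_t n = mbrtowc(&wc, src, remaining, &st);

		if (n == (size_t)-1 || n == (size_t)-2) {
			/* Invalid or incomplete: emit '?', skip one byte. */
			if (out + 1 >= dstlen)
				break;
			dst[out++] = '?';
			src++;
			remaining--;
			memset(&st, 0, sizeof(st));	/* reset decoder state */
			continue;
		}
		if (n == 0)
			break;		/* decoded a NUL; treat as end of input */

		if (iswprint((wint_t)wc)) {
			/* Printable: copy the original bytes unchanged. */
			if (out + n >= dstlen)
				break;
			memcpy(dst + out, src, n);
			out += n;
		} else {
			if (out + 1 >= dstlen)
				break;
			dst[out++] = '?';
		}
		src += n;
		remaining -= n;
	}
	dst[out] = '\0';
	return out;
}
```

Because the decision is left to mbrtowc() and iswprint(), the same code
would follow whatever codeset the locale reports instead of hard-coding
UTF-8.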

Let me repeat myself. I think working UTF-8 locales will be cool. :-)

Take care,

