tech-userlevel archive


Re: curses vs non-ASCII



On Wed, Nov 18, 2015 at 11:01:19PM -0500, Mouse wrote:
> >> I'm writing a program with a curses(3) interface, and I'm finding
> >> that non-ASCII octets in strings are getting completely lost [...]
> 
> > Have you called setlocale(3) appropriately?
> 
> No.  I was not calling setlocale() at all.  None of the documentation I
> found gave me reason to think it would make any difference.

In that case the program was running in the standard "C" locale, which
covers ASCII only, so this is no surprise. It is not even about
multibyte locales. You quoted the relevant part of the mbrtowc man
page yourself...

> > E.g. setlocale(LC_CTYPE, "") to pick up the setting from the
> > environment.
> 
> That helps somewhat.  It doesn't itself fix things, but combined with
> setting LC_CTYPE, LANG, or LC_ALL to en_CA.ISO8859-1 in the
> environment, octets that are 8859-1 printables get through - or, more
> precisely, a few 8859-1 printables do; I assume the rest would too.  (I
> haven't tested whether it gives me the rest of what I want, which is
> the 0x80-0x9f octets also being treated as single-octet printables.)

But 0x80-0x9f are not printable in ISO 8859-1; they are control
characters.
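
To check how a given locale classifies a range of octets, something
along these lines can help. It is only a sketch: count_printable() and
count_control() are made-up helper names, not library functions, and
they report on whatever LC_CTYPE locale is selected when they are
called.

```c
#include <ctype.h>

/*
 * Hypothetical helper: count the octets in [lo, hi] that isprint()
 * accepts under the currently selected LC_CTYPE locale.
 */
int
count_printable(int lo, int hi)
{
	int c, n = 0;

	for (c = lo; c <= hi; c++)
		if (isprint((unsigned char)c))
			n++;
	return n;
}

/*
 * Hypothetical helper: same idea, but counting the octets that
 * iscntrl() reports as control characters.
 */
int
count_control(int lo, int hi)
{
	int c, n = 0;

	for (c = lo; c <= hi; c++)
		if (iscntrl((unsigned char)c))
			n++;
	return n;
}
```

Calling count_printable(0x80, 0x9f) and count_control(0x80, 0x9f)
after selecting an ISO 8859-1 locale shows how that locale treats the
range in question; the same two calls on 0x7f show how it treats DEL.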

> Are these two things - (a) that setlocale() has to be called for the
> environment to be recognized and (b) that "" is magic to make it pick
> up the environment - documented anywhere?  The closest I see is a line
> in locale(3) that says that "" `denotes the native environment', but
> without any description of what that means; I'm wondering if I've just
> missed something.

As I said, setlocale() is how you specify that you want a locale other
than the builtin "C" locale. The empty string says "pick from the
environment"; a non-empty string explicitly names the locale to use.

> Is there any documentation on what strings can be put in $LANG et al?
> I guessed en_CA.ISO8859-1 based on the da_DK.ISO8859-1 example in
> nls(7), but doing that seems...suboptimal.

The nearest thing for the default configuration is to look at LOCALES
in src/share/locale, in combination with locale.alias.

> Is there any documentation for someone wanting to create a locale?  It
> seems likely to me that there aren't any existing locales that consider
> all of 0x20-0xff as printable, in which case the least-pain option may
> be to create a locale of my own.  (The latest attempt at building a
> system without these headaches produced _compile_ errors in vi.  I
> can't help wondering why the knobs even exist if attempts to use them
> explode this badly.)

Most locales have at least 0x7f as a control character...

Joerg

