NetBSD-Bugs archive


Re: standards/47454: terminfo(5) does not have a capability for terminal/display character set

Unix has to know what your terminal can do, a priori, for programs that
attempt to manipulate it in any way to succeed (e.g. vi, emacs, clear, less;
i.e. anything linked with terminfo(3) or curses(3), hell, any program that
#includes <termios.h> or uses the TIOC* ioctl(2) system calls). The failure
mode caused by a mismatch between what your terminal actually is or can do
and what Unix thinks it is or can do, based on the TERM environment variable
(sometimes set from /etc/ttys or provided by remote login programs like ssh),
is old and well understood: "this doesn't look right."

The same goes for character set display capability. We're lucky in that ASCII
is the base assumption of Unix, and that ASCII is also a proper subset of a
large number of character sets (e.g. ISO-8859-1, ISO-2022-JP, UTF-8). You're
really going to lose very badly if the character set your terminal uses does
not have ASCII as a subset: given how common ASCII is, *everything* has to
be converted (e.g. run through iconv(1)) before display. That is, you very
probably can't just "cat a file" [to the tty] unless that file is already in
your terminal's character set.
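
As a rough illustration of the conversion step described above, here is a
minimal Python sketch of what running text through iconv(1) amounts to:
decode from the file's character set, re-encode in the terminal's. The
function name to_terminal_charset and the choice to substitute "?" for
unrepresentable characters are assumptions made for this example, not part
of any existing tool.

```python
def to_terminal_charset(data: bytes, file_charset: str, term_charset: str) -> bytes:
    """Re-encode data from the file's character set to the terminal's.

    Characters the terminal's character set cannot represent are replaced
    with '?', so the result can always be written to the tty, at the cost
    of losing information.
    """
    text = data.decode(file_charset)
    return text.encode(term_charset, errors="replace")
```

Pure ASCII input comes out byte-for-byte unchanged whenever both character
sets have ASCII as a subset, which is why "cat a file" so often appears to
work without any conversion at all.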

The implication for terminals described by terminfo which have downloadable
fonts is that there will have to be terminal names that are a tuple of the
terminal type and the current character set (e.g. "vt200-koi8-r"), and every
time a different character set is downloaded, the TERM environment variable
must change for programs to be able to do the right thing. That is no worse
than the situation today: you already have to change the LANG environment
variable when the terminal character set is changed.
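
To make the tuple idea concrete, here is a small Python sketch that splits
a TERM value such as "vt200-koi8-r" into a terminal type and a character
set. The naming convention itself is only the proposal above, and the
function split_term is hypothetical; as an assumption for this example, a
trailing part counts as a character set only if Python's codec registry
recognises it, so that ordinary names like "xterm-256color" are left whole.

```python
import codecs
from typing import Optional, Tuple


def split_term(term: str) -> Tuple[str, Optional[str]]:
    """Split a TERM value into (terminal type, character set or None).

    HYPOTHETICAL convention: the character set, if present, is the
    trailing hyphen-separated part of the name, e.g. "vt200-koi8-r".
    """
    parts = term.split("-")
    # Try the longest candidate suffix first: "koi8-r" before "r".
    for i in range(1, len(parts)):
        suffix = "-".join(parts[i:])
        try:
            codecs.lookup(suffix)  # raises LookupError for unknown names
        except LookupError:
            continue
        return "-".join(parts[:i]), suffix
    return term, None
```

A program could then look up the first element in terminfo as usual and use
the second to decide whether output needs converting before display.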

What I'm trying to argue is that character set is a capability or
characteristic of the terminal (interface) one uses to talk to Unix, and
therefore terminfo (or termcap) is the database in which we describe such
things.

Semantically, LANG is similar but not the same, in that its intention is to
describe (in part) what language you use and, together with the other locale
variables, what cultural assumptions you have (e.g. sort(1) ordering of
characters, commas instead of periods to separate the integer part of a
number from the decimal fraction, the order of the components of a date).

        Erik <>
