tech-userlevel archive


Re: FreeBSD i18n fonts for wscons



On 30 January 2010 02:38, Valeriy E. Ushakov <uwe%stderr.spb.ru@localhost> 
wrote:
> Matthias Drochner <M.Drochner%fz-juelich.de@localhost> wrote:
>
>>> wsemul_vt100 is our primary emulation. Do we want to optimize our
>>> primary emulation for the scenario where it is used to display output
>>> from programs/devices that use ASCII (or NRC) plus DEC technical and
>>> DEC supplemental graphics charsets or do we want to optimize it for
>>> people who use charsets other than latin1 and use termcap/curses apps?
>>
>> It is the primary emulation because it is the most powerful atm,
>> but that's not set in stone. It is easy to add another one (it has
>> taken me one or two hours to derive "ucons" from "vt100"), and the
>> rest is some changes to etc/wscons.conf.
>> So I'd prefer to leave "vt100" as an emulation which is conservative
>> in its features, but works if one logs into a legacy/embedded/3rd-party
>> system from it, no matter which fonts are loaded locally.
>> For modern uses, there are plenty of names available, just not "vt100".
>
> So if we were to clone wsemul_vt100 under a different name and fix
> G2/G3 handling - how is that different from fixing it directly in
> wsemul_vt100 without cloning it? There's still the mapping issue to
> address regardless of whether we want to fix it in the old code or in
> the cloned code.

I guess it's better to have known working code and experimental code
side by side.

Sure, if the resulting emulation is compatible with the current vt100,
the changes should be merged back; and if it isn't, that is a reason
for some serious thinking about why that is so.

>
>
>>> and given the unicode semantic of the second argument to mapchar (as
>>> used for fun and profit by the two DEC charsets above) *this is why*
>>> people have to lie about their national fonts being "ISO".
>>
>> Well, in an ideal world it shouldn't be necessary to lie...
>> I think I understood what you mean. My idea is that the encoding
>> between userland and tty is completely independent of any downloaded
>> font. The visible result is always similar, to a varying degree of
>> perfection: characters are replaced by some replacement sign if the
>> fonts available to the display can't represent them, or by some
>> less perfect approximation.
>>
>> In any case, it is the wsdisplay emulation which defines which
>> glyph is to be displayed for which input.
>
> But this is not exactly how a real vt does it. I tried to explain
> that in my previous email. Do I need to make a 3rd attempt at that?
>
> For ASCII, NRCs, DEC technical and other DEC charsets a real terminal
> would use a ROM font. In that case the terminal "knows" what it displays
> by construction. In the wsemul case we need to build an equivalent of
> that ROM (so that wsemul has the same "knowledge") by asking the
> wsdisplay (via emulops): "Hey, what should I pass to putchar to draw a
> lower left corner line-drawing glyph?".
>
> But when a loadable font is used you absolutely don't want to know
> anything about "encoding" of data coming to the tty from the system.
> That's how the real vt220 and later versions were. You load something, you
> designate it into, say, G2, then you map G2 into GR. ÂThen you just
> send bytes to the terminal and it displays them obediently.

I think that we are getting lost in technicalities here. The real
terminal hardware had a ROM which said how a particular byte should be
rendered as a glyph, and loadable fonts were just a way to repurpose a
terminal without having to replace the ROM.
There was no mapping or translation in the terminal, just a
byte-indexed array of glyph bitmaps.
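A minimal sketch of that byte-indexed ROM model, in C (all names here
are illustrative, not taken from any real terminal firmware or from
wscons):

```c
#include <stdint.h>

#define GLYPH_H 8  /* 8 scanlines per glyph, one byte of pixels per line */

/* A font ROM is nothing more than 256 bitmaps indexed by the raw byte. */
static uint8_t font_rom[256][GLYPH_H];

/* Rendering a byte is a single table lookup -- no mapping, no translation. */
static const uint8_t *glyph_for_byte(uint8_t c)
{
    return font_rom[c];
}

/* "Loading" a font just overwrites the table, repurposing the terminal. */
static void load_font(uint8_t newfont[256][GLYPH_H])
{
    for (int i = 0; i < 256; i++)
        for (int j = 0; j < GLYPH_H; j++)
            font_rom[i][j] = newfont[i][j];
}
```

If the uploaded font fails to load, the old table stays in place and the
same bytes simply render as the old glyphs, which is exactly the
degraded-but-readable behaviour described below.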

The various legacy character sets are in part the result of hardware
limitations of the terminals.

Obviously, if you replaced a terminal with a fixed ROM by a new one
which supports uploading fonts, the simplest way to do that would be
to copy the ROM of the old terminal. If you wanted to support a
different language which requires some accented characters not present
in the default font, you would want the new font to reuse as many
glyphs from the old one as possible - both to simplify development of
the new font and to limit the damage caused by the font failing to
load. This gave birth to odd encodings like the national replacement
encodings, which place national characters at positions which hold
similar-looking (though often non-letter) glyphs in the default ROM
font. Failing to load the font then renders text ugly but readable.

Different encodings have crept in because different system vendors
would not use the same encoding for the same language, in part because
of IP rights and in part because of vendor lock-in practices. Partly
this might also be caused by terminal limitations - the line-drawing
characters might be supported only at certain positions, for example.

If you want NetBSD to be compatible with various limited hardware you
probably *do* want the ability to remap stuff in the kernel, so that
these differences are hidden from the applications and you do not need
a different locale for every piece of old crappy hardware out there.
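Such an in-kernel remap could be as simple as a per-display 256-entry
table applied before a byte reaches the display; a sketch (the table
and function names are hypothetical, not existing wscons interfaces):

```c
#include <stdint.h>

/* Hypothetical per-display remap: application byte -> hardware font
 * index. Identity by default; odd hardware installs a different table. */
static uint8_t remap[256];

static void remap_init_identity(void)
{
    for (int i = 0; i < 256; i++)
        remap[i] = (uint8_t)i;
}

/* Applications always write in one agreed encoding; the kernel hides
 * where the particular hardware actually keeps each glyph. */
static uint8_t to_hw(uint8_t app_byte)
{
    return remap[app_byte];
}
```

With the identity table this costs one array lookup per byte and
changes nothing; quirky terminals just install a non-identity table.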

Or you might say no to this, import some inversion of luit(1), and say
that anybody running on real hardware (as opposed to a terminal
emulator which can display anything and everything you can come up
with) should use it.


>
> This is why people are forced to use "ISO" currently - because due to
> unicode and latin1 numerology they will get from the current code the
> "leave me alone" 1:1 mapping for their 160-255 range.
>
> You really want wsemul to ask wsdisplay a different kind of question
> here, a companion of mapchar with the semantic of the second "int c"
> argument changed. For most 8-bit fonts the mapping is gonna be 1:1, but
> a non-trivial mapping is gonna be needed for text-mode VGA DOS codepage
> fonts used to display some other charset. We can clone wsemul_vt100,
> but this mapping problem - how to properly express the kind of question
> like the above - is *not* in wsemul.
>
> Speaking about legacy systems, if you get the loadable fonts right,
> you not only solve the national 8-bit charset problem, you actually
> *improve* support for legacy systems, because then you can do stuff
> like this (just what are the unicode codepoints for some of the glyphs
> used there?):
>
>    http://vt100.net/dec/vt320/snap-animacp-large.png
>    http://vt100.net/dec/vt320/soft_characters
>
> using wscons on your jornada 620lx (that is too underpowered to run
> X11) that just happened to be in your pocket when you need it. (A perl
> script to convert a downloadable vt font to a wsfont is left as an
> exercise to the user.)
>
> Continuing with the legacy theme, let's say for the sake of argument
> that I want to use a charset like CSX+ (for Sanskrit transliteration).
> The Unicode mapping for some of the codepoints may contain 4 unicode
> characters. Somehow I seriously doubt we're gonna have *that* level of
> unicode support in our kernel any time soon. Yet it's *trivial* with
> the proper semantic - you can just make a font that matches your
> custom esoteric encoding, load it and it just works.
>
>
> I've been trying to start a technical discussion on this topic because
> I believe that I'm correctly outlining the way real terminals work, the
> way wsemul_vt100 is lacking in this area, and that fixing this problem
> is gonna improve support for both legacy systems and for users of
> national 8-bit charsets.

Who still uses an 8-bit national charset (except Windows)? It's so
last century.

I think the way to go these days is Unicode support. It is not
sufficient for some arcane things (and some not so arcane things in
some exotic languages), but for displaying messages on the terminal in
the user's national language it should suffice.

This unfortunately means using UTF-8, as it is the representation
compatible with legacy applications running in the "C" locale.

The representation suitable for lookup tables and such is UTF-32,
though. This means that the UTF-8 coming into the kernel should be
converted to UTF-32.
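That conversion is small enough to live in the kernel; a minimal
decoder sketch (it does not police overlong forms, surrogates, or the
U+10FFFF ceiling, which a real implementation should):

```c
#include <stdint.h>
#include <stddef.h>

/* Decode one UTF-8 sequence from s[0..len) into *out as UTF-32.
 * Returns the number of bytes consumed, or 0 on malformed/short input. */
static size_t utf8_to_utf32(const uint8_t *s, size_t len, uint32_t *out)
{
    if (len == 0)
        return 0;

    uint8_t b = s[0];
    uint32_t cp;
    size_t n;

    if (b < 0x80) {                   /* plain ASCII */
        *out = b;
        return 1;
    } else if ((b & 0xe0) == 0xc0) {  /* 2-byte sequence */
        cp = b & 0x1f; n = 2;
    } else if ((b & 0xf0) == 0xe0) {  /* 3-byte sequence */
        cp = b & 0x0f; n = 3;
    } else if ((b & 0xf8) == 0xf0) {  /* 4-byte sequence */
        cp = b & 0x07; n = 4;
    } else {
        return 0;                     /* stray continuation byte */
    }

    if (len < n)
        return 0;
    for (size_t i = 1; i < n; i++) {
        if ((s[i] & 0xc0) != 0x80)    /* must be a continuation byte */
            return 0;
        cp = (cp << 6) | (s[i] & 0x3f);
    }
    *out = cp;
    return n;
}
```

For example, the three bytes E2 82 AC decode to U+20AC; anything that
fails to decode can be rendered as the replacement sign mentioned
earlier in the thread.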

What I would like to see is

 - support for full Unicode on console devices which are backed by
graphics rather than a hardware terminal or VGA text mode
 - support for uploading fonts and encoding tables, so that an
application can be written which displays pretty much all of Unicode on
a VGA text console by changing the font and font mapping on the fly

Both should be supported by loading differently sized fonts (up to 256
or 512 glyphs for VGA, up to full Unicode for graphics) and a
corresponding map table which says which Unicode codepoint is where in
the font. Installing a map should also support odd hardware terminals
without the need for a special locale, because you can say what the
ROM of your terminal looks like.
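The font-plus-map-table pairing could look roughly like this (all
structure and function names are made up for illustration; they are
not existing wsfont interfaces):

```c
#include <stdint.h>

/* Hypothetical pairing of a loaded font with a map table that says
 * which Unicode codepoint lives at which font position. */
struct font_map {
    uint32_t codepoints[512];  /* up to 512 glyphs for VGA */
    unsigned nglyphs;
};

/* Codepoint -> font position. Linear search is enough for a sketch; a
 * real driver would keep the table sorted or hashed. */
static int glyph_index(const struct font_map *m, uint32_t cp)
{
    for (unsigned i = 0; i < m->nglyphs; i++)
        if (m->codepoints[i] == cp)
            return (int)i;
    return -1;  /* not representable: caller draws a replacement sign */
}
```

An odd hardware terminal would simply get a map describing its ROM, so
no special locale is needed on the host side.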

The inverse is key mapping. Since it is possible to write on wscons I
guess it works already. You want to convert arbitrary strings of
keycodes to some symbolic key representation (hardware key mapping)
and then have a conversion of this symbolic key representation (+
modifier state) to strings that should be sent to the terminal (user
key mapping).
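The two-stage pipeline described above can be sketched as follows (toy
tables and hypothetical names; real wskbd keymaps are considerably more
elaborate):

```c
#include <stdint.h>

/* Stage 1 output: symbolic keys, independent of the hardware. */
enum sym_key { KS_a = 0, KS_Enter, KS_max };

#define MOD_SHIFT 1u

/* Stage 1: hardware key map (raw keycode -> symbolic key). */
static enum sym_key hw_map(uint8_t keycode)
{
    return keycode == 0x1c ? KS_Enter : KS_a;  /* toy two-entry table */
}

/* Stage 2: user key map ((symbolic key, modifiers) -> bytes to send). */
static const char *user_map(enum sym_key k, unsigned mods)
{
    static const char *tab[KS_max][2] = {
        [KS_a]     = { "a",  "A"  },
        [KS_Enter] = { "\r", "\r" },
    };
    return tab[k][(mods & MOD_SHIFT) ? 1 : 0];
}
```

Keeping the two stages separate is what lets you swap the hardware map
per keyboard and the user map per user, as the text suggests.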

Thanks

Michal

