Subject: Re: Type for wide characters
To: None <itojun@iijlab.net>
From: Julian Coleman <J.D.Coleman@newcastle.ac.uk>
List: tech-userlevel
Date: 10/05/1999 10:10:27

itojun@iijlab.net wrote:
> 	(I have no access to wchar_t capable libcurses standard docs, so
> 	correct me if I'm wrong)

The SUS2 documentation is available at :

  http://www.opengroup.org/onlinepubs/007908799/toc.htm

Follow 'X/Open Curses' and 'Curses Overview'.

> 	Could you please let me know about more detail?  wchar_t is for
> 	handling both wide chars and normal (ASCII) chars in a uniform manner.
> 	Therefore, it makes more sense to me if you always use wchar_t and
> 	attr_t separated.

The BSD curses uses a struct for storing a character and its attributes.
Originally :

	typedef struct {
		char	ch;	/* Character */
		char	attr;	/* Attributes */
	} __LDATA;

The SUS spec says that 8 bit characters and attributes are stored in a
chtype.  Wide characters use wchar_t and wide character attributes use
attr_t.

I was hoping to make wchar_t and chtype the same type and make this struct :

	typedef struct {
		chtype	ch;	/* Character and attributes */
		attr_t	attr;	/* Wide attributes */
	} __LDATA;
	
Now, this is fine as long as wide characters don't use more than 16 bits
(17 at a pinch), as the 8 bit attributes can be stored along with wide
characters in chtype.  However, if CJK ideographs need character values
above 65535, then this is not possible.  I'm inclined to use :

	typedef struct {
		wchar_t	ch;	/* Character */
		attr_t	attr;	/* Wide attributes */
	} __LDATA;

internally and do the chtype <-> wchar_t + attr_t conversion in those
functions that are passed or return chtypes.
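
As a rough illustration of that conversion, here is a minimal sketch.  All
of the names (sk_chtype, sk_ldata, sk_from_chtype and so on) and the bit
layout (character in the low 8 bits, attributes in the next 8) are
assumptions made up for the example, not what any curses actually uses :

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types and bit layout, for illustration only. */
typedef uint32_t sk_chtype;	/* low 8 bits: character, next 8: attributes */
typedef uint32_t sk_wchar;	/* stand-in for wchar_t */
typedef uint32_t sk_attr;	/* stand-in for attr_t */

#define SK_CHARMASK	0x000000ffU	/* assumed character bits */
#define SK_ATTRMASK	0x0000ff00U	/* assumed attribute bits */

typedef struct {
	sk_wchar	ch;	/* Character */
	sk_attr		attr;	/* Wide attributes */
} sk_ldata;

/* Split an incoming chtype into the internal character + attributes. */
static sk_ldata
sk_from_chtype(sk_chtype c)
{
	sk_ldata d;

	d.ch = c & SK_CHARMASK;
	d.attr = c & SK_ATTRMASK;
	return d;
}

/* Rebuild a chtype; only meaningful for characters that fit in 8 bits. */
static sk_chtype
sk_to_chtype(const sk_ldata *d)
{
	assert(d->ch <= SK_CHARMASK);
	return (sk_chtype)(d->ch | (d->attr & SK_ATTRMASK));
}
```

With this layout the wide character functions would use the struct members
directly, and only the chtype entry points would pay for the masking.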

Now, to the crux of my original message.  Bad things (TM) happen if the
two members of this struct are not the same size (due to the way the library
does comparisons).  Thus it would be useful to know what type to use for
wchar_t, so that I can make attr_t the same.  I've had a request to make it
a fixed width type, so how about u_int32_t?
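
To show the sort of constraint I mean, here is a sketch that assumes the
comparison is a bytewise memcmp() over the whole struct (an assumption for
the example; the real library code may differ).  The cmp_* names are
hypothetical and the compile time checks need a C11 compiler :

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed width stand-ins for wchar_t and attr_t. */
typedef uint32_t cmp_wchar;
typedef uint32_t cmp_attr;

typedef struct {
	cmp_wchar	ch;	/* Character */
	cmp_attr	attr;	/* Wide attributes */
} cmp_cell;

/* Both members the same size, and no padding between or after them. */
_Static_assert(sizeof(cmp_wchar) == sizeof(cmp_attr),
    "character and attribute types must be the same size");
_Static_assert(sizeof(cmp_cell) == sizeof(cmp_wchar) + sizeof(cmp_attr),
    "struct must not contain padding");

/* Bytewise comparison; safe only because the checks above hold. */
static int
cmp_cell_equal(const cmp_cell *a, const cmp_cell *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

If the two members differed in size, the compiler could insert padding
with indeterminate contents, and a bytewise comparison of two otherwise
identical cells could then fail.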

> 	Of course, for characters with width-on-screen bigger than normal
> 	ASCII chars, you will need "offset on screen" field in backing store
> 	structure.  You may want to fill the field with 0 for normal (ASCII)
> 	characters.

Hmm.  The spec has this to say :

    `Some character sets define multi-column characters that occupy more
     than one column position when displayed on the screen. 

     Writing a character whose width is greater than the width of the
     destination window is an error.'

Admittedly, at the moment, I'm concentrating on getting our curses to match
the spec for the 8 bit characters and I haven't considered how to handle
wide characters in too much detail.  The main thing is that the interface
is (IMHO) sufficiently different to require a major version bump, and, thus,
I'd like the internals to be able to handle wide characters without another
version bump later.

J

-- 
                         Of course it runs NetBSD
                          http://www.netbsd.org/