Subject: Re: [Summer of Code]Wide character support for curses
To: Ruibiao Qiu <ruibiao@arl.wustl.edu>
From: Thomas Dickey <dickey@radix.net>
List: tech-userlevel
Date: 06/26/2005 15:52:56
On Sun, Jun 26, 2005 at 02:19:14PM -0500, Ruibiao Qiu wrote:
> What I wanted to say is that many believe that ncurses (with its rich set 
> of features) consumes more resources than NetBSD's stripped-down version of 
> the curses library, which works on as many platforms as possible.

Actually, ncurses works on more platforms (with essentially the same
functionality on each).  Neither ncurses nor NetBSD curses contains
many system dependencies (outside of mouse support and SIGWINCH handling).
As NetBSD curses is modified to fill in the other X/Open curses functions,
the space used for termcap vs. terminfo becomes a smaller fraction of the
total.
 
> >None of the mailing list comments touched on the hard parts, nor does this:
> >	http://www.arl.wustl.edu/~ruibiao/SoC/NetBSD-curses/
> >(for instance, input is not mentioned).
> 
> Could you be more specific about the hard parts based on your ncurses 
> development experience?  That will be very valuable to our project.

For instance, assembling multibyte/multicolumn characters can be a
problem, since you have to keep in mind that some columns on the screen
do not actually contain a character (they are the trailing columns of a
wide character).  I assume you also would do this with an attribute, but
coding it is not straightforward, since the logic that surrounds waddch
is not necessarily aware of the distinctions.
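
A minimal sketch of the "columns without a character" idea, to make the
bookkeeping concrete.  Everything here (struct cell, CELL_CONT, row_put,
row_lead) is hypothetical and is not taken from NetBSD curses or ncurses;
the column width is passed in by the caller, who in a real library would
presumably get it from wcwidth() in the current locale.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical attribute bit marking a cell as a trailing column of a
 * wide character; the name is an assumption made for this sketch. */
#define CELL_CONT 0x1u

struct cell {
    wchar_t  ch;    /* character in the lead column, 0 in trailing columns */
    unsigned attr;  /* CELL_CONT is set on trailing columns */
};

/* Store ch, which occupies `width` columns, at column x of a row that is
 * `cols` wide.  Returns the next free column, or -1 if it does not fit. */
static int row_put(struct cell *row, int cols, int x, wchar_t ch, int width)
{
    assert(row != NULL && width >= 1);
    if (x < 0 || x + width > cols)
        return -1;
    row[x].ch = ch;
    row[x].attr = 0;
    for (int i = 1; i < width; i++) {
        row[x + i].ch = 0;
        row[x + i].attr = CELL_CONT;
    }
    return x + width;
}

/* Map any column to the column holding the character that covers it. */
static int row_lead(const struct cell *row, int x)
{
    assert(row != NULL && x >= 0);
    while (x > 0 && (row[x].attr & CELL_CONT))
        x--;
    return x;
}
```

Code that moves the cursor, overwrites, or deletes a column would need to
go through something like row_lead() first; the sketch does not handle
overwriting half of an existing wide character.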
 
> I did look into the input routines, but I thought there were no major changes 
> needed there.  It uses getchar() to get input characters from input, does 

not getchar(), but getch()

> multi-character assembling, and calls waddch() for echo.  I will revisit 
> this during the design phase.

true - but it also has to know about some of the internals of waddch()
(or the latter has to be smarter), since the intermediate display, painted
as the bytes are accumulated, won't be complete.  While accumulating,
you may find that there is not enough room on the line, so the incomplete
multi-column character has to be erased and restarted on the next line.
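
A rough sketch of that sequence: collect bytes until the character is
complete, and only then decide whether it fits on the current line.  It
assumes UTF-8 input, which the discussion above does not specify, and all
names (struct mb_accum, utf8_len, mb_feed, place) are hypothetical.  The
sketch does not reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulator for one UTF-8 sequence. */
struct mb_accum {
    unsigned char buf[4];
    int have;   /* bytes collected so far for the pending character */
    int need;   /* total bytes expected, taken from the lead byte */
};

/* Length of a UTF-8 sequence from its lead byte, 0 if not a lead byte. */
static int utf8_len(unsigned char b)
{
    if (b < 0x80) return 1;
    if ((b & 0xE0) == 0xC0) return 2;
    if ((b & 0xF0) == 0xE0) return 3;
    if ((b & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Feed one byte.  Returns 1 when buf[0..need) holds a complete character,
 * 0 when more bytes are needed, -1 on a malformed sequence (the offending
 * byte is dropped and the accumulator is reset). */
static int mb_feed(struct mb_accum *a, unsigned char b)
{
    assert(a != NULL);
    if (a->have == 0) {
        a->need = utf8_len(b);
        if (a->need == 0)
            return -1;
    } else if ((b & 0xC0) != 0x80) {
        a->have = 0;
        return -1;
    }
    a->buf[a->have++] = b;
    if (a->have < a->need)
        return 0;
    a->have = 0;    /* ready for the next character; need keeps the length */
    return 1;
}

/* Once the width is known, move to the start of the next line if the
 * character would not fit in the remaining columns. */
static void place(int cols, int *y, int *x, int width)
{
    assert(y != NULL && x != NULL && width >= 1);
    if (*x + width > cols) {
        (*y)++;
        *x = 0;
    }
}
```

The awkward part is everything between the first byte and the call to
place(): anything echoed for the partial character has to be undone if
the completed character ends up on the next line.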

Another complication which occurs to me at the moment is that most
applications would prefer reading a wide character.  wgetch() returns
bytes from the encoding of the wide character, and waddch() adds bytes
in the other direction.  wget_wch() is the function that X/Open lists
for reading a wide character.  In ncurses, I've got the wide-character
functions where possible built on top of the narrow-character functions -
even doing that takes time to build up.
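
As an illustration of layering a wide-character read on a narrow one,
here is a minimal sketch that pulls bytes from a callback and assembles
them with the standard mbrtowc().  The byte_source type, read_wide() and
from_string() are hypothetical names; the callback merely stands in for
wgetch(), and the decoding depends on whatever locale the program has
set.  This is not how ncurses implements wget_wch().

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical byte source standing in for wgetch(); returns -1 when no
 * more input is available. */
typedef int (*byte_source)(void *ctx);

/* Read bytes until they form one wide character in the current locale.
 * Returns 0 on success, -1 on end of input or an invalid sequence. */
static int read_wide(byte_source next, void *ctx, wchar_t *out)
{
    mbstate_t st;

    assert(next != NULL && out != NULL);
    memset(&st, 0, sizeof st);          /* initial conversion state */
    for (;;) {
        int b = next(ctx);
        if (b < 0)
            return -1;
        char c = (char)b;
        size_t r = mbrtowc(out, &c, 1, &st);
        if (r == (size_t)-1)
            return -1;                  /* invalid sequence */
        if (r != (size_t)-2)
            return 0;                   /* complete character stored */
        /* (size_t)-2: sequence still incomplete, fetch another byte */
    }
}

/* Demonstration byte source that walks a NUL-terminated string. */
static int from_string(void *ctx)
{
    const char **p = ctx;
    if (**p == '\0')
        return -1;
    return (unsigned char)*(*p)++;
}
```

A real wget_wch() also has to separate function keys from ordinary
characters, which this sketch leaves out entirely.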

btw - Looking back at your webpage - waddbytes() is NetBSD-specific.
 
> >Looking forward to the benchmarking results which you'll publish in 
> >October.
> 
> I would highly appreciate it if you suggest some specific tests that we 
> should include in our benchmark tests.  Thanks.

I'm interested in the "smaller" and "faster" metrics.  Probably you won't
complete all of the functions in ncurses, but by focusing on those that
can be used to construct a usable application, you could measure the
static and runtime sizes with the different variations (narrow/wide,
NetBSD/ncurses).  I'd suggest a simple file-viewer - it would sidestep
the issue of multibyte input, could be easily scripted (and hence
measured in a variety of ways, e.g., timing for refresh, scrolling, etc.).
Multibyte input could be added to that (e.g., a search command using
winnstr to fetch data from the display).  Testing it with data that has
nonspacing characters and data that uses double-width characters would
exercise the waddch() logic.
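
For the timing side, a small harness along these lines might be enough
to start with.  It is only a sketch: time_calls() and count_call() are
hypothetical names, and clock() measures CPU time rather than elapsed
time, so terminal I/O delays would need a different clock.

```c
#include <assert.h>
#include <time.h>

/* Run fn(ctx) `reps` times and return the average CPU seconds per call. */
static double time_calls(void (*fn)(void *), void *ctx, int reps)
{
    assert(fn != NULL && reps > 0);
    clock_t start = clock();
    for (int i = 0; i < reps; i++)
        fn(ctx);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / reps;
}

/* Stand-in workload that only counts its calls; a real test would call
 * refresh() or a scrolling function on the viewer's window here. */
static void count_call(void *ctx)
{
    (*(int *)ctx)++;
}
```

Running the same scripted viewer session through such a harness for each
of the four library variations would give comparable numbers.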

-- 
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net