tech-userlevel archive


Re: curses vs non-ASCII



>>> However, I do have reason to think mbrtowc is part of the
>>> precipitate here; specifically, by default, the 5.2 curses not only
>>> drops the non-ASCII octet, but sometimes eats the following octet
>>> as well.  Even without any setlocale().
>> Test case, please.

I have a test case.  The program is appended after my signature.  In
case such details matter (though I can't imagine why), I called it
zmbtest.c and I built it with "/usr/bin/gcc -o zmbtest zmbtest.c -g
-lcurses -ltermcap" on amd64 (both my 5.2 machines are amd64).  As a
read of the code will tell you, typing anything but . will make it
regenerate and redisplay the output, whereas typing . will make it
clean up and exit.

On running this on 1.4T or 4.0.1, I see the list of strings, with the
glyph for \267 (e.g., centred dot when using 8859-1) at the start of the
line it belongs to in the code: exactly what I'd expect from "throw
what you're given at the screen".  Typing space, even repeatedly, does
not change the display.

On 5.2, on startup, I see the list displayed as

  54-40: Trusted By Millions
  Abba: Abba
  Abba: Abba Live
  Abba: Thank You For The Music (Disc 1)
  Abba: Thank You For The Music (Disc 2)
 Abba: Thank You For The Music (Disc 3)
  Abba: Thank You For The Music (Disc 4)
  Abba: Voulez-Vous
  Ace Of Base: Best Of

Then, on typing space, the line in question changes to

Abba: Thank You For The Music (Disc 3))

Another space changes it to

 Abba: Thank You For The Music (Disc ))

and further spaces alternate between those last two.

I'd be interested to know how this compares to your experience of the
same test program.  In particular, if you _don't_ see any such
misbehaviour, I'd like to try to track down the relevant difference.
When doing my test run, "printenv | egrep LANG\|LC_" produced no output
(though, given the lack of a setlocale() call, I would not expect that
to matter).
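
Since mbrtowc is the suspect, one way to see what the C library itself
does with the octet when no setlocale() call has been made is a sketch
like the one below.  probe_octet is a hypothetical helper, not part of
the test program, and it assumes nothing about how curses actually calls
mbrtowc.  The result for \267 is implementation-dependent, so none is
claimed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: feed a single octet to mbrtowc() with a fresh
   conversion state and hand back mbrtowc()'s return value unchanged.
   (size_t)-1 means an invalid sequence, (size_t)-2 an incomplete one;
   otherwise it is the number of octets consumed. */
static size_t probe_octet(unsigned char octet, wchar_t *out)
{
 mbstate_t st;
 char c;

 assert(out != NULL);
 memset(&st, 0, sizeof(st));   /* initial conversion state */
 c = (char)octet;
 return mbrtowc(out, &c, 1, &st);
}
```

Calling probe_octet(0267, &wc) and printing the return value on each
release would show whether the library accepts, rejects, or waits for
more input after that octet.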

Running it under script(1) and examining the typescript (with
charset-paranoid tools like "hexdump -C") makes it clear this is not
due to the terminal emulator mishandling anything.  I also started up a
stock xterm and ran the test in that; I see the same (mis)behaviour
there as in the terminal emulator I've been using for most of this
work.  This is equally true with xterm from /usr/X11R7/bin on the 5.2
machine or the xterm from X11R6.4p3 (those are the two at ready hand).
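
The typescript check can also be done mechanically.  The following is a
minimal sketch, assuming the typescript has already been read into
memory; count_octet is a hypothetical helper, not something from the
test program.  Counting \267 in the capture shows whether curses wrote
the octet at all, independent of what any terminal emulator then does
with it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how often one octet value occurs in a
   buffer, e.g. the contents of a typescript read into memory. */
static size_t count_octet(const unsigned char *buf, size_t len,
                          unsigned char target)
{
 size_t i;
 size_t n;

 assert(buf != NULL || len == 0);
 n = 0;
 for (i = 0; i < len; i++)
  if (buf[i] == target) n++;
 return n;
}
```

A count of zero for 0267 in a capture of one replot() pass would mean
the octet was dropped before it ever reached the terminal.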

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse%rodents-montreal.org@localhost
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

#include <curses.h>

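/* Redraw the whole list.  Only line 5 starts with the non-ASCII octet
   \267; every other line starts with two spaces. */
static void replot(void)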
{
 move(0,0);
 addstr("  54-40: Trusted By Millions");
 move(1,0);
 addstr("  Abba: Abba");
 move(2,0);
 addstr("  Abba: Abba Live");
 move(3,0);
 addstr("  Abba: Thank You For The Music (Disc 1)");
 move(4,0);
 addstr("  Abba: Thank You For The Music (Disc 2)");
 move(5,0);
 addstr("\267 Abba: Thank You For The Music (Disc 3)");
 move(6,0);
 addstr("  Abba: Thank You For The Music (Disc 4)");
 move(7,0);
 addstr("  Abba: Voulez-Vous");
 move(8,0);
 addstr("  Ace Of Base: Best Of");
 move(LINES-1,COLS-1);
}

int main(void)
{
 initscr();
 noecho();
 cbreak();
 clearok(stdscr,TRUE);
 /* Redraw on every keystroke; '.' cleans up and exits. */
 while (1)
  { replot();
    refresh();
    if (getch() == '.') break;
  }
 endwin();
 return(0);
}

