tech-userlevel archive


Re: curses vs non-ASCII



On Fri, Nov 20, 2015 at 07:04:19AM -0500, Mouse wrote:
> > OK, I have just caught up with this thread and I do find it a little
> > disconcerting that you claim things are not being looked after when
> > you didn't get a response for a few days.
> 
> I'd have to see the specific bit of text you're talking about to be
> sure, but I suspect that the "not being looked after" was more general,
> more like "5.2 shipped with it being this hard to figure out how to get
> the previous behaviour" than "why is nobody replying to my list mail".
> I know it can take a while for people to get to mail, and it's not as
> if anyone owes me a response in any case.
> 

Yes, it was the "not being looked after" bit that prompted that
comment.  I know that the internationalisation stuff is hard, and there
is a presumption that if you move away from the C locale you implicitly
know what you are doing.

> > This totally fails in a couple of cases, [...]
> 
> Sure.  I can understand why curses would want to understand such things
> as dead diacritics and multi-cell characters, same as I can understand
> why curses would want to understand what characters are printable.
> That curses is capable of those is not at issue here for me.  The issue
> I have - to the extent that I have one - is that the i18n stuff makes
> it harder than I think it should be to get the old behaviour back.
> 

It really could be argued, though, that the old behaviour was broken,
since it broke terminal independence.  The old behaviour left it up to
the terminal to interpret the character, and this may have led to a
difference in cursor position between the real screen and what curses
believed.  Now unknown characters are not fed to the screen.  I guess
that by setting the locale you are indicating that you have a screen
capable of interpreting the character set you are attempting to display
on it.  Not very helpful, but probably more correct than random cursor
motions.

> >> Yes, UTF-8 was the first thing I thought of when I saw the behaviour
> >> too.  I didn't investigate in great detail; I was more concerned
> >> with making it go away than I was with exploring the envelope of the
> >> behaviour.
> 
> I assume you have by now noticed that I managed to reproduce it with a
> small test program?
> 

Yes I did, and I see that others have said it works fine for them :)
This is the fun thing about testing screen-based applications: if
something goes wrong, it can be anything from the characters printed
to bugs in libraries to bugs in terminal emulators to bugs in the
coder's expectations.  Weeding out where the problem lies can be very
difficult.

> > At a guess I would say that you have stumbled across a character
> > sequence that happens to be valid multibyte character but I haven't
> > looked closely.  What does mbrtowc(3) do with that sequence?
> 
> I don't know yet, and I don't really have time to test that now.  I
> should be able to put together a test within a day or two.
> 

That may give some insight.
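
For anyone wanting to try the mbrtowc(3) check, here is a minimal sketch
of the kind of probe I have in mind.  The helper names probe_sequence()
and report_sequence() are made up for illustration and are not from any
library; the bytes to feed in are whatever sequence triggers the
behaviour, which I have not reproduced here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: decode the first multibyte character in buf
 * (at most len bytes) using the current locale, starting from the
 * initial shift state.  Returns whatever mbrtowc(3) returns.
 */
size_t
probe_sequence(const char *buf, size_t len, wchar_t *out)
{
	mbstate_t st;

	memset(&st, 0, sizeof(st));	/* initial conversion state */
	return mbrtowc(out, buf, len, &st);
}

/* Hypothetical helper: print a human-readable summary of the result. */
void
report_sequence(const char *buf, size_t len)
{
	wchar_t wc = 0;
	size_t n = probe_sequence(buf, len, &wc);

	if (n == (size_t)-1)
		printf("invalid sequence (errno %d)\n", errno);
	else if (n == (size_t)-2)
		printf("incomplete sequence, more bytes needed\n");
	else if (n == 0)
		printf("null character\n");
	else
		printf("%zu byte(s), wide character value 0x%lX\n",
		    n, (unsigned long)wc);
}
```

To use it, call setlocale(LC_ALL, "") from <locale.h> first so that the
locale from the environment is in effect, then pass the suspect bytes
and their length to report_sequence().  If the result is a positive
byte count rather than "invalid sequence", the bytes are being taken as
a valid multibyte character in that locale, which would fit my guess
above.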

-- 
Brett Lymn
Let go, or be dragged - Zen proverb.

