Subject: pkg/37419: x11/xterm 229 mishandles combining characters
To: None <pkg-manager@netbsd.org, gnats-admin@netbsd.org,>
From: None <rhialto@falu.nl>
List: pkgsrc-bugs
Date: 11/22/2007 11:50:00
>Number:         37419
>Category:       pkg
>Synopsis:       x11/xterm 229 mishandles combining characters
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    pkg-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Nov 22 11:50:00 +0000 2007
>Originator:     Rhialto
>Release:        NetBSD 3.0
>Organization:
	
>Environment:
	
	
System: NetBSD radl.falu.nl 3.0 NetBSD 3.0 (Radl's Pervasion of the Incorrect Chord) #2: Sun Nov 26 21:46:18 CET 2006 root@radl.falu.nl:/usr/src/sys/arch/amd64/compile/RADL amd64
Architecture: x86_64
Machine: amd64
>Description:
	If you cat(1) the UTF-8 demo file from
	http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-demo.txt to an
	xterm started with 
	
	    LC_CTYPE=en_US.UTF-8 ./xterm -u8 -class UXTerm
	    
	(which is basically what the uxterm script does), the combining
	characters are not rendered combined. For instance, the part of the
	file just above the text

	  (The above is a two-column text. If combining characters are handled
	  correctly, the lines of the second column should be aligned with the
	  | character above.)

	is misformatted.

	Worse, if xterm is started without "LC_CTYPE=en_US.UTF-8" in the
	environment, catting that file seems to hang xterm.

	The stock NetBSD 3.0 xterm (version "XFree86 4.3.99.903(184)")
	displays the file correctly.

	I originally found this problem on FreeBSD, where this issue now
	has pr number ports/118196.

>How-To-Repeat:
	See above.
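
	A smaller test case may help narrow this down. The sketch below is
	not part of the original report and has not been run against
	xterm 229; it assumes a printf(1) that accepts octal escapes. It
	prints "e" followed by U+0301 COMBINING ACUTE ACCENT (UTF-8 bytes
	0xCC 0x81) and a "|", then a plain "e|" for reference. The function
	name combining_test is made up for illustration.

```shell
# Hypothetical minimal test, not taken from the original report.
combining_test() {
    # 'e' + U+0301 COMBINING ACUTE ACCENT (octal 314 201 = 0xCC 0x81) + '|'
    printf 'e\314\201|\n'
    # Reference line with no combining character
    printf 'e|\n'
}

combining_test
```

	If combining characters are handled correctly, the two "|"
	characters should appear in the same column.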
>Fix:
	By accident, I found a workaround, involving luit(1) and a
	misspelling of the locale name. Apparently "utf-8" instead of
	"UTF-8" causes luit to do an almost-identity mapping which
	somehow avoids the problem:

	    LC_CTYPE=en_US.utf-8 xterm -lc -class UXTerm

-Olaf.
-- 
___ Olaf 'Rhialto' Seibert      -- You author it, and I'll reader it.
\X/ rhialto/at/xs4all.nl        -- Cetero censeo "authored" delendum esse.

>Unformatted: