Subject: xsrc/37054: WindowMaker triggers a crash in Xlib on sparc64
To: None <xsrc-manager@netbsd.org, gnats-admin@netbsd.org,>
From: Pierre Pronchery <khorben@defora.org>
List: netbsd-bugs
Date: 10/02/2007 11:10:00
>Number:         37054
>Category:       xsrc
>Synopsis:       WindowMaker triggers a crash in Xlib on sparc64
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    xsrc-manager
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Oct 02 11:10:00 +0000 2007
>Originator:     khorben@defora.org
>Release:        NetBSD 4.99.31
>Organization:

>Environment:
System: NetBSD exxh.defora.lan 4.99.31 NetBSD 4.99.31 (GENERIC) #0: Sun Sep 30 18:22:11 CEST 2007 khorben@exxh.defora.lan:/usr/obj/sys/arch/sparc64/compile/GENERIC sparc64
Architecture: sparc64
Machine: sparc64
>Description:

WindowMaker crashes at startup with the following information:
Core was generated by `wmaker'.
Program terminated with signal 10, Bus error.
#0  0x000000004246a16c in _XData32 () from /usr/X11R6/lib/libX11.so.6
(gdb) bt
#0  0x000000004246a16c in _XData32 () from /usr/X11R6/lib/libX11.so.6
#1  0x0000000042490b24 in XChangeProperty () from /usr/X11R6/lib/libX11.so.6
#2  0x00000000001582c0 in main ()

I think I found the offending code in Xlib:
$ vi /usr/xsrc/xfree/xc/lib/X11/XlibInt.c

where there are two implementations of _XData32():

3021 #ifdef LONG64
3022 int
3023 _XData32(
3024     Display *dpy,
3025     register long *data,
3026     unsigned len)
3027 {
3028     register int *buf;
3029     register long i;
3030 
3031     while (len) {
3032         buf = (int *)dpy->bufptr;
3033         i = dpy->bufmax - (char *)buf;
3034         if (!i) {
3035             _XFlush(dpy);
3036             continue;
3037         }
3038         if (len < i)
3039             i = len;
3040         dpy->bufptr = (char *)buf + i;
3041         len -= i;
3042         i >>= 2;
3043         while (--i >= 0)
3044             *buf++ = *data++;
3045     }
3046     return 0;
3047 }
3048 #endif /* LONG64 */

and:

3050 #ifdef WORD64
[...]
3151 void _XData32(
3152     Display *dpy,
3153     long *data,
3154     unsigned len)
3155 {
3156     char packbuffer[PACKBUFFERSIZE];
3157     unsigned nunits = PACKBUFFERSIZE >> 2;
3158 
3159     for (; len > PACKBUFFERSIZE; len -= PACKBUFFERSIZE, data += nunits) {
3160         doData32 (dpy, data, PACKBUFFERSIZE, packbuffer);
3161     }
3162     if (len) doData32 (dpy, data, len, packbuffer);
3163 }

I think the first one is used, because disassembling the current frame
gives:

(gdb) disas _XData32
Dump of assembler code for function _XData32:
0x000000004246a120 <_XData32+0>:        save  %sp, -192, %sp
0x000000004246a124 <_XData32+4>:        cmp  %i2, 0

which looks very much like the "while (len) {" on line 3031 (otherwise there
would probably be a subtraction first). This seems to be confirmed by looking
at where the code actually crashes in this routine:

0x000000004246a16c <_XData32+76>:       ldx  [ %g2 ], %g1
0x000000004246a170 <_XData32+80>:       inc  %g4
0x000000004246a174 <_XData32+84>:       add  %g2, 8, %g2
0x000000004246a178 <_XData32+88>:       cmp  %g5, %g4
0x000000004246a17c <_XData32+92>:       st  %g1, [ %g3 ]
0x000000004246a180 <_XData32+96>:       bne  %xcc, 0x4246a16c <_XData32+76>

which to me looks like lines 3043 and 3044:

3043         while (--i >= 0)
3044             *buf++ = *data++;

where buf points to an int and data to a long:

(gdb) print (long*)$g2
$5 = (long *) 0xffffffffffffc09c
(gdb) print (int*)$g1
$6 = (int *) 0x2ac01c

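where g2 is probably data and g1 holds *data. That value is later stored to
[%g3], which probably holds buf. The faulting instruction is the ldx at
_XData32+76, a 64-bit load that requires an 8-byte-aligned address. My guess
is that g2 is not aligned on a 64-bit boundary here (sizeof(long) is 8 on this
arch): 0xffffffffffffc09c is a multiple of 4 but not of 8, and that triggers
the bus error.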
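
To make the alignment argument concrete, here is a minimal sketch in C that
checks the address gdb printed for $g2 against 4-byte and 8-byte boundaries.
The helper name is_aligned() is hypothetical and only serves the illustration;
it is not part of Xlib.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from Xlib): return nonzero if addr is a
 * multiple of align. align must be a power of two.
 */
static int is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

For the value above, is_aligned(0xffffffffffffc09cULL, 4) is true and
is_aligned(0xffffffffffffc09cULL, 8) is false, since the low byte 0x9c leaves
a remainder of 4 when divided by 8. That is consistent with a pointer that is
valid for 32-bit loads but not for a 64-bit ldx.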

I suppose this is not the implementation that should have been compiled in. I
would be glad if someone could help me fix that.
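
For illustration only, the copy loop of lines 3043 and 3044 could be written
so that it tolerates a misaligned source pointer by reading each long through
memcpy(). This is a sketch under assumptions, not a proposed patch: the name
copy_longs_to_int32() is hypothetical, and it only avoids the alignment trap.
Whether the caller really passes an array of long is a separate question that
this report does not settle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from Xlib): copy count long values found at
 * src into 32-bit slots at dst. src may be misaligned, so each long is
 * fetched with memcpy() instead of a direct 64-bit load.
 */
static void copy_longs_to_int32(int32_t *dst, const void *src, size_t count)
{
    const unsigned char *p = src;
    size_t k;

    assert(dst != NULL && src != NULL);
    for (k = 0; k < count; k++) {
        long v;

        memcpy(&v, p + k * sizeof(long), sizeof(long));
        /* Narrow to 32 bits, as "*buf++ = *data++" does in the original. */
        dst[k] = (int32_t)v;
    }
}
```

Compilers typically turn the memcpy() into byte or word loads on
strict-alignment targets, so the cost is limited to the misaligned case.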

>How-To-Repeat:
	WindowMaker from pkgsrc exhibits the problem at launch time. It catches
	the error and offers to dump core.

	I compiled X from src this way:

	$ ./build.sh -O /usr/obj -T /usr/tools -U -u -x tools build install=/

	starting with an empty /usr/obj and an empty /usr/tools.
>Fix:
	If my analysis is correct, there is a problem either in this function or
	in the way X is compiled from xsrc.

>Unformatted:
 Snapshot from somewhere between 27/10/2007 and 29/10/2007 iirc