Subject: kern/29824: Xserver triggers threading problem
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <macallan18@earthlink.net>
List: netbsd-bugs
Date: 03/29/2005 12:35:00
>Number:         29824
>Category:       kern
>Synopsis:       Xserver triggers threading problem
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 29 12:35:00 +0000 2005
>Originator:     Michael Lorenz
>Release:        -current with -current userland
>Organization:
>Environment:
NetBSD Inishowen 3.99.2 NetBSD 3.99.2 (INISHOWEN) #325: Mon Mar 28 08:38:50 EST 2005  root@Inishowen:/data/src/sys/arch/sparc64/compile/INISHOWEN sparc64
>Description:
I'm running XFree86 4.5 compiled from xsrc on a Sun Ultra 10. After a while the Xserver appears to lock up, usually when there's some I/O or paging going on. ssh and so on still work, so I can still log in and see what's going on: XFree86 just sits there consuming all the CPU cycles it can get. Loading it into gdb gives this:
(gdb) bt
#0  0x0000000040a132a4 in pthread__lock_ras_end ()
   from /usr/lib/libpthread.so.0
#1  0x0000000000000008 in ?? ()
#2  0x0000000040a134e8 in pthread_spinlock () from /usr/lib/libpthread.so.0
#3  0x0000000040a0bb38 in pthread_sigmask () from /usr/lib/libpthread.so.0
#4  0x000000000016c598 in xf86BlockSIGIO ()
#5  0x000000000014d650 in xf86SigioReadInput ()
#6  0x000000000016c248 in xf86SIGIO ()
#7  <signal handler called>
#8  0xffffaba100000000 in ?? ()
#9  0x0000000000000008 in ?? ()
#10 0x000000000016c5dc in xf86UnblockSIGIO ()
#11 0x000000000016c248 in xf86SIGIO ()
#12 <signal handler called>
#13 0xffffb1a100000000 in ?? ()
#14 0x0000000001347bb8 in ?? ()
#15 0xffffffffffffb16d in ?? ()
#16 0x000000000085915c in ?? ()
#17 0x00000000008598cc in ?? ()
#18 0x0000000000859c14 in ?? ()
#19 0x0000000000859dc8 in ?? ()
#20 0x00000000007a038c in ?? ()
#21 0x00000000007a20d8 in ?? ()
#22 0x00000000001ffd4c in miSpriteCopyArea ()
#23 0x000000000018a4a0 in ProcCopyArea ()
#24 0x0000000000187f78 in Dispatch ()
#25 0x000000000019b9f8 in main ()
#26 0x000000000012cb38 in ___start ()
... or something very similar. It's always in pthread_spinlock. The process can be killed and restarted (only to lock up again pretty soon), but apparently some kernel internals got screwed up: the machine doesn't reboot anymore. It shuts down the network and that's it.

Side note: something similar happens on macppc, but apparently without affecting the kernel, since my S900 (also running 3.99.2) still reboots after killing a deadlocked Xserver.
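
One possible reading of the backtrace above (not confirmed against the libpthread or XFree86 sources) is that SIGIO was delivered while the interrupted code still held libpthread's internal spinlock, so the handler's own pthread_sigmask call spins on a lock that the frame underneath it can never release. The C sketch below illustrates that general pattern only. demo_lock, locked_section_would_block() and demo_reentrant_lock() are hypothetical names made up for illustration, SIGUSR1 stands in for SIGIO, and a try-lock replaces the real spin so that the program terminates.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a non-reentrant spinlock inside a library. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Set by the handler when it finds the lock already held. */
static volatile sig_atomic_t handler_saw_lock_held = 0;

/*
 * Hypothetical stand-in for a library call that takes the spinlock.
 * Returns 1 if the lock was already held, i.e. a real spinlock would
 * spin here; returns 0 if the lock was free (it is taken and released).
 */
static int locked_section_would_block(void)
{
    if (atomic_flag_test_and_set(&demo_lock))
        return 1;               /* held by someone else: would spin */
    atomic_flag_clear(&demo_lock);
    return 0;
}

/* Signal handler that enters the locked section, as the SIGIO handler
 * in the backtrace appears to do via pthread_sigmask. */
static void on_signal(int sig)
{
    (void)sig;
    if (locked_section_would_block())
        handler_saw_lock_held = 1;
}

/*
 * Deliver a signal while the interrupted code holds the lock.
 * Returns 1 if the handler found the lock held, 0 if not, -1 on error.
 */
int demo_reentrant_lock(void)
{
    struct sigaction sa;

    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    handler_saw_lock_held = 0;

    /* The "interrupted" code takes the lock ... */
    (void)atomic_flag_test_and_set(&demo_lock);
    /* ... and the signal arrives before it is released. In a
     * single-threaded process the handler runs before raise() returns. */
    raise(SIGUSR1);
    atomic_flag_clear(&demo_lock);

    return handler_saw_lock_held;
}
```

Calling demo_reentrant_lock() from a small main() and printing the result should show that the handler found the lock held. A handler that spun on the lock instead of giving up would never return, which would look like a process burning CPU inside pthread_spinlock.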
>How-To-Repeat:
Compile XFree86 4.5 from xsrc (I used the onboard Rage 3D Pro of a U10, but I don't think that matters here), run some window manager and do some work. After a while the Xserver will lock up.
>Fix:
Stick to XFree86 4.4 for now; at least it doesn't have /this/ problem.