NetBSD-Bugs archive


re: kern/49710: i386 radeondrmkms panic when starting Xorg

i'm curious what the current status of this PR is.  on my similarly
behaving systems, it fails slightly less poorly now, while still
failing pretty badly.

>  The 'xterm' instance spawned a new shell process.  Sending a SIGHUP to that
>  shell process caused it and the 'xterm' process to exit as expected.  As it
>  was the last client, the Xserver should have reset itself, but it instead
>  continues to alternate between RUN state and waiting in "radfence".

this sounds like the same problem i see on R200.  so far it seems
that we're hitting a case where the hardware is hung or isn't
performing what we expect, and never gives any indication that it
is done.
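
to make the failure mode concrete, here is a minimal sketch of a
bounded fence wait.  everything in it (fake_ring, fake_ring_tick,
fake_fence_wait) is a hypothetical stand-in that i made up for
illustration; it is not the radeon code, and the real driver sleeps
rather than counting ticks.  it only shows the shape of the problem:
if the hardware never advances its sequence number, the only way out
of the wait is a timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a GPU ring: the hardware is expected to
 * advance last_seq as it completes submitted work. */
struct fake_ring {
	uint32_t last_seq;	/* last sequence the "hardware" signalled */
	bool	 hung;		/* when true, last_seq never advances */
};

/* Simulate one unit of time passing on the (assumed) hardware. */
static void
fake_ring_tick(struct fake_ring *r)
{
	if (!r->hung)
		r->last_seq++;
}

/*
 * Wait until the ring has signalled target_seq, giving up after
 * max_ticks.  Returns 0 if the fence signalled, -1 if it never did
 * (which is what a hung ring looks like from the waiter's side).
 */
static int
fake_fence_wait(struct fake_ring *r, uint32_t target_seq, unsigned max_ticks)
{
	assert(max_ticks > 0);

	for (unsigned i = 0; i < max_ticks; i++) {
		/* Signed difference copes with sequence wraparound. */
		if ((int32_t)(r->last_seq - target_seq) >= 0)
			return 0;
		fake_ring_tick(r);
	}
	return ((int32_t)(r->last_seq - target_seq) >= 0) ? 0 : -1;
}
```

with hung set, fake_fence_wait() can only ever return -1; without a
bound like max_ticks the waiter would sit there forever, which looks
a lot like being parked in "radfence".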

>  [   420.146] [mi] EQ overflowing. The server is probably stuck in an infinite loop.
>  Sending SIGKILL to the X server renders the machine mostly unresponsive.
>  The terminal driver still echoes/translates characters, but that's about
>  all.

in my testing, this seems to be related to the fact that some IO
occurs during signal handling, but otherwise we're spinning in
userland, polling the kernel to see whether an operation has
completed, and each time we notice it hasn't, we can do stuff like
update the mouse pointer, or handle other async IO.

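
the userland side of that loop can be sketched roughly as follows.
this is an illustration under assumptions, not the X server's or the
ddx driver's actual code: poll_until_done, poll_stats, never_done and
done_after are names invented here, and the "other work" is just a
counter standing in for pointer updates and async IO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Counters so the behaviour of the loop can be observed. */
struct poll_stats {
	unsigned polls;		/* times we asked "is it done yet?" */
	unsigned other_work;	/* times we serviced something else */
};

/* Hypothetical completion check; stands in for asking the kernel. */
typedef bool (*done_fn)(void *ctx);

/*
 * Poll is_done() up to max_polls times.  Each time the operation is
 * not finished, do a unit of "other work" and try again.  Returns
 * true if the operation completed, false if we gave up.
 */
static bool
poll_until_done(done_fn is_done, void *ctx, unsigned max_polls,
    struct poll_stats *st)
{
	assert(is_done != NULL && st != NULL);

	for (unsigned i = 0; i < max_polls; i++) {
		st->polls++;
		if (is_done(ctx))
			return true;
		st->other_work++;	/* e.g. move the pointer */
	}
	return false;
}

/* Models hung hardware: the operation never completes. */
static bool
never_done(void *ctx)
{
	(void)ctx;
	return false;
}

/* Models healthy hardware: completes once the countdown reaches 0. */
static bool
done_after(void *ctx)
{
	unsigned *left = ctx;

	if (*left == 0)
		return true;
	(*left)--;
	return false;
}
```

with never_done the loop keeps doing "other work" on every pass, so
the process stays partly responsive while making no progress; remove
the max_polls bound and it spins indefinitely.
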
>  I was not able to reproduce a similar backtrace showing "mutex_spin_enter()".
>  Instead, this time I got the following:
>  [BREAK sent]
[ ... ]
>  radeon_fence_wait_seq(c2fe0870,c2fe0000,5,0,5,0,0,0,0,0) at netbsd:radeon_fence_wait_seq+0x125
>  radeon_fence_wait(c3b22abc,1,1,c9c210,1,c2fe0701,c3b22abc,c38d0484,0,c38d040c) at netbsd:radeon_fence_wait+0x6b
>  ttm_bo_wait(c38d043c,1,1,0,c38d043c,c2fe0000,c38d0598,c38d040c,db4dce44,c061c29b) at netbsd:ttm_bo_wait+0x8a
>  radeon_bo_wait(c38d040c,0,0,0,0,0,0,c09385d0,c3c9c1f8,c374f70c) at netbsd:radeon_bo_wait+0xac
[ ... ]
>  >  this seems like a deadlock, and the above will show info about the
>  >  lock being waited on.
>  There are similarities to the previous backtrace, but not the specific
>  item of interest.  What should I consider of interest in this backtrace,
>  or any like it in future trials?

actually, i've pretty much convinced myself this problem is the
same basic problem i see on my R200 cards/systems (it's
happening similarly for a PCI 9250 card, and a laptop AGP 9000-M).

it reminds me of my failed attempts to port drm a long long time
ago where the CP busy flag would never clear, and the system
would basically hang.  i'd seen this looping failure in both the
drm kernel code, and the ati X ddx driver.  the latter was less
annoying to determine :)

these went away with the original drm code in sys/dev/drm
(committed by drochner@.)

