re: kern/49710: i386 radeondrmkms panic when starting Xorg

To: kern-bug-people%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost,jdbaker%mylinuxisp.com@localhost
Subject: re: kern/49710: i386 radeondrmkms panic when starting Xorg
From: "John D. Baker" <jdbaker%mylinuxisp.com@localhost>
Date: Tue, 10 Mar 2015 01:25:00 +0000 (UTC)

The following reply was made to PR kern/49710; it has been noted by GNATS.

From: "John D. Baker" <jdbaker%mylinuxisp.com@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: re: kern/49710: i386 radeondrmkms panic when starting Xorg
Date: Mon, 9 Mar 2015 20:20:36 -0500 (CDT)

 A little more detail on what happens without "NoAccel":

 On serial console, starting X server with:

 $ X -retro &

 The retro stipple pattern is not displayed.  The framebuffer console
 cursor is still visible in the upper left corner of the display.  The
 X mouse cursor is visible and tracks mouse movements correctly.  One
 can move the X cursor over the framebuffer cursor.  The X cursor appears
 above (Z axis) the framebuffer cursor and when moved off, the framebuffer
 cursor is unchanged.

 The X server sits idle in "select" until the first client connects to
 it.  Thereafter, it alternates between "RUN" and "radfence" when observed
 with 'top'.  The client window ('xterm' in my tests) never appears.
 Moving the cursor to the area where the window should be and typing 'exit'
 <return> has no effect (so it's not that the window is simply invisible).

 The 'xterm' instance spawned a new shell process.  Sending a SIGHUP to that
 shell process caused it and the 'xterm' process to exit as expected.  As it
 was the last client, the X server should have reset itself, but instead it
 continued to alternate between the RUN state and waiting in "radfence".

 Attaching 'ktruss' to the Xorg process reveals:

 # ktruss -i -p 114
    114      1 Xorg     SIGALRM caught handler=0x8190040 mask=0x0 code=0x0
    114      1 Xorg     setcontext(0xbfbfe3ac)      JUSTRETURN
    114      1 Xorg     emul(netbsd)
    114      1 Xorg     ioctl(0xb, _IOR('d',0x64,0x8), 0xbfbfe6fc) Err#4 EINTR
        "\^C\0\0\0\0\0\0\0"
    114      1 Xorg     SIGALRM caught handler=0x8190040 mask=0x0 code=0x0
    114      1 Xorg     setcontext(0xbfbfe3ac)      JUSTRETURN
    114      1 Xorg     ioctl(0xb, _IOR('d',0x64,0x8), 0xbfbfe6fc) Err#4 EINTR
        "\^C\0\0\0\0\0\0\0"
    114      1 Xorg     SIGALRM caught handler=0x8190040 mask=0x0 code=0x0
    114      1 Xorg     setcontext(0xbfbfe3ac)      JUSTRETURN
    114      1 Xorg     ioctl(0xb, _IOR('d',0x64,0x8), 0xbfbfe6fc) Err#4 EINTR
        "\^C\0\0\0\0\0\0\0"
 [...]

 The only additional information in the "Xorg.0.log" file is the line noted
 before:

 [   420.146] [mi] EQ overflowing. The server is probably stuck in an infinite loop.

 Sending SIGKILL to the X server renders the machine mostly unresponsive.
 The terminal driver still echoes/translates characters, but that's about
 all.

 On Sat, 7 Mar 2015, matthew green wrote:

 >  can you reproduce this?  if so, please also run "show lock c2fddb54"
 >  from ddb -- where the first argument to mutex_spin_enter is the
 >  argument to show lock.

 I was not able to reproduce a similar backtrace showing "mutex_spin_enter()".

 Instead, this time I got the following:

 [BREAK sent]
 fatal breakpoint trap in supervisor mode
 trap type 1 code 0 eip c02518c4 cs 8 eflags 200202 cr2 bbae74f8 ilevel 8 esp da782f6c
 curlwp 0xc38e4d40 pid 114 lid 1 lowest kstack 0xdb4da2c0
 Stopped in pid 114.1 (Xorg) at  netbsd:breakpoint+0x4:  popl    %ebp
 db{0}> bt
 breakpoint(c0aecac0,3f8,5,c0b41440,c0b41440,c046bb9e,c34b3608,c34b3580,c34e16b6,c34e2000) at netbsd:breakpoint+0x4
 comintr(c34b34c8,db4dcc08,0,0,0,0,0,0,0,0) at netbsd:comintr+0x5f5
 --- switch to interrupt stack ---
 Xintr_legacy4() at netbsd:Xintr_legacy4+0xc3
 --- interrupt ---
 sigispending(c38e4d40,0,c38e4d40,0,c0446847,1,0,c2fe0b54,c38e4dac,c2fe0b50) at netbsd:sigispending+0xc
 sleepq_block(32,1,c09bb183,c0b16030,c0446847,c2fbfb00,c2fc1d40,32,40a14,c2fe0000) at netbsd:sleepq_block+0x106
 cv_timedwait_sig(c2fe0b54,c2fe0b50,32,bffb,0,0,c0000000,1003fff,c2fe08c0,db4dcd70) at netbsd:cv_timedwait_sig+0x103
 radeon_fence_wait_seq(c2fe0870,c2fe0000,5,0,5,0,0,0,0,0) at netbsd:radeon_fence_wait_seq+0x125
 radeon_fence_wait(c3b22abc,1,1,c9c210,1,c2fe0701,c3b22abc,c38d0484,0,c38d040c) at netbsd:radeon_fence_wait+0x6b
 ttm_bo_wait(c38d043c,1,1,0,c38d043c,c2fe0000,c38d0598,c38d040c,db4dce44,c061c29b) at netbsd:ttm_bo_wait+0x8a
 radeon_bo_wait(c38d040c,0,0,0,0,0,0,c09385d0,c3c9c1f8,c374f70c) at netbsd:radeon_bo_wait+0xac
 radeon_gem_wait_idle_ioctl(c374f70c,db4dceb0,c3c9c1f8,0,0,0,db4dcf68,c3fa4400,0,db4dcf3c) at netbsd:radeon_gem_wait_idle_ioctl+0x4b
 drm_ioctl(c3fa4400,80086464,db4dceb0,0,0,0,0,0,8,0) at netbsd:drm_ioctl+0x135
 sys_ioctl(c38e4d40,db4dcf68,db4dcf60,db4dcf60,fffffffe,db4dcf68,c0b13898,0,0,b) at netbsd:sys_ioctl+0x1ae
 syscall() at netbsd:syscall+0x16f
 --- syscall (number 54) ---
 bb771f77:
 db{0}>
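
 As a cross-check, the second argument of the drm_ioctl frame (80086464)
 appears to be the ioctl command word, and it can be decoded by hand.  The
 following is a minimal sketch, assuming the usual BSD encoding from
 <sys/ioccom.h>.  The constants are copied in for illustration rather than
 read from the running system, and decode_ioctl is a hypothetical helper
 name.  Under those assumptions it yields group 'd', command 0x64 and an
 8-byte argument, the same triple that ktruss prints for the ioctl that
 keeps returning EINTR.

```python
# Field layout assumed from the BSD-style <sys/ioccom.h>; the constants
# below are copied for illustration, not queried from the system.
IOCPARM_MASK = 0x1FFF    # argument length occupies 13 bits
IOCPARM_SHIFT = 16
IOCGROUP_SHIFT = 8
IOC_VOID = 0x20000000    # no argument
IOC_OUT = 0x40000000     # argument copied out to userland
IOC_IN = 0x80000000      # argument copied in from userland


def decode_ioctl(cmd):
    """Split a 32-bit ioctl command word into its encoded fields."""
    names = [(IOC_IN, "IOC_IN"), (IOC_OUT, "IOC_OUT"), (IOC_VOID, "IOC_VOID")]
    return {
        "direction": [name for bit, name in names if cmd & bit],
        "size": (cmd >> IOCPARM_SHIFT) & IOCPARM_MASK,
        "group": chr((cmd >> IOCGROUP_SHIFT) & 0xFF),
        "number": cmd & 0xFF,
    }


# Second argument of the drm_ioctl frame in the backtrace.
fields = decode_ioctl(0x80086464)
print(fields)
```

 The direction bits decode as IOC_IN under this layout, whereas ktruss
 labels the call _IOR; how ktruss derives that label is not checked here.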

 >  this seems like a deadlock, and the above will show info about the
 >  lock being waited on.

 There are similarities to the previous backtrace, but not the specific
 item of interest.  What should I consider of interest in this backtrace,
 or any like it in future trials?

 -- 
 |/"\ John D. Baker, KN5UKS               NetBSD     Darwin/MacOS X
 |\ / jdbaker[snail]mylinuxisp[flyspeck]com    OpenBSD            FreeBSD
 | X  No HTML/proprietary data in email.   BSD just sits there and works!
 |/ \ GPGkeyID:  D703 4A7E 479F 63F8 D3F4  BD99 9572 8F23 E4AD 1645
