Re: port-i386/53364: System crashes soon after X server is started with viadrm driver


With LOCKDEBUG enabled it crashed in a different place:

from dmesg:
panic: mutex_vector_enter,517: uninitialized lock (lock=0xc375edc4,

0xc011dfce in maybe_dump (howto=260) at ../../../../arch/i386/i386/machdep.c:789
789     ../../../../arch/i386/i386/machdep.c: No such file or directory.
#0  0xc011dfce in maybe_dump (howto=260) at
#1  cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at
#2  0xc088b342 in vpanic (fmt=fmt@entry=0xc0c77808 "%s,%zu:
uninitialized lock (lock=%p, from=%08x)",
    ap=ap@entry=0xdb74bbcc "\350\327\274\300\005\002") at
#3  0xc088b3cc in panic (fmt=fmt@entry=0xc0c77808 "%s,%zu:
uninitialized lock (lock=%p, from=%08x)") at
#4  0xc0884759 in lockdebug_lookup (where=3228707374, lock=0xc375edc4,
line=517, func=0xc0bcd7e8 <__func__.6248> "mutex_vector_enter")
    at ../../../../kern/subr_lockdebug.c:201
#5  lockdebug_wantlock (func=0xc0bcd7e8 <__func__.6248>
"mutex_vector_enter", line=517, lock=0xc375edc4, where=3228707374,
    at ../../../../kern/subr_lockdebug.c:438
#6  0xc0851d9c in mutex_vector_enter (mtx=mtx@entry=0xc375edc4) at
#7  0xc0722a2e in via_dmablit_grab_slot (engine=1, blitq=0xc375ed78)
at ../../../../external/bsd/drm/dist/bsd-core/via_dmablit.c:670
#8  via_dmablit (xfer=0xdb74beac, dev=0xc36a4400) at
#9  via_dma_blit (dev=dev@entry=0xc36a4400,
data=data@entry=0xdb74beac, file_priv=file_priv@entry=0xc3efe0d4)
    at ../../../../external/bsd/drm/dist/bsd-core/via_dmablit.c:791
#10 0xc071c46f in drm_ioctl (kdev=46080, cmd=3223872590,
data=0xdb74beac, flags=67, p=0xc38a8000)
    at ../../../../external/bsd/drm/dist/bsd-core/drm_drv.c:1059
#11 0xc087c17f in cdev_ioctl (dev=46080, cmd=3223872590,
data=0xdb74beac, flag=67, l=0xc38a8000) at
#12 0xc08e5fa6 in spec_ioctl (v=0xdb74bd9c) at
#13 0xc08de5fc in VOP_IOCTL (vp=vp@entry=0xc3de0e38,
command=command@entry=3223872590, data=data@entry=0xdb74beac,
fflag=67, cred=0xc461c480)
    at ../../../../kern/vnode_if.c:610
#14 0xc08d7d4e in vn_ioctl (fp=0xc4612400, com=3223872590,
data=0xdb74beac) at ../../../../kern/vfs_vnops.c:768
#15 0xc0894614 in sys_ioctl (l=0xc38a8000, uap=0xdb74bf68,
retval=0xdb74bf60) at ../../../../kern/sys_generic.c:671
#16 0xc0147225 in sy_call (rval=0xdb74bf60, uap=0xdb74bf68,
l=0xc38a8000, sy=0xc0e2a358 <sysent+1080>) at
#17 sy_invoke (code=54, rval=0xdb74bf60, uap=0xdb74bf68, l=0xc38a8000,
sy=0xc0e2a358 <sysent+1080>) at ../../../../sys/syscallvar.h:94
#18 syscall (frame=0xdb74bfa8) at ../../../../arch/x86/x86/syscall.c:144
#19 0xc010063d in Xsyscall ()
#20 0xdb74bfa8 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
784     in ../../../../arch/i386/i386/machdep.c
eax            <unavailable>
ecx            <unavailable>
edx            <unavailable>
ebx            0x104    260
esp            0xdb74bb74       0xdb74bb74
ebp            0xdb74bb8c       0xdb74bb8c
esi            0x8      8
edi            0xdb74bbcc       -613106740
eip            0xc011dfce       0xc011dfce <cpu_reboot+370>
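
A side note on frames #6 and #7 above: the lock address passed to mutex_vector_enter (0xc375edc4) sits a fixed distance above the blitq pointer (0xc375ed78), and the same distance shows up in the earlier trace quoted below (0xc370edc4 vs 0xc370ed78). That suggests the mutex is a member of the blit queue structure itself. The structure layout is not shown in this report, so this is an inference from the addresses only. The helper name below is made up for illustration; it is a rough sketch of the arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the NetBSD tree): byte distance from the
 * start of an enclosing object to one of its members, computed from two
 * raw 32-bit kernel addresses as printed by gdb.
 */
uint32_t
sketch_member_offset(uint32_t member_addr, uint32_t base_addr)
{
	/* The member is assumed to live inside the object, above its base. */
	assert(member_addr >= base_addr);
	return member_addr - base_addr;
}
```

Feeding it the lock/blitq pair from this trace and the pair from the earlier trace gives 0x4c both times. If the mutex really is embedded in the blit queue, "uninitialized lock" would be consistent with that member never having been passed to mutex_init(), but I have not confirmed this in the driver source.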

P.S. At this point the issue looks likely to be a duplicate of
kern/48434. For some reason Xorg now starts properly, but it fails on a
switch to a VT or during shutdown (which I guess tries to do the same thing).

On Mon, Jun 18, 2018 at 5:21 AM, matthew green wrote:
>> #7  0xc085141b in mutex_oncpu (owner=4294967280) at
>> ../../../../kern/kern_mutex.c:551
> i see why the fault occurs:
> here owner = 0xfffffff0.  that seems bogus because it should
> be a pointer.  and l_cpu is offset 12 in struct lwp:
>     413 mutex_oncpu(uintptr_t owner)
>     [ ... ]
>     416         lwp_t *l;
>     [ ... ]
>     428         l = (lwp_t *)MUTEX_OWNER(owner);
>     429         ci = l->l_cpu;
> so that explains the 0xfffffffc fault address.
> the real question is why is this invalid.  it should be valid.
>> #8  mutex_vector_enter (mtx=0xc370edc4) at ../../../../kern/kern_mutex.c:560
>> #9  0xc0722bee in via_dmablit_grab_slot (engine=1, blitq=0xc370ed78)
>> at ../../../../external/bsd/drm/dist/bsd-core/via_dmablit.c:670
>> #10 via_dmablit (xfer=0xdc161eac, dev=0xc3652400) at
>> ../../../../external/bsd/drm/dist/bsd-core/via_dmablit.c:723
> [ ... ]
>> (gdb) print blitq->dev
>> $26 = (struct drm_device *) 0xc3652400
>> (gdb) print blitq->cur_blit_handle
>> $27 = 0
>> (gdb) print blitq->done_blit_handle
> BTW, you could probably use "print *blitq" here, to view the whole
> structure.  if you "set print pretty" first, it will be readable ;)
>> Is there anything more I can provide? Thank you.
> at this point, i suspect that we have a locking issue and the
> first step here is normally to build a kernel with LOCKDEBUG
> enabled and see what happens.  it hopefully will crash earlier
> and with more information about what is wrong.
> thanks!
> .mrg.
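
For reference, the arithmetic behind the 0xfffffffc fault address in the quoted analysis can be written out as below. This is only a sketch: the mask value is my assumption about what MUTEX_OWNER() does to the owner word (its definition is not quoted in this thread), the names are made up, and the offset 12 for l_cpu is taken from the message above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed stand-in for the mask applied by MUTEX_OWNER(): clear the low
 * four flag bits of the owner word. The real definition is not quoted
 * in this thread.
 */
#define SKETCH_MUTEX_THREAD ((uint32_t)-16)

/*
 * Address that "ci = l->l_cpu" would read on a 32-bit kernel, given the
 * raw owner word and the offset of l_cpu inside struct lwp.
 */
uint32_t
sketch_fault_address(uint32_t owner, uint32_t l_cpu_offset)
{
	/* Sanity check on the assumed mask value. */
	assert(SKETCH_MUTEX_THREAD == 0xfffffff0u);

	uint32_t l = owner & SKETCH_MUTEX_THREAD;  /* l = (lwp_t *)MUTEX_OWNER(owner) */
	return l + l_cpu_offset;                   /* address of l->l_cpu */
}
```

With owner = 4294967280 (0xfffffff0) from frame #7 of the earlier trace and an offset of 12, this gives 0xfffffffc, which matches the fault address mentioned above.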

