NetBSD-Bugs archive


Re: kern/50448: Xorg on 82G33/G31 Express Integrated Graphics Controller hangs



   Date: Sun, 21 Feb 2016 22:55:01 +0000 (UTC)
   From: Greg Oster <oster%netbsd.org@localhost>

   Anything I can do to help with this?  I just tried to login to my
   desktop after a reboot, and it 'hung' again...
   Later...

Hi, Greg!  Sorry to have been so quiet about this -- my time for
focussed debugging has been pretty limited.

One immediate workaround you can try, just to make the machine useful
to you again right now, is to disable the new DRM/KMS code and
re-enable the old DRM(/UMS) code.  Something like this in your kernel
config:

i915drm* at drm?

no i915drmkms*
no intelfb*
no radeon*
no radeondrmkmsfb*
no nouveau*
no nouveaufb*

This configuration hasn't undergone much testing, but it is fairly
likely that most of the old code still works.


I have two general hypotheses about what's going on here:


1. The i915drmkms code has allocated too many pages to graphics
buffers, and we haven't hooked up the mechanism by which the page
daemon can tell i915drmkms to please relinquish a few pages that
aren't terribly important, if it doesn't mind.  Hooking up this
mechanism shouldn't be too hard -- just need to invent a shim for
Linux `shrinkers' and teach the uvm page daemon to invoke it.
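
To make hypothesis 1 a little more concrete, here is a rough userland
sketch of what such a shim could look like.  Every name in it (struct
shrinker as laid out here, shrinker_register, shrinkers_reclaim, the
toy cache) is hypothetical and made up for illustration; it is not the
Linux shrinker API and not existing NetBSD code, only the general
shape: drivers register a callback, and whoever is short on memory
walks the list asking for pages back.

```c
#include <stddef.h>

/* Hypothetical shim; none of these names are real NetBSD or Linux API. */
struct shrinker {
	/* Try to free up to nr pages; return how many were freed. */
	unsigned long (*scan)(void *cookie, unsigned long nr);
	void *cookie;			/* driver-private state */
	struct shrinker *next;		/* singly linked registry */
};

static struct shrinker *shrinkers;	/* head of the registry */

/* A driver would call this once at attach time. */
static void
shrinker_register(struct shrinker *s)
{
	s->next = shrinkers;
	shrinkers = s;
}

/* What a page daemon might call when free memory runs low. */
static unsigned long
shrinkers_reclaim(unsigned long want)
{
	unsigned long freed = 0;
	struct shrinker *s;

	for (s = shrinkers; s != NULL && freed < want; s = s->next)
		freed += s->scan(s->cookie, want - freed);
	return freed;
}

/* Toy stand-in for a driver holding some pages it could give up. */
struct toy_cache {
	unsigned long purgeable;
};

static unsigned long
toy_scan(void *cookie, unsigned long nr)
{
	struct toy_cache *c = cookie;
	unsigned long n = nr < c->purgeable ? nr : c->purgeable;

	c->purgeable -= n;
	return n;
}
```

A real version would also have to decide which locks the callback may
take, which is where hypothesis 2 comes back in.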


2. The code path you quoted involves two locks and a wait that only
releases one of them.  The two locks are the DRM/KMS dev->struct_mutex
and the GEM/UVM object's vmobjlock:

Many paths into the DRM/KMS code require exclusive access to the whole
driver state, serialized by dev->struct_mutex.  I haven't checked, but
it wouldn't surprise me if this one holds it too.

When we do i915_gem_object_get_pages, which calls uvm_obj_wirepages ->
uao_get, if there are no free pages, then we have to wait -- and
although uao_get drops the vmobjlock in order to uvm_wait, it does not
drop dev->struct_mutex.

If it turns out that this code path is serialized by dev->struct_mutex
(which it may not be), then we'd have to find some way to disentangle
it, or some hack to persuade uao_get to drop dev->struct_mutex.
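
The locking pattern in hypothesis 2 can be modelled in userland with
pthreads.  This is only a sketch under assumptions: the two mutexes
merely stand in for dev->struct_mutex and the vmobjlock, the condition
variable stands in for uvm_wait, and the function names are invented.
It shows that a wait which releases the inner lock leaves the outer
one held for the whole sleep, so anything else that needs the outer
lock cannot make progress in the meantime.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Stand-ins only; these are not the kernel objects themselves. */
static pthread_mutex_t struct_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vmobjlock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pages_freed = PTHREAD_COND_INITIALIZER;
static int free_pages = 0;
static bool waiting = false;

/* Models the get_pages path: take both locks, then wait for a page. */
static void *
get_pages(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&struct_mutex);
	pthread_mutex_lock(&vmobjlock);
	waiting = true;
	while (free_pages == 0)		/* drops vmobjlock only */
		pthread_cond_wait(&pages_freed, &vmobjlock);
	free_pages--;
	waiting = false;
	pthread_mutex_unlock(&vmobjlock);
	pthread_mutex_unlock(&struct_mutex);
	return NULL;
}

/* Returns 1 if struct_mutex is still held while the waiter sleeps. */
static int
outer_lock_held_during_wait(void)
{
	pthread_t t;
	int rc, busy;

	rc = pthread_create(&t, NULL, get_pages, NULL);
	assert(rc == 0);
	(void)rc;

	/* Poll until the waiter is asleep; exit the loop holding vmobjlock. */
	for (;;) {
		pthread_mutex_lock(&vmobjlock);
		if (waiting)
			break;
		pthread_mutex_unlock(&vmobjlock);
		sched_yield();
	}

	/* We got vmobjlock, so the waiter released it; now probe the outer lock. */
	busy = (pthread_mutex_trylock(&struct_mutex) == EBUSY);
	if (!busy)
		pthread_mutex_unlock(&struct_mutex);

	/* Hand over a page so the waiter can finish and drop both locks. */
	free_pages = 1;
	pthread_cond_signal(&pages_freed);
	pthread_mutex_unlock(&vmobjlock);
	pthread_join(t, NULL);
	return busy;
}
```

In this model outer_lock_held_during_wait() reports that the outer
lock is busy.  Whether the real code path actually holds
dev->struct_mutex at that point is exactly the part that still needs
checking.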


These hints might be enough for someone else to do some analysis or
experiments.  If you'd like to take a look, and want any more
hand-holding, I'd be happy to give more hints.  I won't have time to
prepare specific patches to test for a few days at best, but feel free
to ping me in a week if I haven't piped up again.

