Current-Users archive


Re: uvm/busy page deadlock in current (related to loading Raspberry Pi 3B+ Wi-Fi firmware, but more of a timing bug with the VM system)



Thanks very much, Chuck -- this patch fixed my problem.

I noticed you removed a couple of KASSERTs -- shouldn't those cases be EVEN MORE true now than they were before?  Given what I debugged, I'm wondering if the asserts would help make sure future code doesn't end up trying to do something similar...

Rob





> On Feb 7, 2020, at 4:31 PM, Chuck Silvers <chuq%chuq.com@localhost> wrote:
> 
> On Thu, Feb 06, 2020 at 04:31:47PM -0800, Rob Newberry wrote:
>> Hi.
>> 
>> I spent last weekend -- and a few days this week -- tracking down a problem that exists in current.
>> I found a workaround, but I don't know what the "proper" fix is.
>> Digging through the VM layer and debugging with printfs was slow --
>> and it's a boot-time issue, so I had to swap a lot of SD cards back and forth :-).
>> Hopefully someone here is better at this than me.
>> 
>> 
>> [analysis...]
> 
> good job working your way through all that, this code is pretty complicated.
> 
> 
>> 3) Start "aiodone_queue" earlier in the sequence.  I don't have a rich enough understanding of
>> this part of the kernel and user land startup process to know how hard this is, or how hacky it is.
> 
> this is the right way to fix it.  please try the attached patch.
> 
> 
>> BTW, I'm ASSUMING that if uvm.aiodone_queue were present, the asynchronous completion would somehow
>> handle marking the pages as "not busy".  But I actually never debugged that code path,
>> so I can't be sure that's helpful.
> 
> right, the "aiodone_queue" workqueue will call uvm_aiodone_worker() on the buffer,
> and bp->b_iodone will have been set to uvm_aio_aiodone, which unbusies the pages
> among other things.
> 
> -Chuck
> <diff.aiodone_queue.1>
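
For readers following the thread, the completion path Chuck describes can be modeled in a few lines of user-space C. This is only a rough sketch, not the NetBSD code: every toy_* name is hypothetical, the "workqueue" is a plain linked list, and the only detail taken from the message above is that a worker invokes the buffer's b_iodone callback, which in turn unbusies the pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel structures; names and fields are illustrative. */
struct toy_page {
    bool busy;                          /* models a page being marked busy */
};

struct toy_buf {
    void (*b_iodone)(struct toy_buf *); /* completion callback */
    struct toy_page *pages;             /* pages covered by this I/O */
    size_t npages;
    struct toy_buf *next;               /* link in the pending queue */
};

/* Plays the role attributed to uvm_aio_aiodone: unbusy the pages. */
static void toy_aio_aiodone(struct toy_buf *bp)
{
    for (size_t i = 0; i < bp->npages; i++)
        bp->pages[i].busy = false;
}

/* Plays the role attributed to uvm_aiodone_worker: run the callback. */
static void toy_aiodone_worker(struct toy_buf *bp)
{
    assert(bp->b_iodone != NULL);
    bp->b_iodone(bp);
}

/* A minimal "workqueue": completed buffers wait here until drained. */
struct toy_workqueue {
    struct toy_buf *head;
};

static void toy_enqueue(struct toy_workqueue *wq, struct toy_buf *bp)
{
    bp->next = wq->head;
    wq->head = bp;
}

/* Process every pending completion; returns how many were handled. */
static size_t toy_drain(struct toy_workqueue *wq)
{
    size_t n = 0;

    while (wq->head != NULL) {
        struct toy_buf *bp = wq->head;

        wq->head = bp->next;
        toy_aiodone_worker(bp);
        n++;
    }
    return n;
}
```

In this model the pages stay busy for as long as nothing calls toy_drain(), which is the toy analogue of async I/O completing before the queue that services it is running.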


