Re: uvm/busy page deadlock in current (related to loading Raspberry Pi 3B+ Wi-Fi firmware, but more of a timing bug with the VM system)

Thanks very much, Chuck -- this patch fixed my problem.

I noticed you removed a couple of KASSERTs -- shouldn't those cases be EVEN MORE true now than they were before?  Given what I debugged, I'm wondering if the asserts would help make sure future code doesn't end up trying to do something similar...


> On Feb 7, 2020, at 4:31 PM, Chuck Silvers wrote:
> On Thu, Feb 06, 2020 at 04:31:47PM -0800, Rob Newberry wrote:
>> Hi.
>> I spent last weekend -- and a few days this week -- tracking down a problem that exists in current.
>> I found a workaround, but I don't know what the "proper" fix is.
>> Digging through the VM layer and debugging with printfs was slow --
>> and it's a boot-time issue, so I had to swap a lot of SD cards back and forth :-).
>> Hopefully someone here is better at this than me.
>> [analysis...]
> good job working your way through all that, this code is pretty complicated.
>> 3) Start "aiodone_queue" earlier in the sequence.  I don't have a rich enough understanding of
>> this part of the kernel and user land startup process to know how hard this is, or how hacky it is.
> this is the right way to fix it.  please try the attached patch.
>> BTW, I'm ASSUMING that if uvm.aiodone_queue were present, the asynchronous completion would somehow
>> handle marking the pages as "not busy".  But I actually never debugged that code path,
>> so I can't be sure that's helpful.
> right, the "aiodone_queue" workqueue will call uvm_aiodone_worker() on the buffer,
> and bp->b_iodone will have been set to uvm_aio_aiodone, which unbusies the pages
> among other things.
> -Chuck
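
For anyone following the thread, here is a rough user-space sketch of the completion path described above. It is a toy model, not the kernel code: every `model_*` name is hypothetical, the "workqueue" is a single slot drained synchronously, and the "queue not created yet" branch is a simplification of the boot-time situation, not what the kernel literally does. It only illustrates why pages stay busy when nothing ever runs the `b_iodone` callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPAGES 4

/* Hypothetical stand-in for a VM page: only the busy flag is modeled. */
struct model_page {
    bool busy;
};

/* Hypothetical stand-in for struct buf: a completion callback plus the
 * pages that are under I/O. */
struct model_buf {
    void (*b_iodone)(struct model_buf *);
    struct model_page *pages;
    size_t npages;
};

/* Hypothetical single-slot "workqueue"; created == false models the
 * window during boot before the queue exists. */
struct model_workqueue {
    bool created;
    struct model_buf *pending;
};

/* Plays the role of uvm_aio_aiodone: unbusies every page of the buffer. */
static void model_aio_aiodone(struct model_buf *bp)
{
    for (size_t i = 0; i < bp->npages; i++)
        bp->pages[i].busy = false;
}

/* Plays the role of uvm_aiodone_worker: runs the buffer's callback. */
static void model_aiodone_worker(struct model_buf *bp)
{
    assert(bp->b_iodone != NULL);
    bp->b_iodone(bp);
}

/* I/O completion hands the buffer to the queue. In this model the
 * hand-off is simply lost if the queue has not been created. */
static bool model_io_complete(struct model_workqueue *wq, struct model_buf *bp)
{
    if (!wq->created)
        return false;
    wq->pending = bp;
    return true;
}

/* Drain the queue: run the worker on the pending buffer, if any. */
static void model_workqueue_run(struct model_workqueue *wq)
{
    if (wq->created && wq->pending != NULL) {
        struct model_buf *bp = wq->pending;
        wq->pending = NULL;
        model_aiodone_worker(bp);
    }
}

/* Issue one asynchronous read and return how many pages are still busy
 * after completion has (or has not) been processed. */
static size_t model_async_read(struct model_workqueue *wq)
{
    struct model_page pages[MODEL_NPAGES];
    struct model_buf buf = { model_aio_aiodone, pages, MODEL_NPAGES };
    size_t still_busy = 0;

    for (size_t i = 0; i < MODEL_NPAGES; i++)
        pages[i].busy = true;           /* pages are busy while I/O is in flight */

    (void)model_io_complete(wq, &buf);  /* the "device" finishes the read */
    model_workqueue_run(wq);            /* worker runs only if the queue exists */

    for (size_t i = 0; i < MODEL_NPAGES; i++)
        if (pages[i].busy)
            still_busy++;
    return still_busy;
}
```

Calling `model_async_read()` with a queue whose `created` field is false leaves all `MODEL_NPAGES` pages busy, which is the shape of the hang; with `created` set to true the worker runs the callback and no page stays busy.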
