Port-arm archive


Re: uvm/busy page deadlock in current (related to loading Raspberry Pi 3B+ Wi-Fi firmware, but more of a timing bug with the VM system)



>> 2) Change the read-ahead code to short-circuit if uvm.aiodone_queue is NULL.
> 
> That should fix the bug.

It certainly should make this particular case go away.

But I also feel like it's hiding a misbehavior that's hard to diagnose.  I've been curious about a few other options:

1) Perhaps genfs_getpages_read should return an error if it's been called async but async isn't available?  That would signal to the caller (there's only one -- genfs_getpages) that something went wrong and that it should deal with it.  But it would likely need to be a special error...

2) Actually, what if we moved the check ( uvm.aiodone_queue == NULL ) and consequent reset of async to false into genfs_getpages itself?  If that happened, then "async" would be false in the right place, and the code that's in place to free the pages should run.  Except...

3) But it doesn't look like the code that would mark them "unbusy" actually gets hit.  The code in genfs_getpages only marks pages "unbusy" if they're outside the range (or if async is set):

		if (i < ridx || i >= ridx + orignmempages || async) {

and these pages are all actually inside the range.  So they don't get "unbusied".
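
To make points 2 and 3 concrete, here is a toy userland model.  It is not the kernel code: it copies only the condition quoted above, and the function names (effective_async, released_by_loop, count_left_busy) are made up for illustration.  It assumes "async" would be forced to false in the caller whenever the queue pointer is NULL, as in option 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Option 2 (hypothetical helper): decide in the caller whether the
 * request can really be async.  The queue pointer stands in for
 * uvm.aiodone_queue.
 */
static bool
effective_async(bool requested_async, const void *aiodone_queue)
{
	return requested_async && aiodone_queue != NULL;
}

/*
 * Mirrors only the quoted condition:
 *   i < ridx || i >= ridx + orignmempages || async
 */
static bool
released_by_loop(size_t i, size_t ridx, size_t orignmempages, bool async)
{
	return i < ridx || i >= ridx + orignmempages || async;
}

/* Count the pages in [0, npages) that the quoted condition skips. */
static size_t
count_left_busy(size_t npages, size_t ridx, size_t orignmempages, bool async)
{
	size_t left = 0;

	assert(ridx + orignmempages <= npages);
	for (size_t i = 0; i < npages; i++) {
		if (!released_by_loop(i, ridx, orignmempages, async))
			left++;
	}
	return left;
}
```

With a NULL queue, effective_async() yields false, and count_left_busy() then reports every page inside [ridx, ridx + orignmempages) as skipped by this loop.  That is the situation described in point 3: something else has to un-busy those in-range pages.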

I think part of my problem is that I don't understand where, even when the "async==false" case works, the "un-busying" of pages is actually supposed to be done.  But THAT code needs to run in the "async gets switched to false" case for the bug to really be eradicated.

>> 3) Start "aiodone_queue" earlier in the sequence.
> 
> There are other dependencies that make this difficult. Currently it is like:
> 
> mount root -> start pageout daemon -> start filesystem syncing -> start AIO thread -> wait for config threads -> release init
>           \->    run config threads that depend on mounted root in parallel      -/

Which kind of task is the loading of the Broadcom firmware?  Is it in a config thread?  Is there a way for a given config thread to wait for AIO to start and then proceed?
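
If such a wait were possible, I imagine it would look something like the one-shot gate below.  This is a userland sketch using POSIX threads, purely for illustration: start_gate and its functions are names I made up, and the kernel would use its own mutex and condition-variable primitives instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-shot gate; a userland stand-in, not a kernel API. */
struct start_gate {
	pthread_mutex_t lock;
	pthread_cond_t  cv;
	bool            open;
};

#define START_GATE_INITIALIZER \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Called once by whoever brings the service up (here: the AIO thread). */
static void
start_gate_open(struct start_gate *g)
{
	assert(g != NULL);
	pthread_mutex_lock(&g->lock);
	g->open = true;
	pthread_cond_broadcast(&g->cv);	/* wake every waiter */
	pthread_mutex_unlock(&g->lock);
}

/* Called by a dependent thread; returns only once the gate is open. */
static void
start_gate_wait(struct start_gate *g)
{
	assert(g != NULL);
	pthread_mutex_lock(&g->lock);
	while (!g->open)		/* loop guards against spurious wakeups */
		pthread_cond_wait(&g->cv, &g->lock);
	pthread_mutex_unlock(&g->lock);
}

/* Non-blocking check, for callers that would rather fall back than wait. */
static bool
start_gate_is_open(struct start_gate *g)
{
	bool open;

	assert(g != NULL);
	pthread_mutex_lock(&g->lock);
	open = g->open;
	pthread_mutex_unlock(&g->lock);
	return open;
}
```

In this picture the firmware load would call start_gate_wait() before issuing any read that might go async, and the AIO startup path would call start_gate_open() once its queue exists.  Whether the real boot ordering allows that without creating a new deadlock is exactly what I don't know.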

>> I also don't know why others aren't seeing this -- although maybe they are.  I don't know how many people are trying out Wi-Fi on the 3B+.  Without the firmware files in the tree, maybe no one's really doing much with them yet.
> 
> The usual boot process from SD card seems to be too slow to trigger this.

I guess so.  But I was a little confused since all this stuff was happening synchronously before AIO was ready -- and it seemed like that would be true on SD as well...

Thank you for your help.

Rob


