Re: kern/54727: writing a large file causes unreasonable system behaviour

On Mon, Dec 09, 2019 at 11:25:01AM +0000, J. Hannken-Illjes wrote:

>  This happens for me too.  Looks like:
>  - we are low on memory.
>  - nearly all pages are active and belong to ONE vnode (the large file
>    we are currently creating).
>  - pagedaemon ends up in uvmpdpol_balancequeue() to increase the number
>    of inactive pages.
>  - often the one vnode v_interlock is held by another thread
>    so uvmpd_trylockowner(p) fails for nearly all active pages.
>  - the pagedaemon starts busy looping until it finds this vnode
>    unlocked and everything proceeds.

I agree with this assessment.  It looks very likely to me.  In addition to
v_interlock being busy simply due to legitimate activity, with more than one
CPU the lock pressure here can be immense because of the pagedaemon itself:

	pagedaemon holds uvm_pageqlock and is pondering deeply

		-> tries to acquire v_interlock in relation to
		   a specific single page (reasonable enough thing
		   to try), keeps retrying

	busily writing process holds v_interlock

		-> is already waiting on uvm_pageqlock for uvm_page*(),
		   in relation to a totally different page (unreasonable)

The UVM locking changes I have may help with this, because it may be
possible for the pagedaemon to drop the replacement for uvm_pageqlock
while it is trying to acquire the uobject lock.  In other words, for the
duration of the reverse locking dance, the pagedaemon's bothersome
activities would be confined to a single page, which would allow the writing
process a chance to continue its work with other pages and drop the uobject
lock.  Those will hopefully be ready for review later today.

(That's no use for 9.0 though.)
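
For illustration only, the reverse locking dance described above might look
roughly like the following user-space sketch.  It uses pthreads in place of
kernel mutexes, and struct object, struct page, queue_lock and
lock_owner_backoff() are all made-up names standing in for the vnode/uobject,
the page, uvm_pageqlock (or its replacement) and the pagedaemon's
lock-the-owner step.  It is not the actual UVM code, and it assumes the object
cannot be freed while the queue lock is dropped, which a real kernel would
have to guarantee by some other means.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real UVM structures. */
struct object {
	pthread_mutex_t lock;		/* plays the role of v_interlock */
};

struct page {
	struct object *owner;		/* object currently owning the page */
};

/* Plays the role of uvm_pageqlock (or whatever replaces it). */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Called with queue_lock held.  Returns true with both queue_lock and
 * the owner's lock held, or false with only queue_lock held if the page
 * has no owner or changed owner while we were waiting.
 *
 * Assumption: the object stays allocated while queue_lock is dropped.
 */
static bool
lock_owner_backoff(struct page *pg)
{
	struct object *obj;

	assert(pg != NULL);
	obj = pg->owner;
	if (obj == NULL)
		return false;

	/* Fast path: the owner is not contended, take it out of order. */
	if (pthread_mutex_trylock(&obj->lock) == 0)
		return true;

	/*
	 * Slow path: instead of retrying in a tight loop while holding
	 * queue_lock, let go of it so that a writer holding obj->lock and
	 * waiting for queue_lock can finish and release the object.
	 */
	pthread_mutex_unlock(&queue_lock);
	pthread_mutex_lock(&obj->lock);		/* block, do not spin */
	pthread_mutex_lock(&queue_lock);	/* normal order: object, queue */

	/* The page may have moved while nothing protected it; recheck. */
	if (pg->owner != obj) {
		pthread_mutex_unlock(&obj->lock);
		return false;
	}
	return true;
}
```

The point of the slow path is that the pagedaemon sleeps on the object lock
in the same order the writer uses (object first, then queue), so the two no
longer fight over the queue lock while the pagedaemon is stuck on one page.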


