NetBSD-Bugs archive


Re: kern/53124 (FFS is slow because pmap_update doesn't scale)



On Sat, Mar 24, 2018 at 07:10:03PM +0000, maxv%NetBSD.org@localhost wrote:

> My guess is that the 'pmap_update' you're talking about actually
> touches a user pmap, and not pmap_kernel.

ubc_alloc touches pmap_kernel when deleting old mappings in the UBC
window. It is this call that is slow:

	/*
	 * Mapping must be removed before the list entry,
	 * since there is a race with ubc_purge().
	 */
	if (umap->flags & UMAP_MAPPING_CACHED) {
		umap->flags &= ~UMAP_MAPPING_CACHED;
		mutex_enter(oobj->vmobjlock);
		pmap_remove(pmap_kernel(), va,
		    va + ubc_winsize);
		pmap_update(pmap_kernel());
		mutex_exit(oobj->vmobjlock);
	}


You can easily mitigate the effect by increasing UBC_WINSHIFT, i.e.
increasing ubc_winsize, for sequential reads.
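
A rough sketch of why that helps: a sequential read maps one UBC window
after another, and each recycled window pays for one pmap_remove() /
pmap_update() pair as in the snippet above. The helper below is purely
illustrative and not kernel code; the name ubc_windows() is made up here,
and it assumes the transfer starts on a window boundary.

```c
#include <stdint.h>

/*
 * Illustration only (hypothetical helper, not from the kernel):
 * number of UBC windows a sequential transfer of nbytes crosses when
 * each window is (1 << winshift) bytes and the transfer starts on a
 * window boundary.  Each recycled window implies one
 * pmap_remove()/pmap_update() pair.
 */
static uint64_t
ubc_windows(uint64_t nbytes, unsigned winshift)
{
	uint64_t winsize = (uint64_t)1 << winshift;

	/* Round up so a partially used last window is counted. */
	return (nbytes + winsize - 1) / winsize;
}
```

For a 1 GB read, ubc_windows() gives 131072 windows with a shift of 13
and 16384 with a shift of 16, so every step up in the shift halves the
number of pmap_update() calls. The shift values are picked only for the
example.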


> By using N-1 user threads, you are forcing a kern-lwp -> user-lwp
> transition on each core, and after that your pmap does not need to be
> synchronized there anymore; so the latency disappears.

Apparently the kernel pmap update isn't synchronized either.


> But this guess would have to be verified. You should probably try to
> assign your program to a given core - and this, early, _before_ your
> program starts doing heavy stuff. schedctl, or pset would be even
> better. If I'm right, it should "fix" the slowdown.

Why would the synchronization with other CPUs go away?

But no, binding the dd process to a single cpu doesn't change anything.

N.B. even putting cpus offline doesn't change anything.


Greetings,
-- 
                                Michael van Elst
Internet: mlelstv%serpens.de@localhost
                                "A potential Snark may lurk in every tree."

