NetBSD-Bugs archive


kern/57558: pgdaemon 100% busy - no scanning (ZFS case)



>Number:         57558
>Category:       kern
>Synopsis:       pgdaemon 100% busy - no scanning (ZFS case)
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Aug 03 08:45:00 +0000 2023
>Originator:     Frank Kardel
>Release:        NetBSD 10.0_BETA / current
>Organization:
	
>Environment:
	
	
System: NetBSD Marmolata 10.0_BETA NetBSD 10.0_BETA (XEN3_DOM0) #1: Thu Jul 27 18:30:30 CEST 2023 kardel@gaia:/src/NetBSD/n10/src/obj.amd64/sys/arch/amd64/compile/XEN3_DOM0 amd64
Architecture: x86_64
Machine: amd64
>Description:
	It has been observed that pgdaemon can get into a tight loop,
	consuming 100% CPU, not yielding, and blocking other runnable
	threads on the CPU (see PRs kern/56516 (for the effects; may have another cause) and kern/55707).
	This analysis and the proposed fix relate to the pgdaemon loop caused by KVA exhaustion
	by ZFS.

	Observed and analyzed in the following environment (should be reproducible in simpler
	environments):
		XEN3_DOM0 providing vnd devices based on files on ZFS.
		GENERIC(pvh) DOMU using an ffs file system based on the vnd in XEN3_DOM0.

	Observed actions/effects:
		1) run a database on the ffs file system in XEN3_DOM0
		2) load a larger database
		3) XEN3_DOM0 is fine until ZFS has allocated 90% of KVA;
		   at this point pgdaemon kicks in and enters a tight loop.
		4) pgdaemon does not do page scans (enough memory is available)
		5) pgdaemon loops as uvm_km_va_starved_p() returns true
		6) pool_drain is unable to reclaim any idle pages from the pools
		7) uvm_km_va_starved_p() thus keeps returning true - pgdaemon keeps looping
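
The loop in steps 5) to 7) can be modelled in user space. The sketch below assumes that the starvation test is "less than 10% of the arena is free", which matches the 90% figure in step 3). The struct and function names are hypothetical and are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the kmem arena; not a kernel structure. */
struct kva_model {
	size_t total;	/* arena size in pages */
	size_t free;	/* free pages in the arena */
};

/* Assumed starvation rule: less than 10% of the arena is free. */
static bool
kva_starved(const struct kva_model *a)
{
	assert(a->free <= a->total);
	return a->free < a->total / 10;
}

/*
 * Model of the pgdaemon loop: keep draining while starved.
 * drained_per_pass is what one drain pass gives back to the arena
 * (0 in the failure case described above).  Returns the number of
 * passes used; max_passes means the model never left the starved state.
 */
static unsigned
pgdaemon_passes(struct kva_model *a, size_t drained_per_pass,
    unsigned max_passes)
{
	unsigned n = 0;

	while (kva_starved(a) && n < max_passes) {
		a->free += drained_per_pass;
		if (a->free > a->total)
			a->free = a->total;
		n++;
	}
	return n;
}
```

With drained_per_pass set to 0 the model only stops at max_passes, which corresponds to the busy loop seen in the report.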

	Analyzed causes:
		- pool_drain causes upcalls to ZFS reclaim logic
		- ZFS reclaim logic does not reclaim anything, as the current code
		  looks at uvm_availmem(false), which reports plenty of free memory;
		  thus no attempt to free memory in ZFS is made and no KVA is reclaimed.

	Conclusion:
		- using uvm_availmem(false) for ZFS memory throttling is wrong, as
		  ZFS memory is allocated from kmem KVA pools.
		- the ZFS ARC must use KVA memory for its memory checks
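
To make the mismatch concrete, here is a minimal sketch with made-up numbers. The field phys_free_pages stands in for what uvm_availmem(false) reports and kva_free_pages for the free part of the kmem arena. All names are illustrative assumptions, and a single threshold is used for both checks to keep the comparison simple.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the two memory views, in pages. */
struct mem_view {
	int64_t phys_free_pages;  /* stand-in for uvm_availmem(false) */
	int64_t kva_free_pages;   /* stand-in for free kmem arena pages */
	int64_t free_target;      /* below this the ARC should shrink */
};

/* Old check: compares free physical pages with the target. */
static bool
arc_should_shrink_phys(const struct mem_view *v)
{
	assert(v->free_target >= 0);
	return v->phys_free_pages < v->free_target;
}

/* Proposed check: compares free KVA pages with the target. */
static bool
arc_should_shrink_kva(const struct mem_view *v)
{
	assert(v->free_target >= 0);
	return v->kva_free_pages < v->free_target;
}
```

For a snapshot such as { 500000, 9000, 25000 } the physical-page check stays quiet while the KVA check asks for a shrink, which is the situation analyzed above.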

>How-To-Repeat:
	Run a DB load on an FFS file system that lives on a vnd backed by a file on ZFS.
>Fix:
	Patch 1:
		let ZFS use a correct view of KVA memory:
		With this patch ARC reclaim now detects the memory shortage and
		frees pages. Also, the KVA used by ZFS is limited to
		75% of KVA (this could be made tunable).

	Patch 1 is not sufficient though. The ARC reclaim thread kicks in at 75%
	correctly, but pages are not fully reclaimed and ZFS depletes its cache
	fully, as the freed and now idle pages are not yet reclaimed from the pools.
	pgdaemon will not trigger pool_drain now, as uvm_km_va_starved_p() returns false
	at this point.
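
Why freed ARC buffers do not immediately show up as free KVA can be illustrated with a toy pool model. This is only a sketch: the struct and functions are hypothetical and merely mimic the idea that a pool keeps idle pages until something reclaims them.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical toy pool: a page is either in use, idle inside the
 * pool, or free in the arena.
 */
struct toy_pool {
	size_t inuse_pages;	/* pages holding live objects */
	size_t idle_pages;	/* pages freed by the consumer, kept by the pool */
	size_t arena_free;	/* free pages visible to the KVA arena */
};

/* Consumer frees n pages worth of objects: they become idle, not free KVA. */
static void
toy_pool_free(struct toy_pool *p, size_t n)
{
	assert(n <= p->inuse_pages);
	p->inuse_pages -= n;
	p->idle_pages += n;
}

/* Reclaim: hand all idle pages back to the arena.  Returns pages released. */
static size_t
toy_pool_reclaim(struct toy_pool *p)
{
	size_t n = p->idle_pages;

	p->idle_pages = 0;
	p->arena_free += n;
	return n;
}
```

In this model a call to toy_pool_free() alone leaves arena_free unchanged, so a consumer that watches free KVA keeps freeing until its cache is empty.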

	To reclaim the freed pages directly we need
	Patch 2:
		force page reclaim
	which performs the reclaim.

	With both fixes the ARC reclaim thread kicks in at 75% KVA usage and
	reclaims only enough memory to not exceed 75% KVA.
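
The 75% figure follows from the macros in Patch 1: desfree is a quarter of the arena, so reclaim starts once free KVA drops below 25%. Below is a minimal user-space sketch of that arithmetic. It assumes that the ARC compares free pages with zfs_arc_free_target and gives back the difference; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* desfree as defined by Patch 1: one quarter of the arena, in pages. */
static int64_t
model_desfree(int64_t arena_pages)
{
	assert(arena_pages >= 0);
	return arena_pages / 4;
}

/*
 * Pages the ARC would need to give back so that free KVA reaches the
 * target again; 0 if there is no shortage.
 */
static int64_t
model_reclaim_needed(int64_t arena_pages, int64_t free_pages)
{
	int64_t deficit = model_desfree(arena_pages) - free_pages;

	return deficit > 0 ? deficit : 0;
}
```

For an arena of 1000 pages the target is 250 free pages; with 200 pages free the model asks for 50 pages back, and with 300 pages free it asks for none.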

	Any comments?

	OK to commit? (happens automatically on no feedback)

Index: external/cddl/osnet/dist/uts/common/fs/zfs/arc.c
===================================================================
RCS file: /cvsroot/src/external/cddl/osnet/dist/uts/common/fs/zfs/arc.c,v
retrieving revision 1.22
diff -c -u -r1.22 arc.c
--- external/cddl/osnet/dist/uts/common/fs/zfs/arc.c	3 Aug 2022 01:53:06 -0000	1.22
+++ external/cddl/osnet/dist/uts/common/fs/zfs/arc.c	3 Aug 2023 08:19:11 -0000
@@ -276,6 +276,7 @@
 #endif /* illumos */
 
 #ifdef __NetBSD__
+#include <sys/vmem.h>
 #include <uvm/uvm.h>
 #ifndef btop
 #define	btop(x)		((x) / PAGE_SIZE)
@@ -285,9 +286,9 @@
 #endif
 //#define	needfree	(uvm_availmem() < uvmexp.freetarg ? uvmexp.freetarg : 0)
 #define	buf_init	arc_buf_init
-#define	freemem		uvm_availmem(false)
+#define	freemem		btop(vmem_size(kmem_arena, VMEM_FREE))
 #define	minfree		uvmexp.freemin
-#define	desfree		uvmexp.freetarg
+#define	desfree		(btop(vmem_size(kmem_arena, VMEM_ALLOC|VMEM_FREE)) / 4)
 #define	zfs_arc_free_target desfree
 #define	lotsfree	(desfree * 2)
 #define	availrmem	desfree


	Patch 2:
		force reclaiming of pages on affected pools

Index: external/cddl/osnet/sys/kern/kmem.c
===================================================================
RCS file: /cvsroot/src/external/cddl/osnet/sys/kern/kmem.c,v
retrieving revision 1.3
diff -c -u -r1.3 kmem.c
--- external/cddl/osnet/sys/kern/kmem.c	11 Nov 2020 03:31:04 -0000	1.3
+++ external/cddl/osnet/sys/kern/kmem.c	3 Aug 2023 08:19:11 -0000
@@ -124,6 +124,7 @@
 {
 
 	pool_cache_invalidate(km->km_pool);
+	pool_cache_reclaim(km->km_pool);
 }
 
 #undef kmem_alloc

>Unformatted:
 	
 	

