NetBSD-Bugs archive
Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)
The following reply was made to PR kern/57558; it has been noted by GNATS.
From: Frank Kardel <kardel%netbsd.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)
Date: Thu, 3 Aug 2023 20:22:10 +0200
Hi Chuck!
Thanks for looking into that.
I came up with the first patch because the pgdaemon was looping with
uvm_km_va_starved_p() returning true.
vmstat -m shows that the pool statistics sum up to nearly the 32 GB my
DOM0 has.
Counting the conditions while the pgdaemon is looping gives:
/var/log/messages.0.gz:Jul 28 17:42:41 Marmolata /netbsd: [ 9789.2242179] pagedaemon: loops=16026699, cnt_needsfree=0, cnt_needsscan=0, cnt_drain=16026699, cnt_starved=16026699, cnt_avail=16026699, fpages=337385
/var/log/messages.0.gz:Jul 28 17:42:41 Marmolata /netbsd: [ 9795.2244437] pagedaemon: loops=16024007, cnt_needsfree=0, cnt_needsscan=0, cnt_drain=16024007, cnt_starved=16024007, cnt_avail=16024007, fpages=335307
/var/log/messages.0.gz:Jul 28 17:42:41 Marmolata /netbsd: [ 9801.2246381] pagedaemon: loops=16031141, cnt_needsfree=0, cnt_needsscan=0, cnt_drain=16031141, cnt_starved=16031141, cnt_avail=16031141, fpages=335331
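For reference, those counters come from local debugging instrumentation.
The sketch below shows roughly what is being counted; the counter names
match the log fields, but the function name and its exact placement in the
pagedaemon loop are hypothetical and the real debug patch may differ.

/*
 * Hypothetical sketch of the debug counters shown above, called once
 * per pass through the pagedaemon main loop.  The totals are printed
 * (and reset) by a periodic report, roughly every six seconds in the
 * log above.
 */
static uint64_t cnt_loops, cnt_needsfree, cnt_needsscan;
static uint64_t cnt_drain, cnt_starved, cnt_avail;

static void
uvmpd_debug_count(bool needsfree, bool needsscan, bool do_drain,
    bool kmem_va_starved, int fpages)
{
        cnt_loops++;
        if (needsfree)
                cnt_needsfree++;
        if (needsscan)
                cnt_needsscan++;
        if (do_drain)
                cnt_drain++;
        if (kmem_va_starved)
                cnt_starved++;
        if (fpages > uvmexp.freetarg)   /* a guess at what cnt_avail counts */
                cnt_avail++;
}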
bool
uvm_km_va_starved_p(void)
{
        vmem_size_t total;
        vmem_size_t free;

        if (kmem_arena == NULL)
                return false;

        total = vmem_size(kmem_arena, VMEM_ALLOC|VMEM_FREE);
        free = vmem_size(kmem_arena, VMEM_FREE);

        return (free < (total / 10));
}
int
uvm_availmem(bool cached)
{
        int64_t fp;

        cpu_count_sync(cached);
        if ((fp = cpu_count_get(CPU_COUNT_FREEPAGES)) < 0) {
                /*
                 * XXXAD could briefly go negative because it's impossible
                 * to get a clean snapshot.  address this for other counters
                 * used as running totals before NetBSD 10 although less
                 * important for those.
                 */
                fp = 0;
        }
        return (int)fp;
}
So, while uvm_km_va_starved_p() considers almost all memory used up,
uvm_availmem(false) returns 337385 free pages (~1.28 GB), well above
uvmexp.freetarg.
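That combination is what makes the daemon spin. Below is a simplified
paraphrase of the decision logic in uvm_pageout() (sys/uvm/uvm_pdaemon.c) -
not verbatim kernel code; the lower-case helper names are placeholders:

for (;;) {
        bool kmem_va_starved = uvm_km_va_starved_p();

        /* sleep only if nobody is waiting for pages and KVA is not starved */
        if (nobody_needs_pages && !kmem_va_starved)
                sleep_until_woken();

        fpages = uvm_availmem(false);
        needsfree = fpages + uvmexp.paging < uvmexp.freetarg;
        needsscan = needsfree || uvmpdpol_needsscan_p();

        if (needsscan)
                scan_page_queues();             /* uvmpd_scan() */

        /* nothing more to do unless short of free pages or KVA */
        if (!needsfree && !kmem_va_starved)
                continue;

        /* try to reclaim memory: kill unused buffers, drain the pools */
        buf_drain(...);
        pool_drain(...);
}

With the numbers above, kmem_va_starved stays true, so the daemon never
sleeps; fpages stays far above uvmexp.freetarg, so needsfree and needsscan
stay false and nothing gets scanned; every iteration falls through to
pool_drain(), which makes no progress. That matches the counters:
cnt_starved == cnt_drain == loops and cnt_needsfree == cnt_needsscan == 0.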
So, why do we count so many free pages when the free vmem of kmem_arena
is less than 10% of the total kmem_arena?
Maybe the pool pages have been allocated but not yet referenced - I
didn't look that deeply into the vmem/ZFS interaction.
I understand the reasoning why making the kmem size equal to the physical
memory size should have worked. There are still inconsistencies, though.
Even if uvm_availmem(false) accounted for all pages allocated/reserved
in the kmem_arena vmem on the 32 GB system, the actual free target
(uvmexp.freetarg) is only 2730 free pages (~10.7 MB).
10% of 32 GB would be 3.2 GB, which is many times the free-page target.
So even then we would be stuck with a looping page daemon.
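Spelled out, assuming 4 KiB pages (which matches 2730 pages ~= 10.7 MB):

    starvation threshold:  10% of 32 GB  =  3.2 GB  ~=  838860 pages
    uvmexp.freetarg:       2730 pages    ~=  10.7 MB
    ratio:                 roughly 300x

That is, the page daemon stops considering itself short of free pages
roughly 300 times (in page counts) before the kmem_arena could climb back
above the 10% free mark, so the starved condition alone would keep it
spinning.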
I think we need to find a better way of coping with the accounting
differences between vmem free space and uvm free pages. Looking at the
vmem statistics seemed logical to me, as ZFS allocates almost everything
from kmem_arena via pools.
I don't know what vmem does when there are fewer physical pages available
than the vmem allocation would require. That was the case you tried to
avoid.
So, looking at the vmem statistics seems to be consistent with the starved
flag logic - that is why it does not trigger the looping pgdaemon. What
isn't covered is the case of fewer physical pages than the pool
allocations require.
I think we have yet to find a correct, robust solution that does not
trigger the pgdaemon's near-infinite loop.
Frank
On 08/03/23 18:30, Chuck Silvers wrote:
> The following reply was made to PR kern/57558; it has been noted by GNATS.
>
> From: Chuck Silvers <chuq%chuq.com@localhost>
> To: gnats-bugs%netbsd.org@localhost
> Cc:
> Subject: Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)
> Date: Thu, 3 Aug 2023 09:27:50 -0700
>
> On Thu, Aug 03, 2023 at 08:45:01AM +0000, kardel%netbsd.org@localhost wrote:
> > Patch 1:
> > let ZFS use a correct view on KVA memory:
> > With this patch arc reclaim now detects memory shortage and
> > frees pages. Also the ZFS KVA used by ZFS is limited to
> > 75% KVA - could be made tunable
> >
> > Patch 1 is not sufficient though. arc reclaim thread kicks in at 75%
> > correctly, but pages are not fully reclaimed and ZFS depletes its cache
> > fully as the freed and now idle page are not reclaimed from the pools yet.
> > pgdaemon will now not trigger pool_drain, as uvm_km_va_starved_p() returns false
> > at this point.
>
> this patch is not correct. it does not do the right thing when there
> is plenty of KVA but a shortage of physical pages. the goal with
> previous fixes for ZFS ARC memory management problems was to prevent
> KVA shortages by making KVA big enough to map all of RAM, and thus
> avoid the need to consider KVA because we would always run low on
> physical pages before we would run low on KVA. but apparently in your
> environment that is not working. maybe we do something differently in
> a XEN kernel that we need to account for?
>
>
> > To reclaim the pages freed directly we need
> > Patch 2:
> > force page reclaim
> > that will perform the reclaim.
>
> this second patch is fine.
>
> -Chuck
>