NetBSD-Bugs archive
Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)
The following reply was made to PR kern/57558; it has been noted by GNATS.
From: Frank Kardel <kardel%netbsd.org@localhost>
To: gnats-bugs <gnats-bugs%NetBSD.org@localhost>
Cc:
Subject: Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)
Date: Sun, 5 May 2024 12:50:31 +0200
On 05/05/24 10:13, matthew green wrote:
> just to clear up what i think is a confusion.
>
> freetarg has nothing to do with KVA.
Correct - that is why we are running into this issue: ZFS currently
looks only at freetarg.
> that's about free memory
> (actual physical pages). KVA is a space that gets allocated out
> of, and sometimes that space has real pages backing it but not
> always (and sometimes the same page may be mapped more the once.)
> on 64-bit platforms, KVA is generally *huge*.
Yes.
>
> KVA starvation shouldn't happen -- we have many terabytes
> available for the KVA on amd64, and pool reclaim only happens
> when we run low on free pages.
Well, according to the code, starvation does happen, since in uvm/uvm_km.c
bool
uvm_km_va_starved_p(void)
{
	vmem_size_t total;
	vmem_size_t free;

	if (kmem_arena == NULL)
		return false;

	total = vmem_size(kmem_arena, VMEM_ALLOC|VMEM_FREE);
	free = vmem_size(kmem_arena, VMEM_FREE);

	return (free < (total / 10));
}
returns true. It may not be the kind of starvation you have in mind
(running out of vmem address space), but because it returns true it
drives the uvm_pageout pagedaemon process.
The difference in semantics here may be between the HUGE available KVA
address space and the presumably much smaller
vmem_size(kmem_arena, VMEM_ALLOC|VMEM_FREE) value, which is the basis
for the uvm_km_va_starved_p() predicate.
> KVA on amd64 already is large
> enough to map all of physical memory in the 'direct map' region,
> as well as other places as needed.
>
> (it sounds like zfs needs to be able to reclaim pages like other
> consumers?)
Yes, many other consumers (maybe even all non-ZFS consumers) give up idle
pages (and maybe even more) when asked to.
ZFS pool memory is currently only reclaimed when we fall below
uvmexp.freetarg; starvation is signaled long before that.
I think the ZFS reclaim strategy is not in line with the general pool
reclaim expectations.
It is also not synchronous: when arc_reclaim is triggered, a thread
merely starts the cleanup and evicts pages asynchronously. With swap
space available this is not overly critical.
> .mrg.
With this issue we need to look past the design ideas and known
invariants, examine the implementation, and find where the
implementation does not match those design ideas and invariants.
-Frank