NetBSD-Bugs archive
Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)
Frank Kardel <kardel%netbsd.org@localhost> writes:
[snip]
> TLDR:
> - pagedaemon aggressively starts pool draining once KVA free falls below 10%
> - ZFS won't free pool pages until free memory falls below uvmexp.freetarg.
> - there is a huge gap between uvmexp.freetarg and 10% KVA free
> increasing with larger memory(10%)
> - while below 10% KVA free ZFS eventually depletes all other pools that
> are cooperatively giving up pages
> causing all sorts of shortages in other areas (visible in e.g.
> network buffers)
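For readers following along, the two thresholds in the list above can be modelled in a few lines of userspace C. This is only a rough sketch, not NetBSD kernel code: the struct and the function names (`kva_low`, `zfs_would_reclaim`, `in_gap`) are made up for illustration, and it simplifies by treating free memory and free KVA as one page count. Only the 10% figure and `uvmexp.freetarg` come from the quoted text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of memory state, in pages (not a kernel struct). */
struct mem_state {
    uint64_t total;     /* total pages; stands in for the KVA size */
    uint64_t free;      /* currently free pages */
    uint64_t freetarg;  /* stands in for uvmexp.freetarg */
};

/* Pagedaemon side: pool draining starts once free falls below 10%. */
static bool kva_low(const struct mem_state *m)
{
    return m->free < m->total / 10;
}

/* ZFS side, as described: reclaim only once free falls below freetarg. */
static bool zfs_would_reclaim(const struct mem_state *m)
{
    return m->free < m->freetarg;
}

/* True inside the gap: other pools get drained while ZFS holds on. */
static bool in_gap(const struct mem_state *m)
{
    return kva_low(m) && !zfs_would_reclaim(m);
}
```

With made-up numbers such as total = 4194304 pages, free = 200000 and freetarg = 4096, `kva_low` is true while `zfs_would_reclaim` is false, so the model sits in the gap. Because the 10% mark scales with total memory, the gap widens as memory grows unless freetarg grows in proportion.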
This is a pretty good description of a problem I am/was seeing with the
daily cron checking for core files. On a DOMU with not a lot of memory,
12GB - 16GB and a WHOLE lot of ZFS filesets, this job would never
complete and the guest would appear to lock up (actually it may be any
job that did "find" that crossed into a ZFS fileset). To work around it
I ended up commenting out the daily job. The guest is my build system
for the OS and it would also start to bog down and would eventually hang
up after a few OS builds, but that was a more manageable situation.
With the simple kardel patch that was provided, the daily job could run
to completion and the system appears to be responsive after a couple of
days. I have not had time to run builds to see how that affects the
matter. The guest has 2 vcpus and I sometimes would abuse it pretty
hard by running 3 builds with -j2 on the build.sh line at the same time.
Very often the system would hang up at some point if I did this and I
had to back off and only run 1 or 2 at the same time.
> Mitigation: allow ZFS to detect free KVA memory falling below 10% to
> start reclaiming memory.
>
> It is not related to XEN at all. Just ZFS + large memory is sufficient
> for the problems to occur.
> Base issue is the big difference between 10% free KVA memory limit and
> uvmexp.freetarg.
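The mitigation quoted above could be pictured roughly as follows. This is a sketch under stated assumptions, not the actual patch: the struct, its field names, and both functions are hypothetical, and the real change may well be structured differently. It only illustrates the idea of letting the ZFS reclaim trigger also fire when free KVA drops below 10%.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs, in pages; these are not actual kernel variables. */
struct reclaim_inputs {
    uint64_t kva_total;  /* total KVA */
    uint64_t kva_free;   /* free KVA */
    uint64_t mem_free;   /* free memory */
    uint64_t freetarg;   /* stands in for uvmexp.freetarg */
};

/* Behaviour as described in the quoted analysis: freetarg only. */
static bool reclaim_before(const struct reclaim_inputs *in)
{
    return in->mem_free < in->freetarg;
}

/* Mitigated behaviour: also start when free KVA is below 10%. */
static bool reclaim_after(const struct reclaim_inputs *in)
{
    bool kva_pressure = in->kva_free < in->kva_total / 10;
    return kva_pressure || reclaim_before(in);
}
```

The point of the OR is that ZFS then begins giving pages back at the same moment the pagedaemon begins draining the other pools, instead of only after they have already been depleted.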
I am not sure that "large memory" needs to be all that large to prompt
the problem. The description of what happens when ZFS gobbles
everything up is pretty close to what I am seeing...
> I seem to explain the mechanism over and over again. And so far no one
> has verified this analysis.
>
> -Frank
>
--
Brad Spencer - brad%anduin.eldar.org@localhost - KC8VKS - http://anduin.eldar.org