
Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)



see below.

On 04/26/24 16:20, Michael van Elst wrote:
The following reply was made to PR kern/57558; it has been noted by GNATS.

From: mlelstv%serpens.de@localhost (Michael van Elst)
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)
Date: Fri, 26 Apr 2024 14:19:37 -0000 (UTC)

  kardel%netbsd.org@localhost (Frank Kardel) writes:
  >Observed behavior:
  >pagedaemon runs at 100% (already at 550 minutes CPU and counting)
  I have local patches to avoid spinning. As the page daemon keeps
  data structures locked and runs at maximum priority, it prevents
  other tasks from releasing resources. That's only of limited help:
  if the page daemon really cannot free anything, the system is still
  locked up to some degree.
In this situation it spins only because the KVA starvation condition is met.
That triggers the pooldrain thread, which in turn attempts to drain the pools.
The ZFS pools call into arc.c:hdr_recl(), which wakes the arc_reclaim_thread.
In this scenario arc_available_memory() returns a positive value, so the
reclaim thread does not reclaim anything. The pagedaemon therefore loops, but
nothing improves as long as arc_available_memory() stays positive. At this
point the ZFS statistics list large amounts of evictable data.
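
To make the loop concrete, here is a minimal userland simulation of the
interaction described above. It is not kernel code: the *_sim names and all
numbers are made-up stand-ins for the uvm/pool/arc.c state, chosen only to
show why nothing progresses while the ARC sees no memory deficit.

/*
 * Simulated pagedaemon / pooldrain / ARC interaction (illustration only).
 */
#include <stdbool.h>
#include <stdio.h>

static long free_pages    = 4000;    /* plenty of free pages ...            */
static long free_target   = 1000;    /* ... well above the free target      */
static bool kva_starved   = true;    /* but kernel virtual space is short   */
static long arc_evictable = 500000;  /* ARC reports lots of evictable data  */

/* Stand-in for arc_available_memory(): positive means "no pressure seen". */
static long
arc_available_memory_sim(void)
{
	return free_pages - free_target;    /* stays positive in this scenario */
}

/* Stand-in for the ARC reclaim thread woken via hdr_recl(): it only
 * evicts when it sees a deficit, so here it returns without evicting. */
static void
arc_reclaim_sim(void)
{
	if (arc_available_memory_sim() > 0)
		return;                     /* nothing evicted, pools stay full */
	arc_evictable = 0;
}

/* Stand-in for the pooldrain thread: draining the ZFS pools ends up
 * just poking the ARC reclaim thread. */
static void
pooldrain_sim(void)
{
	arc_reclaim_sim();
}

int
main(void)
{
	/* Stand-in for the pagedaemon: KVA starvation keeps it looping
	 * even though the free page count is above the target. */
	for (int spin = 0; spin < 5 && kva_starved; spin++) {
		pooldrain_sim();
		printf("spin %d: available=%ld evictable=%ld -> no progress\n",
		    spin, arc_available_memory_sim(), arc_evictable);
	}
	return 0;
}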
  The main reason that the page daemon cannot free memory is
  that vnodes are not drained. This keeps the associated pools
  busy and buffers allocated by the file cache.
Well, this may not be the reason in this case - there is not even an attempt
made to let the reclaim thread evict data and drain pools. It doesn't get that far.
  Of course, if ZFS isn't throttled (and it would be less so, if
  others make room), it would just expunge the rest of the system
  data, so any improvement here just shifts the problem.
Well, ZFS currently likes to eat up all pool memory in this situation.
  >Once uvm_availmem() falls below uvmexp.freetarg the pagedaemon unwedges
  >from its tight loop as ZFS finally gives up its stranglehold on pool
  >memory.
  I locally added some of the arc tunables to experiment with the
  free_target value. The calculation of arc_c_max in arc.c also
  doesn't agree with the comments.
  Later ZFS versions did change a lot in this area. Anything we
  correct might need to be redone when we move to a newer ZFS
  code base.
Yes, but currently we seem to have a broken ZFS, at least in large-memory
environments. Effects I observed are:
    - this bug: the famous looping pagedaemon
    - PR kern/58198: ZFS can lead to UVM kills (no swap, out of swap)
    - I am still trying to find out what causes the whole system to slow down
      to a crawl, but at that point the system is too unusable to gather any
      information.

So something needs to be done.
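
For reference, the unwedge condition quoted further above (uvm_availmem()
falling below uvmexp.freetarg, at which point ZFS finally lets go of pool
memory) can be shown with the same kind of made-up numbers. Again, this is
only an illustration of the threshold, not kernel code; all names and values
are invented.

/*
 * Illustration of the unwedge condition: the ARC only starts evicting
 * once the (simulated) free page count drops below the free target,
 * which is why tuning the free_target side changes when the pagedaemon
 * gets unstuck.
 */
#include <stdio.h>

static long free_pages  = 800;     /* now below the target ...           */
static long free_target = 1000;    /* ... so a deficit is finally seen   */

int
main(void)
{
	long deficit = free_target - free_pages;

	if (deficit > 0)
		printf("deficit of %ld pages: ARC evicts, pagedaemon unwedges\n",
		    deficit);
	else
		printf("no deficit: ARC keeps its pools, pagedaemon keeps spinning\n");
	return 0;
}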

