NetBSD-Bugs archive


Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)



The following reply was made to PR kern/57558; it has been noted by GNATS.

From: Frank Kardel <kardel%netbsd.org@localhost>
To: gnats-bugs <gnats-bugs%NetBSD.org@localhost>
Cc: 
Subject: Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)
Date: Sun, 5 May 2024 12:50:31 +0200

 On 05/05/24 10:13, matthew green wrote:
 
 > just to clear up what i think is a confusion.
 >
 > freetarg has nothing to do with KVA.
 Correct - that is why we are running into the issue, as ZFS currently
 looks only at freetarg.
 >    that's about free memory
 > (actual physical pages).  KVA is a space that gets allocated out
 > of, and sometimes that space has real pages backing it but not
 > always (and sometimes the same page may be mapped more the once.)
 > on 64-bit platforms, KVA is generally *huge*.
 Yes.
 >
 > KVA starvation shouldn't happen -- we have many terabytes
 > available for the KVA on amd64, and pool reclaim only happens
 > when we run low on free pages.
 Well, according to the code, starvation does happen, as this code in
 uvm/uvm_km.c
 bool
 uvm_km_va_starved_p(void)
 {
          vmem_size_t total;
          vmem_size_t free;
 
          if (kmem_arena == NULL)
                  return false;
 
          total = vmem_size(kmem_arena, VMEM_ALLOC|VMEM_FREE);
          free = vmem_size(kmem_arena, VMEM_FREE);
 
          return (free < (total / 10));
 }
 
 returns true. It may not be the kind of starvation you have in mind
 (running out of vmem address space), but as it returns true it affects
 the uvm_pageout pagedaemon process.
 
 The difference in semantics here may be between the available KVA
 address space, which is HUGE, and the presumably much smaller
 vmem_size(kmem_arena, VMEM_ALLOC|VMEM_FREE) value, which is the basis
 for the uvm_km_va_starved_p() predicate.
 >    KVA on amd64 already is large
 > enough to map all of physical memory in the 'direct map' region,
 > as well as other places as needed.
 >
 > (it sounds like zfs needs to be able to reclaim pages like other
 > consumers?)
 Yes, many other consumers (maybe even all non-ZFS consumers) give up
 idle pages (and maybe even more) when asked to.
 ZFS pool memory is currently only reclaimed when we fall below
 uvmexp.freetarg; starvation is signaled long before that.
 I think the ZFS reclaim strategy is not in line with the general pool
 reclaim expectations.
 It is also not synchronous: when arc_reclaim is triggered, only a
 thread is started that does the cleanup and evicts pages. With swap
 space available this is not overly critical.
 > .mrg.
 With this issue we need to look past the design ideas and known
 invariants, examine the implementation, and find where the
 implementation does not match those design ideas and invariants.
 
 -Frank
 

