tech-kern archive


Re: Unexpected out of memory kills when running parallel find instances over millions of files



On Thu, Oct 19, 2023 at 10:49:37AM +0000, Michael van Elst wrote:
> mjguzik%gmail.com@localhost (Mateusz Guzik) writes:
> 
> >Running 20 find(1) instances, where each has a "private" tree with
> >millions of files, runs into trouble with the kernel killing them (and
> >others):
> >[   785.194378] UVM: pid 1998.1998 (find), uid 0 killed: out of swap
> 
> 
> >This should not be happening -- there is tons of reusable RAM as
> >virtually all of the vnodes getting here are immediately recyclable.
> 
> While vnodes would be recyclable, they hardly get recycled unless
> a filesystem object is deleted or the filesystem is unmounted.
> 

They get recycled all the time by the vdrain thread once numvnodes goes
above desiredvnodes, as it does in this test.
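
To make the mechanism concrete, here is a rough sketch of the vdrain
idea -- numvnodes and desiredvnodes are the real tunables, but the
helpers are made up and this is not the actual vfs_vnode.c code:

struct vnode;
extern int numvnodes, desiredvnodes;	/* types simplified */

static void
vdrain_sketch(void)
{
	struct vnode *vp;

	for (;;) {
		/* Sleep until the cache exceeds its target size. */
		while (numvnodes <= desiredvnodes)
			wait_for_allocation();	/* hypothetical */

		/* Recycle the oldest unreferenced vnode, if any. */
		vp = take_from_free_list();	/* hypothetical */
		if (vp != NULL)
			recycle_and_free(vp);	/* drops numvnodes */
	}
}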

Part of the problem is that with 20 processes walking the filesystem,
the vdrain thread is outpaced (20:1) and has no way to throttle the
walkers, while the memory allocator just keeps handing out new vnodes.
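
One way to picture the missing backpressure: if the allocation path
blocked once the cache ran far enough past its target, twenty walkers
could not outrun a single drain thread. Purely a sketch, none of these
helpers exist as such:

static struct vnode *
vnalloc_throttled_sketch(void)
{
	/*
	 * Today every caller effectively bumps numvnodes without
	 * looking back. Blocking here once the cache is well past
	 * its target would let vdrain catch up.
	 */
	while (numvnodes > desiredvnodes + desiredvnodes / 8)
		wait_for_vdrain();	/* wake vdrain, then sleep */

	return alloc_fresh_vnode();	/* hypothetical */
}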

> >Specs are 24 cores, 24G of RAM and ufs2 with noatime. swap is *not* configured.
> 
> Without swap, the kernel also has no chance to evict process pages
> to grow the vnode cache further.
> 

It should not be trying to grow the vnode cache. If anything, it should
keep the cache from blowing out of proportion, and it definitely should
not kill processes while swaths of immediately freeable vnodes are
sitting around.

As noted above, there is code (the vdrain thread) that tries to do this,
but it is not sufficient.

I tested several systems (Linux, all the BSDs, and even Illumos), and
only NetBSD fails to complete the run. That is to say, even OpenBSD
chugs along with no problem.

This is definitely a reliability problem in the kernel.

Traditionally, vnode allocation would recycle something from the "free"
list if need be. Perhaps restoring this behavior is the easiest way out
for the time being.
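
Roughly along these lines (again a sketch with made-up helpers, not a
patch):

static struct vnode *
getnewvnode_sketch(void)
{
	struct vnode *vp;

	/* At or past the target: reuse instead of allocating. */
	if (numvnodes >= desiredvnodes &&
	    (vp = take_from_free_list()) != NULL) {
		clean_old_identity(vp);	/* flush old inode state */
		return vp;		/* numvnodes stays flat */
	}

	/* Below target, or nothing reusable: allocate fresh. */
	return alloc_fresh_vnode();
}

That way memory use stays bounded by desiredvnodes no matter how many
walkers are running.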


