tech-kern archive


Re: Unexpected out of memory kills when running parallel find instances over millions of files



On Fri, Oct 20, 2023 at 10:26:05PM +0200, Reinoud Zandijk wrote:
> Hi,
> 
> On Thu, Oct 19, 2023 at 11:20:02AM +0200, Mateusz Guzik wrote:
> > Running 20 find(1) instances, where each has a "private" tree with
> > millions of files runs into trouble with the kernel killing them (and
> > others):
> > [   785.194378] UVM: pid 1998.1998 (find), uid 0 killed: out of swap
> > [   785.194378] UVM: pid 2010.2010 (find), uid 0 killed: out of swap
> > [   785.224675] UVM: pid 1771.1771 (top), uid 0 killed: out of swap
> > [   785.285291] UVM: pid 1960.1960 (zsh), uid 0 killed: out of swap
> > [   785.376172] UVM: pid 2013.2013 (find), uid 0 killed: out of swap
> > [   785.416572] UVM: pid 1760.1760 (find), uid 0 killed: out of swap
> > [   785.416572] UVM: pid 1683.1683 (tmux), uid 0 killed: out of swap
> > 
> > This should not be happening -- there is tons of reusable RAM as
> > virtually all of the vnodes getting here are immediately recyclable.
> > 
> > $elsewhere I got a report of a workload with hundreds of millions of
> > files which get walked in parallel -- a number high enough that it
> > does not fit in RAM on boxes which run it. Out of curiosity I figured
> > I'll check how others are doing on the front, but key is that this is
> > not a made up problem.
> 
> I can second that. I have had UVM killing my X11 when visiting millions of
> files; it might have been using rump but I am not sure.
> 
> What struck me was that swap was maxed out but systat showed something like
> 40 GB as `File'. I haven't looked at the Meta percentage but it wouldn't
> surprise me if that was also high. Just some random snippet:

I've seen it too, although it didn't end up killing processes.
But the nightly jobs (the usual daily/security runs plus backup) end up
pushing lots of processes to swap, while the file cache grows to more
than half the RAM (I have 16 GB). As a result the machine is really slow
and none of the nightly jobs complete before morning.

Decreasing kern.maxvnodes helps a lot.
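
For anyone who wants to try this, here is a minimal sketch of how the
limit could be read and lowered with sysctl(8). The halving factor, the
floor of 10000 and the helper names are assumptions made for
illustration only, not values taken from this thread; pick a target that
suits the machine. Setting the value needs root.

```python
import subprocess


def read_maxvnodes():
    """Return the current kern.maxvnodes value as reported by sysctl(8)."""
    out = subprocess.run(
        ["sysctl", "-n", "kern.maxvnodes"],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())


def lowered_maxvnodes_cmd(current, factor=2, floor=10000):
    """Build the sysctl(8) argv that lowers kern.maxvnodes.

    factor and floor are hypothetical defaults, not recommendations.
    """
    target = max(current // factor, floor)
    return ["sysctl", "-w", f"kern.maxvnodes={target}"]


def apply_lowered_maxvnodes():
    """Read the current limit and apply the lowered one (run as root)."""
    cmd = lowered_maxvnodes_cmd(read_maxvnodes())
    subprocess.run(cmd, check=True)
    return cmd
```

Nothing is changed until apply_lowered_maxvnodes() is called, so the
command can be inspected first with
lowered_maxvnodes_cmd(read_maxvnodes()). A change made this way does not
survive a reboot.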

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference
--

