tech-kern archive


Unexpected out of memory kills when running parallel find instances over millions of files



Running 20 find(1) instances, each over its own "private" tree with a
million files, runs into trouble with the kernel killing them (and
other processes):
[   785.194378] UVM: pid 1998.1998 (find), uid 0 killed: out of swap
[   785.194378] UVM: pid 2010.2010 (find), uid 0 killed: out of swap
[   785.224675] UVM: pid 1771.1771 (top), uid 0 killed: out of swap
[   785.285291] UVM: pid 1960.1960 (zsh), uid 0 killed: out of swap
[   785.376172] UVM: pid 2013.2013 (find), uid 0 killed: out of swap
[   785.416572] UVM: pid 1760.1760 (find), uid 0 killed: out of swap
[   785.416572] UVM: pid 1683.1683 (tmux), uid 0 killed: out of swap

This should not be happening -- there is tons of reusable RAM as
virtually all of the vnodes getting here are immediately recyclable.

$elsewhere I got a report of a workload with hundreds of millions of
files which get walked in parallel -- a number high enough that it
does not fit in RAM on the boxes which run it. Out of curiosity I
figured I'd check how others are doing on that front, but the key
point is that this is not a made-up problem.

I'm running NetBSD 10, kernel built from this commit at top of the tree:
Author: andvar <andvar%NetBSD.org@localhost>
Date:   Sat Oct 14 08:05:25 2023 +0000

    fix various typos in comments and documentation, mainly in word "between".

Specs are 24 cores, 24G of RAM and ufs2 with noatime. swap is *not* configured.

Test generates 20 separate trees, each has 1000 directories with 1000
files (or 20 million files in total + some dirs).
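
For readers who want a feel for the layout without unpacking the
tarball, here is a rough sketch of what such a generator could look
like. It is not the script from fstree.tgz; the function name
make_trees and the tree$t/dir$d/file$f naming are made up for
illustration, and the files are left empty.

```shell
#!/bin/sh
# Hypothetical helper, NOT the script shipped in fstree.tgz.
# Usage: make_trees ROOT NTREES NDIRS NFILES
# Builds NTREES trees under ROOT in parallel; each tree gets NDIRS
# directories holding NFILES empty files apiece.
make_trees() {
    root=$1; ntrees=$2; ndirs=$3; nfiles=$4
    t=0
    while [ "$t" -lt "$ntrees" ]; do
        (
            # One background subshell per tree, so trees are built concurrently.
            d=0
            while [ "$d" -lt "$ndirs" ]; do
                mkdir -p "$root/tree$t/dir$d" || exit 1
                f=0
                while [ "$f" -lt "$nfiles" ]; do
                    : > "$root/tree$t/dir$d/file$f" || exit 1
                    f=$((f + 1))
                done
                d=$((d + 1))
            done
        ) &
        t=$((t + 1))
    done
    # Block until every per-tree subshell has finished.
    wait
}
```

With the numbers from this mail that would be something like
"make_trees /mnt/test 20 1000 1000", where /mnt/test is a placeholder
for wherever the target fs is mounted.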

Repro instructions are here:
https://people.freebsd.org/~mjg/.junk/fstree.tgz

Note that parallel creation of these trees is dog slow; it took over
40 minutes for me.

I had to pass extra flags to newfs for the target fs to even fit
this inode count:
newfs -n 220000000 -O 2 /dev/wd1e

So the expected outcome is that this finishes (extra points for
reasonable time) instead of userspace processes getting killed.
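
To make "finishes instead of getting killed" checkable, the parallel
walk could be driven by something along these lines. Again a sketch
only, not taken from fstree.tgz: walk_trees is a made-up name and it
assumes the trees sit in directories matching ROOT/tree*. A find that
gets killed by the kernel shows up as a non-zero wait status.

```shell
#!/bin/sh
# Hypothetical driver, NOT part of fstree.tgz.
# Usage: walk_trees ROOT
# Starts one find(1) per ROOT/tree* directory in parallel and prints
# how many of them exited non-zero (a killed find counts as a failure).
walk_trees() {
    root=$1
    pids=""
    for tree in "$root"/tree*; do
        find "$tree" > /dev/null &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        # wait PID returns that child's exit status; >128 means a signal.
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}
```

A run that behaves as expected prints 0; anything else means at least
one walker did not complete.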

I don't know what kind of diagnostic info would be best here, but
given the repro steps above I don't think I need to go looking for any.
Have fun. :)

-- 
Mateusz Guzik <mjguzik gmail.com>


