NetBSD-Bugs archive


Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)



The following reply was made to PR kern/57558; it has been noted by GNATS.

From: Brad Spencer <brad%anduin.eldar.org@localhost>
To: Frank Kardel <kardel%netbsd.org@localhost>
Cc: gnats-bugs%netbsd.org@localhost, kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
        netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/57558: pgdaemon 100% busy - no scanning (ZFS case)
Date: Sun, 05 May 2024 15:56:13 -0400

 Frank Kardel <kardel%netbsd.org@localhost> writes:
 
 
 > Thanks for your observation. Actually "large memory" could be seen more 
 > like where
 > vmem_size(kernel_arena, VMEM_ALLOC|VMEM_FREE) / 10 in pages being 
 > significantly larger than uvmexp.freetarg.
 > As you have observed this can already happen on smaller systems.
 >
 > -Frank
 
 
 Sure...
 
 I was able to perform the abusive build operation and was able to make
 the system fall over.  The abuse is the following:
 
 Have a 10.0 PVH guest with 16GB and 2 vcpus.  Run the following builds at
 the same time:
 
 build.sh -j2 <- for amd64
 build.sh -j2 <- for i386
 build.sh -j2 <- for earmv7hf
 
 The source tree is in a ZFS fileset and is used by all of the builds.
 The artifacts (obj, dist, release, etc.) are all in their own ZFS
 filesets for each of the arch types.  That is, /artifacts/amd64 would be
 its own ZFS filesystem and contain object, release and dist
 subdirectories, using the -O, -R and -D flags to build.sh to point to
 /artifacts/amd64/OBJ, etc.  There would also be a /artifacts/i386 and
 /artifacts/earmv7hf, which are also their own filesets.
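 
 A rough sketch of how the three invocations might look, printed as a
 dry run rather than executed.  Only -j2, the -O/-R/-D flags, the
 /artifacts/<arch> filesets and the OBJ directory name come from the
 description above; the -U flag, the -m/-a values, the "release"
 operation, the RELEASE and DIST directory names and the /usr/src
 location are assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) one build.sh command per architecture.
# ASSUMPTIONS, not from the original report: the -U flag, the "release"
# operation, the -m/-a values, the RELEASE and DIST directory names,
# and /usr/src as the location of the shared source tree.
SRC=/usr/src            # assumed location of the shared source tree
ART=/artifacts          # per-architecture ZFS filesets live below this

build_cmd() {
    # $1 = fileset name, remaining args = machine/arch flags for build.sh
    name=$1; shift
    echo "cd $SRC && ./build.sh -U -j2 $* -O $ART/$name/OBJ -R $ART/$name/RELEASE -D $ART/$name/DIST release"
}

build_cmd amd64    -m amd64
build_cmd i386     -m i386
build_cmd earmv7hf -m evbarm -a earmv7hf
```

 Each printed command would be started in its own shell so that all
 three builds run at the same time against the one source tree.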
 
 Everything will be humming along just fine, until the earmv7hf build nears
 the end and does /usr/src/distrib/utils/embedded/mkimage which does "dd
 bs=1 count=4456448 if=/dev/zero" ... that dd will run with high CPU for
 a little bit and then cause all active reads and writes going on with
 the other builds and itself to more or less deadlock.  The CPU
 utilization will fall to zero and disk utilization on the zpool will
 fall to zero.  The system will be responsive, but if you try hitting any
 of the files being used the command (ls, or whatever) will hang up.
 
 As far as I can tell what was going on in the system was two objcopy and
 two rm along with the dd.  One objcopy was stuck in tstile and the other
 in &zilog.  The dd was stuck in &tx->t and both rm were stuck in &zio->
 ... all according to top.
 
 I can almost reproduce this on demand, as long as the amd64 and i386
 builds are actually building something and the earmv7hf build hits the
 mkimage call at the same time.  A clean build of all three will probably
 provoke it and update builds (-u flag to build.sh) may as well.
 
 This is all probably unrelated to the patch that was provided and the
 problem being reported.  The patch does appear to make the situation
 better.
 
 Might want to consider switching the arguments to that "dd" to be "dd
 count=1 bs=4456448 if=/dev/zero" .. that is just write one block of
 4456448 bytes instead of 4456448 one byte blocks.  Might be less
 stressful.
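 
 A minimal sketch of the difference, assuming the output goes to a
 scratch file (the real mkimage call's remaining arguments are not
 shown here, so the of= targets and the temporary directory are
 illustrative only).  Both forms should produce the same 4456448 zero
 bytes; the first issues one read and one write per byte, the second
 a single large read and write.  Reading from /dev/zero returns full
 blocks, so count=1 is expected to be safe for this input.

```shell
#!/bin/sh
# Compare the two dd invocations: same bytes out, very different syscall counts.
set -e
SIZE=4456448                      # byte count taken from the mkimage dd call
tmp=$(mktemp -d)                  # scratch directory (illustrative)

# Original form: 4456448 one-byte blocks (one read and one write per byte).
dd bs=1 count="$SIZE" if=/dev/zero of="$tmp/small" 2>/dev/null

# Suggested form: one block of 4456448 bytes (a single read and write).
dd bs="$SIZE" count=1 if=/dev/zero of="$tmp/big" 2>/dev/null

# Both files should have the same length and identical contents.
size_small=$(wc -c < "$tmp/small" | tr -d ' ')
size_big=$(wc -c < "$tmp/big" | tr -d ' ')
echo "bs=1: $size_small bytes, bs=$SIZE: $size_big bytes"
cmp "$tmp/small" "$tmp/big" && echo "outputs are identical"
rm -rf "$tmp"
```

 Prefixing each dd with time(1) would show how much CPU the one-byte
 form burns compared with the single-block form.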
 
 
 
 
 
 -- 
 Brad Spencer - brad%anduin.eldar.org@localhost - KC8VKS - http://anduin.eldar.org
 

