Re: kern/54818: 9.0_RC1 pagedaemon spins

To: ad%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost, tsutsui%ceres.dti.ne.jp@localhost
Subject: Re: kern/54818: 9.0_RC1 pagedaemon spins
From: Lars Reichardt <lars%paradoxon.info@localhost>
Date: Wed, 4 Mar 2020 15:35:01 +0000 (UTC)

The following reply was made to PR kern/54818; it has been noted by GNATS.

From: Lars Reichardt <lars%paradoxon.info@localhost>
To: gnats-bugs%netbsd.org@localhost, ad%netbsd.org@localhost, gnats-admin%netbsd.org@localhost,
 netbsd-bugs%netbsd.org@localhost, tsutsui%ceres.dti.ne.jp@localhost
Cc: 
Subject: Re: kern/54818: 9.0_RC1 pagedaemon spins
Date: Wed, 4 Mar 2020 16:32:28 +0100

 On 2020-03-04 12:10, Havard Eidnes wrote:
 > The following reply was made to PR kern/54818; it has been noted by GNATS.
 >
 > From: Havard Eidnes <he%uninett.no@localhost>
 > To: mlelstv%serpens.de@localhost
 > Cc: gnats-bugs%netbsd.org@localhost, ad%netbsd.org@localhost, tsutsui%ceres.dti.ne.jp@localhost
 > Subject: Re: kern/54818: 9.0_RC1 pagedaemon spins
 > Date: Wed, 04 Mar 2020 12:09:22 +0100 (CET)
 ...
 >   The "out of KVA" check fired.
 >   
 >   It was seemingly triggered by the X server; I moved the mouse
 >   between windows and it froze; "top" shows "Xorg" in "vmem"
 >   status, and the kernel started printing
 >   
 >   pagedaemon: Out of KVA, awaiting doom...
 >   
 >   I could log in over the network even though the X server was
 >   wedged, and collect some information -- it follows here below.
 >   If there is other information I should collect, please inform
 >   me.
 >   
 >   The question remains: is there something I can do to prevent this
 >   from happening again?
 >   
 >   
 >   
 >   : {9} ; vmstat -m
 >   Memory resource pool statistics
 >   Name        Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
 >   amappl        80    22924    0      208   455     0   455   455     0   inf    0
 >   anonpl        32  1305084    0   210496  9459     0  9459  9459     0   inf  221
 >   ataspl        96  2337417    0  2337417     1     0     1     1     0   inf    1
 >   biopl        288      931    0      760    55     0    55    55     0   inf   42
 
 <<<<< these don't allocate via kmem but directly from kernel_map
 
 >   buf16k      16384    1411    0     1270   241   205    36    93     1     1    0
 >   buf1k       1024        2    0        2     1     0     1     1     1     1    1
 >   buf2k       2048        9    0        9     5     4     1     5     1     1    1
 >   buf32k      32768  223292    0   193503 92993 73624 19369 37065     1     1    0
 >   buf4k       4096   491370    0   391560 491371 391560 99811 179873  1     1    1
 >   buf64k      65536       4    0        0     5     0     5     5     1     1    1
 >   buf8k       8192     1865    0     1613   160   128    32    63     1     1    0
 >   bufpl        288   210502    0    80506 15026     0 15026 15026     0   inf  111
 
 >>>> these are very interesting:
 
 These are the quantum caches for allocating virtual address space.
 
 There are no 4k allocations because the direct map is used (that's expected) and most pools have a pool page size of 4k,
 but there are a lot of 64k allocations, with a backing pool page size of 256k.
 
 That is 64 * 63924 = 4091136 KB worth of allocations
 (15981 pool pages of 256k each)
 and no releases at all, which looks like a leak to me.
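
The arithmetic above can be double-checked with a short sketch. The numbers are copied from the kva-65536 row of the quoted "vmstat -m" output and the 256k pool page size is the one stated in the text; the variable names are only illustrative.

```python
# Values copied from the kva-65536 row of the quoted "vmstat -m" output.
ITEM_SIZE_KB = 64      # Size column: 65536 bytes per allocation
REQUESTS = 63924       # Requests column
RELEASES = 0           # Releases column
NPAGE = 15981          # Npage column
POOL_PAGE_KB = 256     # backing pool page size, as stated in the text

# KVA still held by outstanding 64k allocations.
outstanding_kb = (REQUESTS - RELEASES) * ITEM_SIZE_KB

# The same amount, counted as backing pool pages.
backing_kb = NPAGE * POOL_PAGE_KB

print(outstanding_kb)                      # 4091136
print(backing_kb)                          # 4091136
print(outstanding_kb / (1024 * 1024))      # roughly 3.9 (GiB)
```

Both ways of counting give the same figure, a little under 4 GiB of kernel virtual address space that has never been released.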
 
 Does that happen when starting X?
 Judging from the list of pools, this seems to be Intel drmkms.
 
 The kmem arena is most likely a bit larger than the 4 GB mentioned above, as the machine seems to have 16 GB?
 It should be the second entry in the output of "pmap 0".
 
 >   kva-12288   12288      35    0        0     2     0     2     2     0   inf    0
 >   kva-16384   16384      17    0        0     2     0     2     2     0   inf    0
 >   kva-20480   20480      84    0        0     7     0     7     7     0   inf    0
 >   kva-24576   24576       9    0        0     1     0     1     1     0   inf    0
 >   kva-28672   28672       3    0        0     1     0     1     1     0   inf    0
 >   kva-32768   32768       1    0        0     1     0     1     1     0   inf    0
 >   kva-36864   36864       3    0        0     1     0     1     1     0   inf    0
 >   kva-40960   40960     108    0        0    18     0    18    18     0   inf    0
 >   kva-49152   49152       1    0        0     1     0     1     1     0   inf    0
 >   kva-65536   65536   63924    0        0 15981     0 15981 15981     0   inf    0
 >   kva-8192    8192       52    0        0     2     0     2     2     0   inf    0
 
 ...
 I'm not aware of any pool that allocates from the 64k quantum cache, so it doesn't surprise me that pagedaemon/pool_drain
 isn't able to free anything.
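
For anyone collecting the same data, here is a minimal sketch of how one might scan "vmstat -m" output for pools that look like the kva-65536 row: many requests, no releases, no pages returned. The column order is taken from the header quoted above; the function name and the min_pages threshold are hypothetical choices for illustration, not part of any NetBSD tool.

```python
def suspect_pools(vmstat_text, min_pages=1000):
    """Return (name, requests, npage) for pools that never release anything.

    Assumes the column order shown in the quoted output:
    Name Size Requests Fail Releases Pgreq Pgrel Npage ...
    min_pages is an arbitrary illustrative threshold.
    """
    suspects = []
    for line in vmstat_text.splitlines():
        # Tolerate mail quoting such as ">   " in front of each row.
        fields = line.lstrip("> \t").split()
        if len(fields) < 8:
            continue
        try:
            requests, _fail, releases, _pgreq, pgrel, npage = map(int, fields[2:8])
        except ValueError:
            continue  # header line or other non-numeric row
        if requests > 0 and releases == 0 and pgrel == 0 and npage >= min_pages:
            suspects.append((fields[0], requests, npage))
    return suspects


# Three rows copied from the quoted output, plus its header line.
sample = """\
Name        Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
anonpl        32  1305084    0   210496  9459     0  9459  9459     0   inf  221
kva-40960   40960     108    0        0    18     0    18    18     0   inf    0
kva-65536   65536   63924    0        0 15981     0 15981 15981     0   inf    0
"""

print(suspect_pools(sample))   # [('kva-65536', 63924, 15981)]
```

A pool with zero releases is not necessarily leaking, so a hit here is only a hint about where to look next.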
 
 Kind regards,
 Lars
