Subject: kern/9792: Kernel leaking buffers?
To: None <gnats-bugs@gnats.netbsd.org>
From: Richard Earnshaw <rearnsha@cambridge.arm.com>
List: netbsd-bugs
Date: 04/05/2000 19:56:43
>Number:         9792
>Category:       kern
>Synopsis:       Kernel leaking buffers?
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Apr 05 13:19:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     Richard Earnshaw
>Release:        NetBSD-current 2000/03/27
>Organization:
ARM
-- 
>Environment:
	
System: NetBSD shark1 1.4X NetBSD 1.4X (SHARK) #52: Mon Mar 27 17:55:21 BST 2000 rearnsha@shark1:/usr/src/sys/arch/arm32/compile/SHARK arm32

	shark1:testsuite [537] $ df -k
	Filesystem  1K-blocks     Used    Avail Capacity  Mounted on
	/dev/wd0a               194983    37479   147754    20%    /
	/dev/wd0e              2028219   460698  1466110    23%    /usr
	mfs:49                  198183       22   188251     0%    /tmp
	/dev/wd0f              2028219   573062  1353746    29%    /usr/build
	/dev/wd0g              2028219  1583868   342940    82%    /work
	/dev/wd0h              2925218   834382  1944575    30%    /xx

	(/usr/build and /xx are mounted with soft-update enabled; all others
	are non-soft-update)

>Description:

	Sorry this report is about a kernel that is a week old, but it takes
	that long before the problem starts to manifest itself :-(

	Since I started using a kernel with soft-updates and trickle sync, all
	my kernels have started to thrash the disk horribly after about a week
	of uptime.  I can't be entirely sure where the failure is (hints of
	what information you require would be appreciated), but "systat vmstat"
	is showing that the sys-cache hit ratio has dropped off alarmingly 
	(when the system first comes up I'm typically seeing 98->99% hits on
	approx 5k->10k calls; after a while this drops dramatically to about 
	50% hits on 500->2k calls).  In addition, the disk seems to start
	making a very large number of head movements (I can hear it).
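
	As a rough sanity check on those figures, the sketch below converts
	the quoted call counts and hit percentages into implied miss counts.
	It assumes (this is my assumption, not something systat states) that
	the "calls" and "% hits" values describe the same sampling interval;
	the function name implied_miss_range is purely illustrative.

```python
def implied_miss_range(calls_low, calls_high, hit_pct_low, hit_pct_high):
    """Return (fewest, most) cache misses implied by a range of call
    counts and a range of hit percentages (integer percentages)."""
    # Fewest misses: lowest call count combined with the best hit rate.
    fewest = calls_low * (100 - hit_pct_high) // 100
    # Most misses: highest call count combined with the worst hit rate.
    most = calls_high * (100 - hit_pct_low) // 100
    return fewest, most


# Figures quoted above: shortly after boot vs. after about a week.
healthy = implied_miss_range(5000, 10000, 98, 99)
degraded = implied_miss_range(500, 2000, 50, 50)

print("misses shortly after boot:", healthy)
print("misses after about a week:", degraded)
```

	Under that assumption, the degraded system is making far fewer
	lookups but missing on more of them in absolute terms, which would
	be consistent with the extra disk activity.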

	I'm not sure whether this is directly related to the soft-updates code
	or to trickle syncing (or unrelated to either).  But it still occurs
	even when accessing only discs that aren't mounted with soft-updates.

	I haven't tried a kernel built without soft-updates.

	Final note: the machine only has 32M of memory, so I guess that if it
	is a buffer problem, I'm likely to start seeing problems sooner than
	folks with massive amounts of RAM (and hence system buffers of various
	types).

	
>How-To-Repeat:
	
	Leave a machine running for a week or so.
	
>Fix:
	No idea.
	
>Release-Note:
>Audit-Trail:
>Unformatted: