Subject: kern/9792: Kernel leaking buffers?
To: None <gnats-bugs@gnats.netbsd.org>
From: Richard Earnshaw <rearnsha@cambridge.arm.com>
List: netbsd-bugs
Date: 04/05/2000 19:56:43
>Number: 9792
>Category: kern
>Synopsis: Kernel leaking buffers?
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Apr 05 13:19:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Richard Earnshaw
>Release: NetBSD-current 2000/03/27
>Organization:
ARM
--
>Environment:
System: NetBSD shark1 1.4X NetBSD 1.4X (SHARK) #52: Mon Mar 27 17:55:21 BST 2000 rearnsha@shark1:/usr/src/sys/arch/arm32/compile/SHARK arm32
shark1:testsuite [537] $ df -k
Filesystem  1K-blocks     Used    Avail Capacity  Mounted on
/dev/wd0a      194983    37479   147754    20%    /
/dev/wd0e     2028219   460698  1466110    23%    /usr
mfs:49         198183       22   188251     0%    /tmp
/dev/wd0f     2028219   573062  1353746    29%    /usr/build
/dev/wd0g     2028219  1583868   342940    82%    /work
/dev/wd0h     2925218   834382  1944575    30%    /xx
(/usr/build and /xx are mounted with soft-updates enabled; all others
are mounted without soft-updates)
>Description:
Sorry this report is about a kernel that is a week old, but it takes
that long before the problem starts to manifest itself :-(
Since I started using a kernel with soft-updates and trickle sync, all
my kernels have started to thrash the disk horribly after about a week
of uptime. I can't be entirely sure where the failure is (hints about
what information you require would be appreciated), but "systat vmstat"
is showing that the sys-cache hit ratio has dropped off alarmingly
(when the system first comes up I'm typically seeing 98->99% hits on
approx 5k->10k calls; after a while this drops dramatically to about
50% hits on 500->2k calls). In addition, the disk seems to start
making a very large number of head movements (I can hear it).
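As a rough sanity check on those figures, the sketch below converts the
quoted hit ratios into approximate miss counts. It assumes that each
percentage applies to the call count quoted alongside it for the same
sampling interval, which is an assumption and not something confirmed
in this report; the helper name "misses" is purely illustrative.

```python
# Rough estimate of the cache misses implied by the figures quoted above.
# Assumption: "hits" is a percentage of "calls" over the same sampling
# interval; the length of that interval is not stated in this report.

def misses(calls, hit_percent):
    """Return the number of lookups that missed the cache."""
    return calls * (100 - hit_percent) // 100

# Freshly booted: 98-99% hits on roughly 5k-10k calls.
fresh_low = misses(5000, 99)
fresh_high = misses(10000, 98)

# After about a week: roughly 50% hits on 500-2k calls.
aged_low = misses(500, 50)
aged_high = misses(2000, 50)

print("fresh:", fresh_low, "to", fresh_high, "misses per interval")
print("aged: ", aged_low, "to", aged_high, "misses per interval")
```

Under that assumption the number of misses goes up even though the
number of calls goes down, which would be consistent with the extra
disk activity.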
I'm not sure whether this is directly related to the soft-updates code
or to trickle syncing (or is unrelated to both). It still occurs
even when accessing only disks that aren't mounted with soft-updates.
I haven't tried a kernel built without soft-updates.
Final note: the machine only has 32M of memory, so I guess that if it
is a buffer problem, I'm likely to start seeing problems sooner than
folks with massive amounts of RAM (and hence system buffers of various
types).
>How-To-Repeat:
Leave a machine running for a week or so.
>Fix:
No idea.
>Release-Note:
>Audit-Trail:
>Unformatted: