Re: NFS server mbuf leak
On Mon, Jan 05, 2009 at 04:33:29PM +0200, Antti Kantee wrote:
> > Today, a client in suspend got woken up, and the leak started again.
> > I took time to investigate a bit (investigations got stopped by
> > the reboot of the NFS client by the user):
> > - MBUFTRACE confirms that the leak is in the NFS code. It's a mbuf+cluster
> > leak.
> > - I suspect the error causing the leak is (from tcpdump)
> > "reply ok 36 access ERROR: Stale NFS file handle"
> > but I got about 700 of these in one minute, for only 60 mbufs leaked.
> > So it's not one mbuf leak per reply. The only other reply type I've
> > seen is "reply ok 32 getattr ERROR: Stale NFS file handle", but only 5 of
> > them in one minute, for 60 mbufs leaked.
> > - I've not seen this in normal operations, even if there's lots of requests
> > for deleted files. So it could be related to the partition manipulations
> > I did on the server. Note that the server got rebooted several times
> > between the partition changes and the last occurrence of the problem,
> > so it's not caused by something stale on the server.
> > I looked at the source but didn't see anything obvious. The fact that
> > there's not a 1 for 1 correspondence between replies and lost mbufs makes me
> > think that there's another parameter that I didn't find yet ...
> > Any idea where to look?
> Since it's netbsd-3, check whether revs 1.80 *and* 1.139 of nfs_syscalls.c help.
Thanks! A new kernel is in place, but since I believe all the systems that had
the old disks mounted have been rebooted by now, it will be hard to confirm
that this fixes the issue ...
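
If the problem shows up again, the comparison of reply counts against leaked mbufs could be repeated with something like the rough sketch below. It assumes the tcpdump output is saved as text and that the error lines look like the two quoted above; the function names and the leaked-mbuf figure passed in are only illustrative, and the script is not part of any NetBSD tool.

```python
import re
from collections import Counter

# Pattern modelled on the two tcpdump lines quoted in this thread (an assumption
# about the output format; adjust it if your tcpdump prints something different).
STALE_RE = re.compile(r"reply ok \d+ (\w+) ERROR: Stale NFS file handle")


def count_stale_replies(lines):
    """Return a Counter mapping NFS procedure name to stale-handle reply count."""
    counts = Counter()
    for line in lines:
        match = STALE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def replies_per_leaked_mbuf(counts, leaked_mbufs):
    """Ratio of stale replies of each type to mbufs leaked over the same interval."""
    if leaked_mbufs <= 0:
        raise ValueError("leaked_mbufs must be positive")
    return {proc: n / leaked_mbufs for proc, n in counts.items()}
```

Feeding it one minute of capture together with the mbuf growth seen over the same minute gives a ratio per reply type; a ratio that stays far from 1 for every type would again suggest that some other parameter is involved.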
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
NetBSD: 26 years of experience will always make the difference