tech-net archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: infinite recursion in m_split()
On Fri, Apr 03, 2009 at 01:22:30AM +0000, YAMAMOTO Takashi wrote:
> hi,
>
> > Hi,
> > While debugging NFS server issues I found a infinite recursion in m_split
> > when it's called with len0 == 0.
> > it's called this way from nfsrv_getstream() (and this is even documented
> > in the comment :):
> > /*
> > * Now get the record part.
> > *
> > * Note that slp->ns_reclen may be 0. Linux sometimes
> > * generates 0-length records.
> > */
> > if (slp->ns_cc == slp->ns_reclen) {
> > recm = slp->ns_raw;
> > slp->ns_raw = slp->ns_rawend = (struct mbuf *)0;
> > slp->ns_cc = slp->ns_reclen = 0;
> > } else if (slp->ns_cc > slp->ns_reclen) {
> > recm = slp->ns_raw;
> > m = m_split(recm, slp->ns_reclen, waitflag);
> >
> > Then m_split() calls m_split0, which, if (m0->m_flags & M_PKTHDR)
> > is true and (m->m_flags & M_EXT) is false, will call m_split() with
> > the same mbuf and the same m0. This is an infinite loop (well, infinite
> > until stack is exhausted).
>
> good catch.
Well, not so good. The conditions which cause this recursion should not
happen: If m == m0 (i.e. the loop at the start of m_split0() dind't
progress), and (m0->m_flags & M_PKTHDR) is true, then we either
have remain (= m0->m_len in this specific case) <= MHLEN or m0 has
an external storage. So the recursion happens when something wrong
already happened. The attached patch gives more details on how we get in
this situation:
m 0xc3bae000 m0 0xc3bae000 remain 1444 len_save -4 len0 0 len 0 m_len 1444
MHLEN 200
panic: m_split0
This mbuf obviously has a cluster attached (gdb on the core dump confirmed),
yet it doesn't have M_EXT set (the (m->m_flags & M_EXT) goto didn't
catch it, and gdb on the core dump also confirms). So the mbuf got partially
updated before m_split() I also guess m0->m_pkthdr.len is not supposed
to be negative, yet we have len_save as -4 (I've also seen -252 or -260).
So something is putting wrong values in the mbuf before calling m_split()
on it, but I don't know where it happens.
The printf() I added to check for len0 == 0 at the top of m_split()
fires only once, so I guess when slp->ns_reclen == 0 damage has already
been done to the associated mbuf; it could be related to this
specific condition.
Any idea very welcome.
--
Manuel Bouyer, LIP6, Universite Paris VI.
Manuel.Bouyer%lip6.fr@localhost
NetBSD: 26 ans d'experience feront toujours la difference
--
Home |
Main Index |
Thread Index |
Old Index