Re: mbuf cluster leak?
> WARNING: mclpool limit reached; increase kern.mbuf.nmbclusters
Welcome to the club.
I identified and fixed so many (potential) mbuf cluster leaks that I probably
forgot about most of them. You may browse through tech-net if you like.
The biggest one is in lltable_drop_entry_queue(), which is completely broken: nobody
ever increments la_numheld, so the drain loop (guarded by la_numheld > 0) never runs
and the packets queued on la_hold are leaked. Patch (to net/if_llatbl.c) attached.
The next most suspicious code is in m_ext_free(), which I have mostly
rewritten (see attached). For that version to work, you need (or so I think) to add
(n)->m_ext_storage.ext_refcnt = 0; \
after
(n)->m_flags |= ((o)->m_flags & M_EXTCOPYFLAGS); \
in sys/kern/uipc_mbuf.c
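That is, after the change the two lines would sit together roughly like this
(abridged, only the two relevant lines shown):

	(n)->m_flags |= ((o)->m_flags & M_EXTCOPYFLAGS);	\
	(n)->m_ext_storage.ext_refcnt = 0;			\

The point is that only the owning mbuf (the one m_ext_ref points at) carries a
meaningful reference count; the rewritten m_ext_free() below asserts
ext_refcnt == 0 on a non-owning mbuf, so a fresh copy has to start out with it
cleared.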
I have those patches for -8 too in case that's relevant to you.
I think there are other (potential) leaks in IPF, but that can't be your
problem.
I also have (posted to tech-net) a (hacky) utility to dump kernel pools
(so you can analyze what's in those orphaned mbuf clusters).
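For a first pass over such a dump, something as dumb as this (my sketch, not
part of that utility, and it assumes the clusters still start with an Ethernet
frame carrying IPv4) already tells you what kind of traffic is stuck in the
orphaned clusters:

#include <stdio.h>
#include <stdint.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Print a one-line hint for a buffer assumed to start with an
 * Ethernet frame carrying IPv4; stay silent otherwise.
 */
static void
cluster_hint(const uint8_t *buf, size_t len)
{
	char src[INET_ADDRSTRLEN], dst[INET_ADDRSTRLEN];

	if (len < 14 + 20)
		return;
	if ((buf[12] << 8 | buf[13]) != 0x0800)	/* ethertype != IPv4 */
		return;
	if ((buf[14] >> 4) != 4)		/* sanity: IP version */
		return;
	inet_ntop(AF_INET, buf + 14 + 12, src, sizeof(src));
	inet_ntop(AF_INET, buf + 14 + 16, dst, sizeof(dst));
	printf("proto %u  %s -> %s\n", buf[14 + 9], src, dst);
}

int
main(int argc, char **argv)
{
	uint8_t buf[2048];
	FILE *f;
	size_t n;

	if (argc < 2 || (f = fopen(argv[1], "rb")) == NULL)
		return 1;
	n = fread(buf, 1, sizeof(buf), f);
	fclose(f);
	cluster_hint(buf, n);
	return 0;
}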
And I have a large patchset adding instrumentation to the mbuf life cycle
(by writing magic values to m_pkthdr.pad0) so I could (at least for the code
paths relevant to me) see where an mbuf was last used.
As far as I remember, combining both let me find the la_numheld bug.
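The instrumentation idea is roughly this (names and tag values are mine, not
the actual patchset, and it assumes m_pkthdr.pad0 is wide enough to hold a
small tag):

#define MBUF_WHERE_ALLOC	0x01	/* freshly allocated */
#define MBUF_WHERE_IFINPUT	0x02	/* entered if_input() */
#define MBUF_WHERE_IPINTR	0x03	/* handed to the IP input path */
#define MBUF_WHERE_FREE		0xff	/* passed to m_freem() */

#define MBUF_WHERE(m, tag)						\
	do {								\
		if (((m)->m_flags & M_PKTHDR) != 0)			\
			(m)->m_pkthdr.pad0 = (tag);			\
	} while (0)

Sprinkled along the paths of interest (e.g. MBUF_WHERE(m, MBUF_WHERE_IFINPUT)),
a leaked mbuf found in the pool dump then tells you the last station it passed
through.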
Good luck!
Index: sys/net/if_llatbl.c
===================================================================
RCS file: /cvsroot/src/sys/net/if_llatbl.c,v
retrieving revision 1.35
diff -u -r1.35 if_llatbl.c
--- sys/net/if_llatbl.c 19 Nov 2022 08:00:51 -0000 1.35
+++ sys/net/if_llatbl.c 19 Feb 2025 14:44:39 -0000
@@ -322,6 +322,8 @@
LLE_WLOCK_ASSERT(lle);
pkts_dropped = 0;
+
+#if 0 /* XXX la_numheld isn't (properly) incremented */
while ((lle->la_numheld > 0) && (lle->la_hold != NULL)) {
next = lle->la_hold->m_nextpkt;
m_freem(lle->la_hold);
@@ -333,6 +335,15 @@
KASSERTMSG(lle->la_numheld == 0,
"la_numheld %d > 0, pkts_dropped %zd",
lle->la_numheld, pkts_dropped);
+#else
+ while (lle->la_hold != NULL) {
+ next = lle->la_hold->m_nextpkt;
+ m_freem(lle->la_hold);
+ lle->la_hold = next;
+ pkts_dropped++;
+ }
+ lle->la_numheld = 0;
+#endif
return (pkts_dropped);
}
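And here is the rewritten m_ext_free() (from sys/kern/uipc_mbuf.c):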
static void
m_ext_free(struct mbuf *m)
{
struct mbuf *m1 = m->m_ext_ref; /* may well be m itself */
u_int refcnt;
KASSERT((m->m_flags & M_EXT) != 0);
KASSERT(MEXT_ISEMBEDDED(m1));
KASSERT((m1->m_flags & M_EXT) != 0);
KASSERT((m->m_flags & M_EXT_CLUSTER) ==
(m1->m_flags & M_EXT_CLUSTER));
if (__predict_false(m->m_type == MT_FREE)) {
panic("mbuf %p already freed", m);
}
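	/* Common case: we hold the only reference, no atomic op needed. */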
if (__predict_true(m1->m_ext_storage.ext_refcnt == 1)) {
refcnt = m1->m_ext_storage.ext_refcnt = 0;
} else {
#ifndef __HAVE_ATOMIC_AS_MEMBAR
membar_release();
#endif
refcnt = atomic_dec_uint_nv(&m1->m_ext_storage.ext_refcnt);
}
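	/* We dropped the last reference: release the external storage. */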
if (refcnt == 0) {
#ifndef __HAVE_ATOMIC_AS_MEMBAR
membar_acquire();
#endif
if ((m1->m_flags & M_EXT_CLUSTER) != 0) {
pool_cache_put_paddr(mcl_cache,
m1->m_ext_storage.ext_buf, m1->m_ext_storage.ext_paddr);
m1->m_type = MT_FREE;
pool_cache_put(mb_cache, m1);
} else if (m1->m_ext_storage.ext_free) {
(*m1->m_ext_storage.ext_free)(m1,
m1->m_ext_storage.ext_buf, m1->m_ext_storage.ext_size,
m1->m_ext_storage.ext_arg);
/*
* m1 is already freed by the ext_free callback.
*/
} else {
free(m1->m_ext_storage.ext_buf, 0);
m1->m_type = MT_FREE;
pool_cache_put(mb_cache, m1);
}
}
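	/* m only referenced m1's storage; m itself still has to be freed. */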
if (__predict_false(m1 != m)) {
KASSERT(m->m_ext_storage.ext_refcnt == 0);
m->m_type = MT_FREE;
m->m_data = NULL;
pool_cache_put(mb_cache, m);
}
}