Subject: kern/25262: possible mbuf leak in UDP
To: None <gnats-bugs@gnats.NetBSD.org>
From: Frank Kardel <kardel@pip.acrys.com>
List: netbsd-bugs
Date: 04/20/2004 18:59:54
>Number: 25262
>Category: kern
>Synopsis: possible mbuf leak in UDP
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Apr 20 17:01:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator: Frank Kardel
>Release: NetBSD 2.0C
>Organization:
>Environment:
current as of 20040412-174655
System: NetBSD pip 2.0C NetBSD 2.0C (SYSPIP_ISDN) #1: Sun Apr 18 13:10:20 MEST 2004 kardel@pip:/src/NetBSD/netbsd/sys/arch/i386/compile/obj.i386/SYSPIP_ISDN i386
Architecture: i386
Machine: i386
>Description:
We had a Linux NFS client that was stalled on NFS. The server ran out of mbufs after a short while, and
increasing NMBCLUSTERS to 10240 only extended the time until the server's network outage to about half an hour.
netstat -m showed ever-increasing data mbuf counts.
Rebooting the Linux machine led to stable mbuf counts again.
A kernel with MBUFTRACE enabled, running while the Linux client was sending the mbuf-eating
packets, produced the following mowner statistics:
Name Descr claims release delta c claim c rel delta ext clm ext rel delta
vlan1 rx 19- 19= 0 19- 19= 0 19- 19= 0
vlan1 tx 583159- 583159= 0 117862- 117862= 0 248254- 248254= 0
unix 17478- 17478= 0 14835- 14835= 0 15777- 15777= 0
tcp 1585- 1582= 3 0- 0= 0 0- 0= 0
tcp rx 219129- 219129= 0 219127- 219127= 0 219127- 219127= 0
tcp tx 137993- 137992= 1 693- 693= 0 697- 697= 0
udp 26722- 26722= 0 0- 0= 0 0- 0= 0
udp rx 3062668- 3060430= 2238 2214855- 2212617= 2238 2214855- 2212617= 2238
udp tx 837411- 837411= 0 689917- 689917= 0 689917- 689917= 0
internet rx 2173893- 2173893= 0 2168146- 2168146= 0 2168146- 2168146= 0
internet tx 1683193- 1683193= 0 7877- 7877= 0 7877- 7877= 0
internet 117- 117= 0 49- 49= 0 49- 49= 0
internet6 50- 50= 0 0- 0= 0 0- 0= 0
key 0- 0= 0 0- 0= 0 0- 0= 0
arp 879- 879= 0 619- 619= 0 619- 619= 0
route 132- 132= 0 0- 0= 0 0- 0= 0
lo0 4196- 4196= 0 0- 0= 0 0- 0= 0
ex1 rx 0- 0= 0 0- 0= 0 0- 0= 0
ex1 tx 0- 0= 0 0- 0= 0 0- 0= 0
ex0 rx 1616404- 1616404= 0 1616404- 1616404= 0 1616404- 1616404= 0
ex0 tx 2864510- 2864509= 1 580116- 580116= 0 1517915- 1517915= 0
sk0 rx 0- 0= 0 0- 0= 0 0- 0= 0
sk0 tx 0- 0= 0 0- 0= 0 0- 0= 0
nfs 2188996- 2188996= 0 979407- 979407= 0 2047847- 2047847= 0
unknown free 0- 0= 0 0- 0= 0 0- 0= 0
unknown data 5932754- 5932498= 256 1617279- 1617023= 256 1617279- 1617023= 256
unknown header 1022217- 1022217= 0 0- 0= 0 0- 0= 0
unknown soname 860514- 860514= 0 0- 0= 0 0- 0= 0
unknown soopts 9418- 9418= 0 0- 0= 0 0- 0= 0
unknown ftable 0- 0= 0 0- 0= 0 0- 0= 0
unknown control 38- 38= 0 0- 0= 0 0- 0= 0
unknown oobdata 0- 0= 0 0- 0= 0 0- 0= 0
revoked 0- 0= 0 0- 0= 0 0- 0= 0
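
As a rough illustration, the owners with non-zero deltas in the table above can be picked out mechanically. The following is a minimal Python sketch and not part of MBUFTRACE or netstat; the function name leaking_owners and the regular expression are assumptions based only on the "claims- release= delta" row layout shown above.

```python
import re

# One "claims- release= delta" counter triple, as printed in each table row.
TRIPLE = re.compile(r"(\d+)-\s*(\d+)=\s*(\d+)")

def leaking_owners(report):
    """Return (owner, claims, releases, delta) for rows whose mbuf delta is non-zero."""
    leaks = []
    for line in report.splitlines():
        triples = list(TRIPLE.finditer(line))
        if len(triples) != 3:
            continue  # header or unrelated line
        # Everything before the first counter triple is the owner name and description.
        owner = line[:triples[0].start()].strip()
        claims, releases, delta = (int(g) for g in triples[0].groups())
        if delta != 0:
            leaks.append((owner, claims, releases, delta))
    # Largest outstanding count first.
    return sorted(leaks, key=lambda row: row[3], reverse=True)

# Rows copied from the statistics above.
sample = """\
udp 26722- 26722= 0 0- 0= 0 0- 0= 0
udp rx 3062668- 3060430= 2238 2214855- 2212617= 2238 2214855- 2212617= 2238
tcp 1585- 1582= 3 0- 0= 0 0- 0= 0
unknown data 5932754- 5932498= 256 1617279- 1617023= 256 1617279- 1617023= 256
"""

for owner, claims, releases, delta in leaking_owners(sample):
    print(f"{owner}: {delta} mbufs outstanding ({claims} claimed, {releases} released)")
```

Only the first triple (plain mbuf claims) is compared here; the cluster and external-storage triples could be checked the same way.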
There are still 2238 mbufs claimed by UDP. The client was rebooted long ago and data mbuf counts are
stable now (even with the client generating requests). According to netstat -an, there is no data stuck
in the send/receive queues:
Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 10.0.2.14.65462 10.0.2.100.9100 SYN_SENT
tcp 0 0 10.0.2.14.143 10.0.2.72.65335 ESTABLISHED
tcp 0 0 10.0.2.14.143 10.0.2.73.33001 ESTABLISHED
tcp 0 256 10.0.2.14.22 10.0.2.72.65336 ESTABLISHED
tcp 0 0 *.631 *.* LISTEN
tcp 0 0 *.2000 *.* LISTEN
tcp 0 0 *.995 *.* LISTEN
tcp 0 0 *.110 *.* LISTEN
tcp 0 0 *.993 *.* LISTEN
tcp 0 0 *.143 *.* LISTEN
tcp 0 0 *.139 *.* LISTEN
tcp 0 0 *.445 *.* LISTEN
tcp 0 0 *.587 *.* LISTEN
tcp 0 0 *.25 *.* LISTEN
tcp 0 0 *.22 *.* LISTEN
tcp 0 0 *.2049 *.* LISTEN
tcp 0 0 *.1018 *.* LISTEN
tcp 0 0 *.1019 *.* LISTEN
tcp 0 0 *.1020 *.* LISTEN
tcp 0 0 *.1022 *.* LISTEN
tcp 0 0 *.111 *.* LISTEN
udp 0 0 *.631 *.*
udp 0 0 10.0.2.14.138 *.*
udp 0 0 10.0.2.14.137 *.*
udp 0 0 10.1.203.10.138 *.*
udp 0 0 10.1.203.10.137 *.*
udp 0 0 *.138 *.*
udp 0 0 *.137 *.*
udp 0 0 *.10080 *.*
udp 0 0 10.1.203.10.123 *.*
udp 0 0 127.0.0.1.123 *.*
udp 0 0 10.0.2.14.123 *.*
udp 0 0 *.123 *.*
udp 0 0 *.520 *.*
udp 0 0 *.2049 *.*
udp 0 0 *.1016 *.*
udp 0 0 *.1017 *.*
udp 0 0 *.65034 *.*
udp 0 0 *.65533 *.*
udp 0 0 *.1018 *.*
udp 0 0 *.1020 *.*
udp 0 0 *.1021 *.*
udp 0 0 *.111 *.*
Active Internet6 connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp6 0 0 *.2000 *.* LISTEN
tcp6 0 0 *.995 *.* LISTEN
tcp6 0 0 *.110 *.* LISTEN
tcp6 0 0 *.993 *.* LISTEN
tcp6 0 0 *.143 *.* LISTEN
tcp6 0 0 *.25 *.* LISTEN
tcp6 0 0 *.22 *.* LISTEN
tcp6 0 0 *.2049 *.* LISTEN
tcp6 0 0 *.1017 *.* LISTEN
tcp6 0 0 *.1021 *.* LISTEN
tcp6 0 0 *.111 *.* LISTEN
udp6 0 0 fe80::20a:5eff:f.123 *.*
udp6 0 0 fe80::1%lo0.123 *.*
udp6 0 0 ::1.123 *.*
udp6 0 0 fe80::20a:5eff:f.123 *.*
udp6 0 0 *.123 *.*
udp6 0 0 *.2049 *.*
udp6 0 0 *.1015 *.*
udp6 0 0 *.1019 *.*
udp6 0 0 *.* *.*
udp6 0 0 *.1022 *.*
udp6 0 0 *.111 *.*
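
The queue check against the netstat -an listing can be sketched in the same way. This is a hypothetical helper (queued_sockets is an assumed name, not an existing tool) that assumes the column order shown above: Proto, Recv-Q, Send-Q, Local Address, Foreign Address.

```python
def queued_sockets(netstat_output):
    """Return (proto, local_address, recv_q, send_q) for sockets with queued bytes."""
    rows = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip section titles, column headers and anything that is not a socket row.
        if len(fields) < 5 or not fields[0].startswith(("tcp", "udp")):
            continue
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue
        recv_q, send_q = int(fields[1]), int(fields[2])
        if recv_q or send_q:
            rows.append((fields[0], fields[3], recv_q, send_q))
    return rows

# A few rows copied from the listing above.
sample = """\
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 10.0.2.14.143 10.0.2.72.65335 ESTABLISHED
tcp 0 256 10.0.2.14.22 10.0.2.72.65336 ESTABLISHED
udp 0 0 *.2049 *.*
udp6 0 0 *.2049 *.*
"""

for proto, local, recv_q, send_q in queued_sockets(sample):
    print(f"{proto} {local}: Recv-Q={recv_q} Send-Q={send_q}")
```

On these rows the only socket with queued bytes is the TCP connection on port 22, and no UDP socket holds queued data.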
>How-To-Repeat:
Have a funny NFS client (sorry, I didn't capture the packets because I hadn't
pinpointed the bad client before the reboot). Watch data mbufs not being freed.
Observe high numbers of mbufs being claimed by UDP.
>Fix:
? Check for an mbuf leak. Rebooting the client doesn't seem to be a very good workaround.
>Release-Note:
>Audit-Trail:
>Unformatted: